From patchwork Sun Jun 26 12:55:05 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roger Sayle <roger@nextmovesoftware.com>
X-Patchwork-Id: 55409
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 8D96E385736D
	for <patchwork@sourceware.org>; Sun, 26 Jun 2022 12:55:24 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from server.nextmovesoftware.com (server.nextmovesoftware.com
 [162.254.253.69])
 by sourceware.org (Postfix) with ESMTPS id 8AEE73858294
 for <gcc-patches@gcc.gnu.org>; Sun, 26 Jun 2022 12:55:07 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 8AEE73858294
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none)
 header.from=nextmovesoftware.com
Authentication-Results: sourceware.org;
 spf=pass smtp.mailfrom=nextmovesoftware.com
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
 d=nextmovesoftware.com; s=default; h=Content-Type:MIME-Version:Message-ID:
 Date:Subject:To:From:Sender:Reply-To:Cc:Content-Transfer-Encoding:Content-ID:
 Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc
 :Resent-Message-ID:In-Reply-To:References:List-Id:List-Help:List-Unsubscribe:
 List-Subscribe:List-Post:List-Owner:List-Archive;
 bh=pn/vIOUOa6c4Yorx5zsWl72QcH9gonBdrlNl5R0A3NI=; b=A0T3/5r95fGWyasQtV6w4f/akP
 mvnraJNgplV1F6OrRGCZiSPXGe/QvUJKWeO8BOuX+r+Jnmk2AUp6LqR3qu0lK/jF/MGIqOZ04MTex
 B7q7ERToGMGjL+U6iKZ4F9LEj6XCpdEze0UInhY5c5cmSezi44qOvNCVuaYApZc7qIGJu/cG2kSLQ
 PSoJsTm0CWyWb7Tvt168JRd5JIYd/PqcRYv6MYv+NFBWDh/1DMWWpe96/xcZvxGIKcTAUD7F6Dddm
 Q7GzzVIcFi3T/7ObmQ++JBC2MLwTM23qqn/QeSIyzhe5+HbUMJyJl+RIbRAEHApZlTTrq+Q8prmlz
 iC4VIjaQ==;
Received: from host86-130-134-60.range86-130.btcentralplus.com
 ([86.130.134.60]:55379 helo=Dell)
 by server.nextmovesoftware.com with esmtpsa (TLS1.2) tls
 TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2)
 (envelope-from <roger@nextmovesoftware.com>)
 id 1o5Rn8-0003K9-U0; Sun, 26 Jun 2022 08:55:07 -0400
From: "Roger Sayle" <roger@nextmovesoftware.com>
To: "'Jeff Law'" <jeffreyalaw@gmail.com>,
	<gcc-patches@gcc.gnu.org>
Subject: [PATCH take 2] middle-end: Support ABIs that pass FP values as wider
 integers.
Date: Sun, 26 Jun 2022 13:55:05 +0100
Message-ID: <028601d8895b$f60bfe40$e223fac0$@nextmovesoftware.com>
MIME-Version: 1.0
X-Mailer: Microsoft Outlook 16.0
Thread-Index: AdiJWkaQ36JX7r5WRT20RPjD+Vvxdw==
Content-Language: en-gb
X-AntiAbuse: This header was added to track abuse,
 please include it with any abuse report
X-AntiAbuse: Primary Hostname - server.nextmovesoftware.com
X-AntiAbuse: Original Domain - gcc.gnu.org
X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]
X-AntiAbuse: Sender Address Domain - nextmovesoftware.com
X-Get-Message-Sender-Via: server.nextmovesoftware.com: authenticated_id:
 roger@nextmovesoftware.com
X-Authenticated-Sender: server.nextmovesoftware.com:
 roger@nextmovesoftware.com
X-Source: 
X-Source-Args: 
X-Source-Dir: 
X-Spam-Status: No, score=-12.9 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE,
 SPF_PASS,
 TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>

Hi Jeff,
Sorry for the long delay getting back to this, but after deeper
investigation, it turns out that your tingling spider senses that
the original patch wasn't updating everywhere that was required
were spot on.  Although my nvptx testing showed no problems with -O2,
compiling the same tests with -O0 found several additional assertion
ICEs (exactly where you'd predicted they should/would be).

Here's a revised patch that updates five locations (up from the
previous two).  Finding any remaining locations (if any) might be
easier once folks are able to test things on their targets.


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures, and on nvptx-none, where it is
the middle-end portion of a pair of patches to allow the default ISA to
be advanced.  Ok for mainline?


2022-06-26  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR target/104489
	* calls.cc (precompute_register_parameters): Allow promotion
	of floating point values to be passed in wider integer modes.
	(expand_call): Allow floating point results to be returned in
	wider integer modes.
	* cfgexpand.cc (expand_value_return): Allow backends to promote
	a scalar floating point return value to a wider integer mode.
	* expr.cc (expand_expr_real_1) <expand_decl_rtl>: Likewise, allow
	backends to promote scalar FP PARM_DECLs to wider integer modes.
	* function.cc (assign_parm_setup_stack): Allow floating point
	values to be passed on the stack as wider integer modes.

Thanks in advance,
Roger
---

> -----Original Message-----
> From: Jeff Law <jeffreyalaw@gmail.com>
> Sent: 14 March 2022 15:30
> To: Roger Sayle <roger@nextmovesoftware.com>; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] middle-end: Support ABIs that pass FP values as wider
> integers.
> 
> 
> 
> On 2/9/2022 1:12 PM, Roger Sayle wrote:
> > This patch adds middle-end support for target ABIs that pass/return
> > floating point values in integer registers with precision wider than
> > the original FP mode.  An example, is the nvptx backend where 16-bit
> > HFmode registers are passed/returned as (promoted to) SImode registers.
> > Unfortunately, this currently falls foul of the various (recent?)
> > sanity checks that (very sensibly) prevent creating paradoxical
> > SUBREGs of floating point registers.  The approach below is to
> > explicitly perform the conversion/promotion in two steps, via an
> > integer mode of same precision as the floating point value.  So on
> > nvptx, 16-bit HFmode is initially converted to 16-bit HImode (using
> > SUBREG), then zero-extended to SImode, and likewise when going the
> > other way, parameters truncated to HImode then converted to HFmode
> > (using SUBREG).  These changes are localized to expand_value_return
> > and expanding DECL_RTL to support strange ABIs, rather than inside
> > convert_modes or gen_lowpart, as mismatched precision integer/FP
> > conversions should be explicit in the RTL, and these semantics not generally
> visible/implicit in user code.
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check with no new failures, and on nvptx-none, where it is
> > the middle-end portion of a pair of patches to allow the default ISA
> > to be advanced.  Ok for mainline?
> >
> > 2022-02-09  Roger Sayle  <roger@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> >         * cfgexpand.cc (expand_value_return): Allow backends to promote
> >         a scalar floating point return value to a wider integer mode.
> >         * expr.cc (expand_expr_real_1) [expand_decl_rtl]: Likewise, allow
> >         backends to promote scalar FP PARM_DECLs to wider integer modes.
> 
> Buried somewhere in our calling conventions code is the ability to pass around
> BLKmode objects in registers along with the ability to tune left vs right padding
> adjustments.   Much of this support grew out of the PA
> 32 bit SOM ABI.
> 
> While I think we could probably make those bits do what we want, I suspect the
> result will actually be uglier than what you've done here and I wouldn't be
> surprised if there was a performance hit as the code to handle those cases was
> pretty dumb in its implementation.
> 
> What I find rather surprising is the location of your changes -- they feel
> incomplete.  For example, you fix the callee side of returns in
> expand_value_return, but I don't analogous code for the caller side.
> 
> Similarly while you fix things for arguments in expand_expr_real_1, that's again
> just the callee side.  Don't you need to so something on the caller side too?
> 
> Jeff
>

diff --git a/gcc/calls.cc b/gcc/calls.cc
index f4e1299..06d8a95 100644
--- a/gcc/calls.cc
+++ b/gcc/calls.cc
@@ -992,11 +992,24 @@ precompute_register_parameters (int num_actuals, struct arg_data *args,
 	/* If we are to promote the function arg to a wider mode,
 	   do it now.  */
 
-	if (args[i].mode != TYPE_MODE (TREE_TYPE (args[i].tree_value)))
-	  args[i].value
-	    = convert_modes (args[i].mode,
-			     TYPE_MODE (TREE_TYPE (args[i].tree_value)),
-			     args[i].value, args[i].unsignedp);
+	machine_mode old_mode = TYPE_MODE (TREE_TYPE (args[i].tree_value));
+
+	/* Some ABIs require scalar floating point modes to be passed
+	   in a wider scalar integer mode.  We need to explicitly
+	   reinterpret to an integer mode of the correct precision
+	   before extending to the desired result.  */
+	if (SCALAR_INT_MODE_P (args[i].mode)
+	    && SCALAR_FLOAT_MODE_P (old_mode)
+	    && known_gt (GET_MODE_SIZE (args[i].mode),
+			 GET_MODE_SIZE (old_mode)))
+	  {
+	    scalar_int_mode imode = int_mode_for_mode (old_mode).require ();
+	    rtx tmp = force_reg (imode, gen_lowpart (imode, args[i].value));
+	    args[i].value = convert_modes (args[i].mode, imode, tmp, 1);
+	  }
+	else if (args[i].mode != old_mode)
+	  args[i].value = convert_modes (args[i].mode, old_mode,
+					 args[i].value, args[i].unsignedp);
 
 	/* If the value is a non-legitimate constant, force it into a
 	   pseudo now.  TLS symbols sometimes need a call to resolve.  */
@@ -3825,18 +3838,28 @@ expand_call (tree exp, rtx target, int ignore)
 	{
 	  tree type = rettype;
 	  int unsignedp = TYPE_UNSIGNED (type);
+	  machine_mode ret_mode = TYPE_MODE (type);
 	  machine_mode pmode;
 
 	  /* Ensure we promote as expected, and get the new unsignedness.  */
-	  pmode = promote_function_mode (type, TYPE_MODE (type), &unsignedp,
+	  pmode = promote_function_mode (type, ret_mode, &unsignedp,
 					 funtype, 1);
 	  gcc_assert (GET_MODE (target) == pmode);
 
-	  poly_uint64 offset = subreg_lowpart_offset (TYPE_MODE (type),
-						      GET_MODE (target));
-	  target = gen_rtx_SUBREG (TYPE_MODE (type), target, offset);
-	  SUBREG_PROMOTED_VAR_P (target) = 1;
-	  SUBREG_PROMOTED_SET (target, unsignedp);
+	  if (SCALAR_INT_MODE_P (pmode)
+	      && SCALAR_FLOAT_MODE_P (ret_mode)
+	      && known_gt (GET_MODE_SIZE (pmode), GET_MODE_SIZE (ret_mode)))
+	    {
+	      scalar_int_mode imode = int_mode_for_mode (ret_mode).require ();
+	      target = force_reg (imode, gen_lowpart (imode, target));
+	      target = gen_lowpart_SUBREG (ret_mode, target);
+	    }
+	  else
+	    {
+	      target = gen_lowpart_SUBREG (ret_mode, target);
+	      SUBREG_PROMOTED_VAR_P (target) = 1;
+	      SUBREG_PROMOTED_SET (target, unsignedp);
+	    }
 	}
 
       /* If size of args is variable or this was a constructor call for a stack
diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index bb33c1b..a9afec6 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -3720,7 +3720,22 @@ expand_value_return (rtx val)
         mode = promote_function_mode (type, old_mode, &unsignedp, funtype, 1);
 
       if (mode != old_mode)
-	val = convert_modes (mode, old_mode, val, unsignedp);
+	{
+	  /* Some ABIs require scalar floating point modes to be returned
+	     in a wider scalar integer mode.  We need to explicitly
+	     reinterpret to an integer mode of the correct precision
+	     before extending to the desired result.  */
+	  if (SCALAR_INT_MODE_P (mode)
+	      && SCALAR_FLOAT_MODE_P (old_mode)
+	      && known_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (old_mode)))
+	    {
+	      scalar_int_mode imode = int_mode_for_mode (old_mode).require ();
+	      val = force_reg (imode, gen_lowpart (imode, val));
+	      val = convert_modes (mode, imode, val, 1);
+	    }
+	  else
+	    val = convert_modes (mode, old_mode, val, unsignedp);
+	}
 
       if (GET_CODE (return_reg) == PARALLEL)
 	emit_group_load (return_reg, val, type, int_size_in_bytes (type));
diff --git a/gcc/expr.cc b/gcc/expr.cc
index c90cde3..c1ce49e 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -10771,6 +10771,19 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	    pmode = promote_ssa_mode (ssa_name, &unsignedp);
 	  gcc_assert (GET_MODE (decl_rtl) == pmode);
 
+	  /* Some ABIs require scalar floating point modes to be passed
+	     in a wider scalar integer mode.  We need to explicitly
+	     truncate to an integer mode of the correct precision before
+	     using a SUBREG to reinterpret as a floating point value.  */
+	  if (SCALAR_FLOAT_MODE_P (mode)
+	      && SCALAR_INT_MODE_P (pmode)
+	      && known_lt (GET_MODE_SIZE (mode), GET_MODE_SIZE (pmode)))
+	    {
+	      scalar_int_mode imode = int_mode_for_mode (mode).require ();
+	      temp = force_reg (imode, gen_lowpart (imode, decl_rtl));
+	      return gen_lowpart_SUBREG (mode, temp);
+	    }
+
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
 	  SUBREG_PROMOTED_VAR_P (temp) = 1;
 	  SUBREG_PROMOTED_SET (temp, unsignedp);
diff --git a/gcc/function.cc b/gcc/function.cc
index ad0096a..657b95d 100644
--- a/gcc/function.cc
+++ b/gcc/function.cc
@@ -3472,6 +3472,21 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
 
       emit_move_insn (tempreg, validize_mem (copy_rtx (data->entry_parm)));
 
+      /* Some ABIs require scalar floating point modes to be passed
+	 in a wider scalar integer mode.  We need to explicitly
+	 truncate to an integer mode of the correct precision before
+	 using a SUBREG to reinterpret as a floating point value.  */
+      if (SCALAR_FLOAT_MODE_P (data->nominal_mode)
+	  && SCALAR_INT_MODE_P (data->arg.mode)
+	  && known_lt (GET_MODE_SIZE (data->nominal_mode),
+		       GET_MODE_SIZE (data->arg.mode)))
+	{
+	  scalar_int_mode imode
+	    = int_mode_for_mode (data->nominal_mode).require ();
+	  tempreg = force_reg (imode, gen_lowpart (imode, tempreg));
+	  tempreg = gen_lowpart_SUBREG (data->nominal_mode, tempreg);
+	}
+
       push_to_sequence2 (all->first_conversion_insn, all->last_conversion_insn);
       to_conversion = true;