middle-end: Support ABIs that pass FP values as wider integers.

Message ID 057201d81df1$50523bc0$f0f6b340$@nextmovesoftware.com
State New
Headers
Series middle-end: Support ABIs that pass FP values as wider integers. |

Commit Message

Roger Sayle Feb. 9, 2022, 8:12 p.m. UTC
  This patch adds middle-end support for target ABIs that pass/return
floating point values in integer registers with precision wider than
the original FP mode.  An example, is the nvptx backend where 16-bit
HFmode registers are passed/returned as (promoted to) SImode registers.
Unfortunately, this currently falls foul of the various (recent?) sanity
checks that (very sensibly) prevent creating paradoxical SUBREGs of
floating point registers.  The approach below is to explicitly perform the
conversion/promotion in two steps, via an integer mode of same precision
as the floating point value.  So on nvptx, 16-bit HFmode is initially
converted to 16-bit HImode (using SUBREG), then zero-extended to SImode,
and likewise when going the other way, parameters truncated to HImode
then converted to HFmode (using SUBREG).  These changes are localized
to expand_value_return and expanding DECL_RTL to support strange ABIs,
rather than inside convert_modes or gen_lowpart, as mismatched
precision integer/FP conversions should be explicit in the RTL,
and these semantics not generally visible/implicit in user code.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures, and on nvptx-none, where it is
the middle-end portion of a pair of patches to allow the default ISA to
be advanced.  Ok for mainline?


2022-02-09  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
       * cfgexpand.cc (expand_value_return): Allow backends to promote
       a scalar floating point return value to a wider integer mode.
       * expr.cc (expand_expr_real_1) [expand_decl_rtl]: Likewise, allow
       backends to promote scalar FP PARM_DECLs to wider integer modes.


Thanks in advance,
Roger
--
  

Comments

Tobias Burnus Feb. 17, 2022, 2:35 p.m. UTC | #1
PING for this cfgexpand.cc + expr.cc change by Roger.

This is a pre-requisite for Roger's nvptx patch to avoid an ICE during
bootstrap:

* https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590250.html
   "[PATCH] nvptx: Back-end portion of a fix for PR target/104489."
   (see patch for additional reasoning for this patch)

* See also https://gcc.gnu.org/PR104489
    nvptx, sm_53: internal compiler error: in gen_rtx_SUBREG, at
emit-rtl.cc:1022

Thanks,

Tobias

On 09.02.22 21:12, Roger Sayle wrote:
> This patch adds middle-end support for target ABIs that pass/return
> floating point values in integer registers with precision wider than
> the original FP mode.  An example, is the nvptx backend where 16-bit
> HFmode registers are passed/returned as (promoted to) SImode registers.
> Unfortunately, this currently falls foul of the various (recent?) sanity
> checks that (very sensibly) prevent creating paradoxical SUBREGs of
> floating point registers.  The approach below is to explicitly perform the
> conversion/promotion in two steps, via an integer mode of same precision
> as the floating point value.  So on nvptx, 16-bit HFmode is initially
> converted to 16-bit HImode (using SUBREG), then zero-extended to SImode,
> and likewise when going the other way, parameters truncated to HImode
> then converted to HFmode (using SUBREG).  These changes are localized
> to expand_value_return and expanding DECL_RTL to support strange ABIs,
> rather than inside convert_modes or gen_lowpart, as mismatched
> precision integer/FP conversions should be explicit in the RTL,
> and these semantics not generally visible/implicit in user code.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures, and on nvptx-none, where it is
> the middle-end portion of a pair of patches to allow the default ISA to
> be advanced.  Ok for mainline?
>
>
> 2022-02-09  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * cfgexpand.cc (expand_value_return): Allow backends to promote
>         a scalar floating point return value to a wider integer mode.
>         * expr.cc (expand_expr_real_1) [expand_decl_rtl]: Likewise, allow
>         backends to promote scalar FP PARM_DECLs to wider integer modes.
>
>
> Thanks in advance,
> Roger
> --
>
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
  
Tom de Vries Feb. 22, 2022, 3:42 p.m. UTC | #2
On 2/9/22 21:12, Roger Sayle wrote:
> 
> This patch adds middle-end support for target ABIs that pass/return
> floating point values in integer registers with precision wider than
> the original FP mode.  An example, is the nvptx backend where 16-bit
> HFmode registers are passed/returned as (promoted to) SImode registers.
> Unfortunately, this currently falls foul of the various (recent?) sanity
> checks that (very sensibly) prevent creating paradoxical SUBREGs of
> floating point registers.  The approach below is to explicitly perform the
> conversion/promotion in two steps, via an integer mode of same precision
> as the floating point value.  So on nvptx, 16-bit HFmode is initially
> converted to 16-bit HImode (using SUBREG), then zero-extended to SImode,
> and likewise when going the other way, parameters truncated to HImode
> then converted to HFmode (using SUBREG).  These changes are localized
> to expand_value_return and expanding DECL_RTL to support strange ABIs,
> rather than inside convert_modes or gen_lowpart, as mismatched
> precision integer/FP conversions should be explicit in the RTL,
> and these semantics not generally visible/implicit in user code.
> 

Hi Roger,

I cannot comment on the patch, but I do wonder (after your "strange ABI" 
comment): did we actively decide on (or align to) a register passing ABI 
for HFmode, or has it merely been decided by the implementation of 
promote_arg:
...
static machine_mode
promote_arg (machine_mode mode, bool prototyped)
{
   if (!prototyped && mode == SFmode)
     /* K&R float promotion for unprototyped functions.  */
     mode = DFmode;
   else if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
     mode = SImode;

   return mode;
}
...

There may be a rationale why it's good to pass a HF as SI, but it's not 
documented there.

Anyway, I checked what cuda does for HF, and it passes a byte array:
...
.param .align 2 .b8 _Z5helloPj6__halfs_param_1[2],
...

So, I guess what I'm saying is I'd like to understand why we're having 
the HF -> SI promotion.

Thanks,
- Tom
  
Roger Sayle Feb. 22, 2022, 4:08 p.m. UTC | #3
Hi Tom,

I'll admit that I'd not myself considered the ABI issues when I initially proposed
experimental HFmode support for the nvptx backend, and was surprised when
I finally tracked down the source of the problem you'd reported: that libgcc
spots HFmode support exists and immediately starts passing/returning values
in this type.

The one precedent that I can point to is that LLVM's nvptx backend passes
HFmode values in SImode regs,   see https://reviews.llvm.org/D28540
Their motivation is that not all PTX ISAs support fp16, so for compatibility
with say sm_30/sm_35, fp16 values are treated like b16, i.e. HImode.
At this point, the nvptx ABI states that HImode values are passed as SImode,
so we end up with the interesting mismatch of HFmode<->SImode.
I guess the same thing affects host code, where an i386/x86 host that
doesn't support 16-bit floating point, can pass "unsigned short" values
to and from the accelerator, and likewise this HImode locally gets passed
in a wider (often WORD_MODE) integer types on most x86 ABIs.

My guess is that passing SFmode in DImode may have been supported
in older versions of GCC, before handling of SUBREGs was tightened up,
so this might be considered a regression.

Cheers,
Roger
--

> -----Original Message-----
> From: Tom de Vries <tdevries@suse.de>
> Sent: 22 February 2022 15:43
> To: Roger Sayle <roger@nextmovesoftware.com>; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] middle-end: Support ABIs that pass FP values as wider
> integers.
> 
> On 2/9/22 21:12, Roger Sayle wrote:
> >
> > This patch adds middle-end support for target ABIs that pass/return
> > floating point values in integer registers with precision wider than
> > the original FP mode.  An example, is the nvptx backend where 16-bit
> > HFmode registers are passed/returned as (promoted to) SImode registers.
> > Unfortunately, this currently falls foul of the various (recent?)
> > sanity checks that (very sensibly) prevent creating paradoxical
> > SUBREGs of floating point registers.  The approach below is to
> > explicitly perform the conversion/promotion in two steps, via an
> > integer mode of same precision as the floating point value.  So on
> > nvptx, 16-bit HFmode is initially converted to 16-bit HImode (using
> > SUBREG), then zero-extended to SImode, and likewise when going the
> > other way, parameters truncated to HImode then converted to HFmode
> > (using SUBREG).  These changes are localized to expand_value_return
> > and expanding DECL_RTL to support strange ABIs, rather than inside
> > convert_modes or gen_lowpart, as mismatched precision integer/FP
> > conversions should be explicit in the RTL, and these semantics not generally
> visible/implicit in user code.
> >
> 
> Hi Roger,
> 
> I cannot comment on the patch, but I do wonder (after your "strange ABI"
> comment): did we actively decide on (or align to) a register passing ABI for
> HFmode, or has it merely been decided by the implementation of
> promote_arg:
> ...
> static machine_mode
> promote_arg (machine_mode mode, bool prototyped) {
>    if (!prototyped && mode == SFmode)
>      /* K&R float promotion for unprototyped functions.  */
>      mode = DFmode;
>    else if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
>      mode = SImode;
> 
>    return mode;
> }
> ...
> 
> There may be a rationale why it's good to pass a HF as SI, but it's not
> documented there.
> 
> Anyway, I checked what cuda does for HF, and it passes a byte array:
> ...
> .param .align 2 .b8 _Z5helloPj6__halfs_param_1[2], ...
> 
> So, I guess what I'm saying is I'd like to understand why we're having the HF -> SI
> promotion.
> 
> Thanks,
> - Tom
  
Tom de Vries Feb. 22, 2022, 10:09 p.m. UTC | #4
On 2/22/22 17:08, Roger Sayle wrote:
> 
> Hi Tom,
> 
> I'll admit that I'd not myself considered the ABI issues when I initially proposed
> experimental HFmode support for the nvptx backend, and was surprised when
> I finally tracked down the source of the problem you'd reported: that libgcc
> spots HFmode support exists and immediately starts passing/returning values
> in this type.
> 
> The one precedent that I can point to is that LLVM's nvptx backend passes
> HFmode values in SImode regs,   see https://reviews.llvm.org/D28540

Interesting, thanks for the link.

> Their motivation is that not all PTX ISAs support fp16, so for compatibility
> with say sm_30/sm_35, fp16 values are treated like b16, i.e. HImode.
> At this point, the nvptx ABI states that HImode values are passed as SImode,
> so we end up with the interesting mismatch of HFmode<->SImode.

Indeed, that sounds plausible.

And IIUC, that also means that this leaves the door open for us to 
implement fp16 support for pre-sm_53 using b16 in a compatible way.

Then I think the current solution is OK, thanks for digging this up.

Thanks,
-Tom

> I guess the same thing affects host code, where an i386/x86 host that
> doesn't support 16-bit floating point, can pass "unsigned short" values
> to and from the accelerator, and likewise this HImode locally gets passed
> in a wider (often WORD_MODE) integer types on most x86 ABIs.
> 
> My guess is that passing SFmode in DImode may have been supported
> in older versions of GCC, before handling of SUBREGs was tightened up,
> so this might be considered a regression.
> 
> Cheers,
> Roger
> --
> 
>> -----Original Message-----
>> From: Tom de Vries <tdevries@suse.de>
>> Sent: 22 February 2022 15:43
>> To: Roger Sayle <roger@nextmovesoftware.com>; gcc-patches@gcc.gnu.org
>> Subject: Re: [PATCH] middle-end: Support ABIs that pass FP values as wider
>> integers.
>>
>> On 2/9/22 21:12, Roger Sayle wrote:
>>>
>>> This patch adds middle-end support for target ABIs that pass/return
>>> floating point values in integer registers with precision wider than
>>> the original FP mode.  An example, is the nvptx backend where 16-bit
>>> HFmode registers are passed/returned as (promoted to) SImode registers.
>>> Unfortunately, this currently falls foul of the various (recent?)
>>> sanity checks that (very sensibly) prevent creating paradoxical
>>> SUBREGs of floating point registers.  The approach below is to
>>> explicitly perform the conversion/promotion in two steps, via an
>>> integer mode of same precision as the floating point value.  So on
>>> nvptx, 16-bit HFmode is initially converted to 16-bit HImode (using
>>> SUBREG), then zero-extended to SImode, and likewise when going the
>>> other way, parameters truncated to HImode then converted to HFmode
>>> (using SUBREG).  These changes are localized to expand_value_return
>>> and expanding DECL_RTL to support strange ABIs, rather than inside
>>> convert_modes or gen_lowpart, as mismatched precision integer/FP
>>> conversions should be explicit in the RTL, and these semantics not generally
>> visible/implicit in user code.
>>>
>>
>> Hi Roger,
>>
>> I cannot comment on the patch, but I do wonder (after your "strange ABI"
>> comment): did we actively decide on (or align to) a register passing ABI for
>> HFmode, or has it merely been decided by the implementation of
>> promote_arg:
>> ...
>> static machine_mode
>> promote_arg (machine_mode mode, bool prototyped) {
>>     if (!prototyped && mode == SFmode)
>>       /* K&R float promotion for unprototyped functions.  */
>>       mode = DFmode;
>>     else if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
>>       mode = SImode;
>>
>>     return mode;
>> }
>> ...
>>
>> There may be a rationale why it's good to pass a HF as SI, but it's not
>> documented there.
>>
>> Anyway, I checked what cuda does for HF, and it passes a byte array:
>> ...
>> .param .align 2 .b8 _Z5helloPj6__halfs_param_1[2], ...
>>
>> So, I guess what I'm saying is I'd like to understand why we're having the HF -> SI
>> promotion.
>>
>> Thanks,
>> - Tom
>
  
Roger Sayle Feb. 22, 2022, 11:19 p.m. UTC | #5
>> Anyway, I checked what cuda does for HF, and it passes a byte array:
>>> .param .align 2 .b8 _Z5helloPj6__halfs_param_1[2], ...
> >
> > The one precedent that I can point to is that LLVM's nvptx backend passes
> > HFmode values in SImode regs,   see https://reviews.llvm.org/D28540
> 
> Interesting, thanks for the link.

In theory, GCC could also support -mfloat-abi=nvcc and -mfloat-abi=llvm
(much like other targets have -mfloat-abi=soft vs. -mfloat-abi=hard).
At this point getting any ABI supporting HFmode would be an improvement.

Roger
--
  
Tobias Burnus Feb. 23, 2022, 8:42 a.m. UTC | #6
PING**2 for the ME review or at least comments to that patch,
which fixes a build issue/ICE with nvptx (at least when bumping the
default -misa).

Patch: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.html
(for gcc/cfgexpand.cc + gcc/expr.cc)

(There is some discussion by Tom and Roger about the BE in the patch
thread, which only not relate to the ME patch. But there is no ME-patch
comment so far.)

Thanks,

Tobias

On 17.02.22 15:35, Tobias Burnus wrote:
> PING for this cfgexpand.cc + expr.cc change by Roger.
>
> This is a pre-requisite for Roger's nvptx patch to avoid an ICE during
> bootstrap:
>
> * https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590250.html
>   "[PATCH] nvptx: Back-end portion of a fix for PR target/104489."
>   (see patch for additional reasoning for this patch)
> * See also https://gcc.gnu.org/PR104489
>    nvptx, sm_53: internal compiler error: in gen_rtx_SUBREG, at
> emit-rtl.cc:1022
>
> Thanks,
>
> Tobias
>
> On 09.02.22 21:12, Roger Sayle wrote:
>> This patch adds middle-end support for target ABIs that pass/return
>> floating point values in integer registers with precision wider than
>> the original FP mode.  An example, is the nvptx backend where 16-bit
>> HFmode registers are passed/returned as (promoted to) SImode registers.
>> Unfortunately, this currently falls foul of the various (recent?) sanity
>> checks that (very sensibly) prevent creating paradoxical SUBREGs of
>> floating point registers.  The approach below is to explicitly
>> perform the
>> conversion/promotion in two steps, via an integer mode of same precision
>> as the floating point value.  So on nvptx, 16-bit HFmode is initially
>> converted to 16-bit HImode (using SUBREG), then zero-extended to SImode,
>> and likewise when going the other way, parameters truncated to HImode
>> then converted to HFmode (using SUBREG).  These changes are localized
>> to expand_value_return and expanding DECL_RTL to support strange ABIs,
>> rather than inside convert_modes or gen_lowpart, as mismatched
>> precision integer/FP conversions should be explicit in the RTL,
>> and these semantics not generally visible/implicit in user code.
>>
>> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
>> and make -k check with no new failures, and on nvptx-none, where it is
>> the middle-end portion of a pair of patches to allow the default ISA to
>> be advanced.  Ok for mainline?
>>
>>
>> 2022-02-09  Roger Sayle  <roger@nextmovesoftware.com>
>>
>> gcc/ChangeLog
>>         * cfgexpand.cc (expand_value_return): Allow backends to promote
>>         a scalar floating point return value to a wider integer mode.
>>         * expr.cc (expand_expr_real_1) [expand_decl_rtl]: Likewise,
>> allow
>>         backends to promote scalar FP PARM_DECLs to wider integer modes.
>>
>>
>> Thanks in advance,
>> Roger
>> --
>>
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
  
Tobias Burnus Feb. 28, 2022, 9:41 a.m. UTC | #7
Ping**3

On 23.02.22 09:42, Tobias Burnus wrote:
> PING**2 for the ME review or at least comments to that patch,
> which fixes a build issue/ICE with nvptx
>
> Patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.html
> (for gcc/cfgexpand.cc + gcc/expr.cc)
>
> (There is some discussion by Tom and Roger about the BE in the patch
> thread, which only not relate to the ME patch. But there is no
> ME-patch comment so far.)
The related BE patch has been already committed, but to be effective, it
needs the ME patch.
>
> Thanks,
>
> Tobias
>
> On 17.02.22 15:35, Tobias Burnus wrote:
>> PING for this cfgexpand.cc + expr.cc change by Roger.
>>
>> This is a pre-requisite for Roger's nvptx patch to avoid an ICE
>> during bootstrap:
>>
>> * https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590250.html
>>   "[PATCH] nvptx: Back-end portion of a fix for PR target/104489."
>>   (see patch for additional reasoning for this patch)
>> * See also https://gcc.gnu.org/PR104489
>>    nvptx, sm_53: internal compiler error: in gen_rtx_SUBREG, at
>> emit-rtl.cc:1022
>>
>> Thanks,
>>
>> Tobias
>>
>> On 09.02.22 21:12, Roger Sayle wrote:
>>> This patch adds middle-end support for target ABIs that pass/return
>>> floating point values in integer registers with precision wider than
>>> the original FP mode.  An example, is the nvptx backend where 16-bit
>>> HFmode registers are passed/returned as (promoted to) SImode registers.
>>> Unfortunately, this currently falls foul of the various (recent?)
>>> sanity
>>> checks that (very sensibly) prevent creating paradoxical SUBREGs of
>>> floating point registers.  The approach below is to explicitly
>>> perform the
>>> conversion/promotion in two steps, via an integer mode of same
>>> precision
>>> as the floating point value.  So on nvptx, 16-bit HFmode is initially
>>> converted to 16-bit HImode (using SUBREG), then zero-extended to
>>> SImode,
>>> and likewise when going the other way, parameters truncated to HImode
>>> then converted to HFmode (using SUBREG).  These changes are localized
>>> to expand_value_return and expanding DECL_RTL to support strange ABIs,
>>> rather than inside convert_modes or gen_lowpart, as mismatched
>>> precision integer/FP conversions should be explicit in the RTL,
>>> and these semantics not generally visible/implicit in user code.
>>>
>>> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
>>> and make -k check with no new failures, and on nvptx-none, where it is
>>> the middle-end portion of a pair of patches to allow the default ISA to
>>> be advanced.  Ok for mainline?
>>>
>>>
>>> 2022-02-09  Roger Sayle  <roger@nextmovesoftware.com>
>>>
>>> gcc/ChangeLog
>>>         * cfgexpand.cc (expand_value_return): Allow backends to promote
>>>         a scalar floating point return value to a wider integer mode.
>>>         * expr.cc (expand_expr_real_1) [expand_decl_rtl]: Likewise,
>>> allow
>>>         backends to promote scalar FP PARM_DECLs to wider integer
>>> modes.
>>>
>>>
>>> Thanks in advance,
>>> Roger
>>> --
>>>
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
  
Richard Biener Feb. 28, 2022, 12:54 p.m. UTC | #8
On Mon, 28 Feb 2022, Tobias Burnus wrote:

> Ping**3
> 
> On 23.02.22 09:42, Tobias Burnus wrote:
> > PING**2 for the ME review or at least comments to that patch,
> > which fixes a build issue/ICE with nvptx
> >
> > Patch:
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.html
> > (for gcc/cfgexpand.cc + gcc/expr.cc)
> >
> > (There is some discussion by Tom and Roger about the BE in the patch
> > thread, which only not relate to the ME patch. But there is no
> > ME-patch comment so far.)
> The related BE patch has been already committed, but to be effective, it
> needs the ME patch.

I'm not sure I'm qualified to review this - maybe Richard is.

Richard.

> >
> > Thanks,
> >
> > Tobias
> >
> > On 17.02.22 15:35, Tobias Burnus wrote:
> >> PING for this cfgexpand.cc + expr.cc change by Roger.
> >>
> >> This is a pre-requisite for Roger's nvptx patch to avoid an ICE
> >> during bootstrap:
> >>
> >> * https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590250.html
> >>   "[PATCH] nvptx: Back-end portion of a fix for PR target/104489."
> >>   (see patch for additional reasoning for this patch)
> >> * See also https://gcc.gnu.org/PR104489
> >>    nvptx, sm_53: internal compiler error: in gen_rtx_SUBREG, at
> >> emit-rtl.cc:1022
> >>
> >> Thanks,
> >>
> >> Tobias
> >>
> >> On 09.02.22 21:12, Roger Sayle wrote:
> >>> This patch adds middle-end support for target ABIs that pass/return
> >>> floating point values in integer registers with precision wider than
> >>> the original FP mode.  An example, is the nvptx backend where 16-bit
> >>> HFmode registers are passed/returned as (promoted to) SImode registers.
> >>> Unfortunately, this currently falls foul of the various (recent?)
> >>> sanity
> >>> checks that (very sensibly) prevent creating paradoxical SUBREGs of
> >>> floating point registers.  The approach below is to explicitly
> >>> perform the
> >>> conversion/promotion in two steps, via an integer mode of same
> >>> precision
> >>> as the floating point value.  So on nvptx, 16-bit HFmode is initially
> >>> converted to 16-bit HImode (using SUBREG), then zero-extended to
> >>> SImode,
> >>> and likewise when going the other way, parameters truncated to HImode
> >>> then converted to HFmode (using SUBREG).  These changes are localized
> >>> to expand_value_return and expanding DECL_RTL to support strange ABIs,
> >>> rather than inside convert_modes or gen_lowpart, as mismatched
> >>> precision integer/FP conversions should be explicit in the RTL,
> >>> and these semantics not generally visible/implicit in user code.
> >>>
> >>> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> >>> and make -k check with no new failures, and on nvptx-none, where it is
> >>> the middle-end portion of a pair of patches to allow the default ISA to
> >>> be advanced.  Ok for mainline?
> >>>
> >>>
> >>> 2022-02-09  Roger Sayle  <roger@nextmovesoftware.com>
> >>>
> >>> gcc/ChangeLog
> >>>         * cfgexpand.cc (expand_value_return): Allow backends to promote
> >>>         a scalar floating point return value to a wider integer mode.
> >>>         * expr.cc (expand_expr_real_1) [expand_decl_rtl]: Likewise,
> >>> allow
> >>>         backends to promote scalar FP PARM_DECLs to wider integer
> >>> modes.
> >>>
> >>>
> >>> Thanks in advance,
> >>> Roger
> >>> --
> >>>
> -----------------
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634
> München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas
> Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht
> München, HRB 106955
> 
>
  
Jeff Law March 2, 2022, 7:18 p.m. UTC | #9
On 2/28/2022 5:54 AM, Richard Biener via Gcc-patches wrote:
> On Mon, 28 Feb 2022, Tobias Burnus wrote:
>
>> Ping**3
>>
>> On 23.02.22 09:42, Tobias Burnus wrote:
>>> PING**2 for the ME review or at least comments to that patch,
>>> which fixes a build issue/ICE with nvptx
>>>
>>> Patch:
>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.html
>>> (for gcc/cfgexpand.cc + gcc/expr.cc)
>>>
>>> (There is some discussion by Tom and Roger about the BE in the patch
>>> thread, which only not relate to the ME patch. But there is no
>>> ME-patch comment so far.)
>> The related BE patch has been already committed, but to be effective, it
>> needs the ME patch.
> I'm not sure I'm qualified to review this - maybe Richard is.
I'd initially ignored the patch as it didn't seem a good fit for stage4, 
subsequent messages changed my mind about it, but I never went back to 
take a deeper look at Roger's patch.

jeff
  
Tom de Vries March 14, 2022, 8:06 a.m. UTC | #10
On 3/2/22 20:18, Jeff Law via Gcc-patches wrote:
> 
> 
> On 2/28/2022 5:54 AM, Richard Biener via Gcc-patches wrote:
>> On Mon, 28 Feb 2022, Tobias Burnus wrote:
>>
>>> Ping**3
>>>
>>> On 23.02.22 09:42, Tobias Burnus wrote:
>>>> PING**2 for the ME review or at least comments to that patch,
>>>> which fixes a build issue/ICE with nvptx
>>>>
>>>> Patch:
>>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.html
>>>> (for gcc/cfgexpand.cc + gcc/expr.cc)
>>>>
>>>> (There is some discussion by Tom and Roger about the BE in the patch
>>>> thread, which only not relate to the ME patch. But there is no
>>>> ME-patch comment so far.)
>>> The related BE patch has been already committed, but to be effective, it
>>> needs the ME patch.
>> I'm not sure I'm qualified to review this - maybe Richard is.
> I'd initially ignored the patch as it didn't seem a good fit for stage4, 
> subsequent messages changed my mind about it, but I never went back to 
> take a deeper look at Roger's patch.

Ping.

[ FWIW, I'd appreciate it if a response came before the end of stage 4, 
such that I have some time left to deal with fallout in case the patch 
is not approved. ]

Thanks,
- Tom
  
Roger Sayle March 14, 2022, 8:39 a.m. UTC | #11
I thought I'd add a few comments that might help reviewers of this patch.
Most importantly, this patch should be completely safe, as both changes
only trigger with FLOAT vs INT size mismatches which currently both ICE in
the compiler a few lines later.  Admittedly, this indicates something odd 
about a target's choice of ABI, but one alternative might be to issue a
"sorry, unimplemented" error message rather than ICE, perhaps with a
TODO or FIXME comment that support for mixed size FP/integer ABIs be
added in future.

The only (other?) possible point of contention is the (arbitrary) decision that
when passing floating point values in a larger integer register, the code is
hardwired to use zero-extension.  This in theory could be turned into a target
hook to support sign-extension, but given these are padding bits, zero seems
appropriate. [On x86_64, if passing DFmode argument in a V2DFmode vector,
say, it seems reasonable to use movq and zero the high bits].

The final point is that at the moment, the only affected target is nvptx-none,
as I don't believe any other backend specifies an ABI that requires passing 
floating point values in wider integer registers.  Having said that, most targets
don't yet support HFmode, and their ABI specifications haven't yet been
updated as what to do in that eventuality.  If they choose to treat these like
HImode, they'll run into the same issues, as most ABIs pass HImode in 
wider word_mode registers.

I hope this helps.  If folks could air their concerns out loud, I can try my best
to address those worries.

Many thanks in advance (and thanks to Tobias and Tom for pushing for this).
Roger
--

> -----Original Message-----
> From: Tom de Vries <tdevries@suse.de>
> Sent: 14 March 2022 08:06
> To: Jeff Law <jeffreyalaw@gmail.com>; Richard Biener <rguenther@suse.de>;
> Tobias Burnus <tobias@codesourcery.com>
> Cc: richard.sandiford@arm.com; Jeff Law <law@redhat.com>; gcc-
> patches@gcc.gnu.org; Roger Sayle <roger@nextmovesoftware.com>
> Subject: PING**4 - [PATCH] middle-end: Support ABIs that pass FP values as
> wider integers.
> 
> On 3/2/22 20:18, Jeff Law via Gcc-patches wrote:
> >
> >
> > On 2/28/2022 5:54 AM, Richard Biener via Gcc-patches wrote:
> >> On Mon, 28 Feb 2022, Tobias Burnus wrote:
> >>
> >>> Ping**3
> >>>
> >>> On 23.02.22 09:42, Tobias Burnus wrote:
> >>>> PING**2 for the ME review or at least comments to that patch, which
> >>>> fixes a build issue/ICE with nvptx
> >>>>
> >>>> Patch:
> >>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.html
> >>>> (for gcc/cfgexpand.cc + gcc/expr.cc)
> >>>>
> >>>> (There is some discussion by Tom and Roger about the BE in the
> >>>> patch thread, which only not relate to the ME patch. But there is
> >>>> no ME-patch comment so far.)
> >>> The related BE patch has been already committed, but to be
> >>> effective, it needs the ME patch.
> >> I'm not sure I'm qualified to review this - maybe Richard is.
> > I'd initially ignored the patch as it didn't seem a good fit for
> > stage4, subsequent messages changed my mind about it, but I never went
> > back to take a deeper look at Roger's patch.
> 
> Ping.
> 
> [ FWIW, I'd appreciate it if a response came before the end of stage 4, such that
> I have some time left to deal with fallout in case the patch is not approved. ]
> 
> Thanks,
> - Tom
diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index d51af2e..c377f16 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -3715,7 +3715,22 @@ expand_value_return (rtx val)
         mode = promote_function_mode (type, old_mode, &unsignedp, funtype, 1);
 
       if (mode != old_mode)
-	val = convert_modes (mode, old_mode, val, unsignedp);
+	{
+	  /* Some ABIs require scalar floating point modes to be returned
+	     in a wider scalar integer mode.  We need to explicitly
+	     reinterpret to an integer mode of the correct precision
+	     before extending to the desired result.  */
+	  if (SCALAR_INT_MODE_P (mode)
+	      && SCALAR_FLOAT_MODE_P (old_mode)
+	      && known_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (old_mode)))
+	    {
+	      scalar_int_mode imode = int_mode_for_mode (old_mode).require ();
+	      val = force_reg (imode, gen_lowpart (imode, val));
+	      val = convert_modes (mode, imode, val, 1);
+	    }
+	  else
+	    val = convert_modes (mode, old_mode, val, unsignedp);
+	}
 
       if (GET_CODE (return_reg) == PARALLEL)
 	emit_group_load (return_reg, val, type, int_size_in_bytes (type));
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 35e4029..e4efdcd 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -10674,6 +10674,19 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	    pmode = promote_ssa_mode (ssa_name, &unsignedp);
 	  gcc_assert (GET_MODE (decl_rtl) == pmode);
 
+	  /* Some ABIs require scalar floating point modes to be passed
+	     in a wider scalar integer mode.  We need to explicitly
+	     truncate to an integer mode of the correct precision before
+	     using a SUBREG to reinterpret as a floating point value.  */
+	  if (SCALAR_FLOAT_MODE_P (mode)
+	      && SCALAR_INT_MODE_P (pmode)
+	      && known_lt (GET_MODE_SIZE (mode), GET_MODE_SIZE (pmode)))
+	    {
+	      scalar_int_mode imode = int_mode_for_mode (mode).require ();
+	      temp = force_reg (imode, gen_lowpart (imode, decl_rtl));
+	      return gen_lowpart_SUBREG (mode, temp);
+	    }
+
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
 	  SUBREG_PROMOTED_VAR_P (temp) = 1;
 	  SUBREG_PROMOTED_SET (temp, unsignedp);
  
Richard Biener March 14, 2022, 9:09 a.m. UTC | #12
On Mon, 14 Mar 2022, Roger Sayle wrote:

> 
> I thought I'd add a few comments that might help reviewers of this patch.
> Most importantly, this patch should be completely safe, as both changes
> only trigger with FLOAT vs INT size mismatches which currently both ICE in
> the compiler a few lines later.  Admittedly, this indicates something odd 
> about a target's choice of ABI, but one alternative might be to issue a
> "sorry, unimplemented" error message rather than ICE, perhaps with a
> TODO or FIXME comment that support for mixed size FP/integer ABIs be
> added in future.
> 
> The only (other?) possible point of contention is the (arbitrary) decision that
> when passing floating point values in a larger integer register, the code is
> hardwired to use zero-extension.  This in theory could be turned into a target
> hook to support sign-extension, but given these are padding bits, zero seems
> appropriate. [On x86_64, if passing DFmode argument in a V2DFmode vector,
> say, it seems reasonable to use movq and zero the high bits].
> 
> The final point is that at the moment, the only affected target is nvptx-none,
> as I don't believe any other backend specifies an ABI that requires passing 
> floating point values in wider integer registers.  Having said that, most targets
> don't yet support HFmode, and their ABI specifications haven't yet been
> updated as what to do in that eventuality.  If they choose to treat these like
> HImode, they'll run into the same issues, as most ABIs pass HImode in 
> wider word_mode registers.
> 
> I hope this helps.  If folks could air their concerns out loud, I can try my best
> to address those worries.

First of all I'm not familiar with the ABI related code as all, so I
refrained from commenting.  But now I've looked closer (still unfamiliar 
code).

I suppose there's targets passing SFmode in a SImode GPR and DFmode
in a DImode GPR (all soft-float targets?), so that already works somehow.
Why does nvptx choose DImode for SFmode passing then?  Can't it choose
(subreg:SI di:..) or so?  Does it require zero-extending to DImode
on the caller side?  It seems your expand_expr_real_1 code does
not rely on that?  So, why does nvptx function_arg hook (?) insist
on returning a DImode reg for an SFmode argument rather than
an SImode subreg of that?

Richard.

> 
> Many thanks in advance (and thanks to Tobias and Tom for pushing for this).
> Roger
> --
> 
> > -----Original Message-----
> > From: Tom de Vries <tdevries@suse.de>
> > Sent: 14 March 2022 08:06
> > To: Jeff Law <jeffreyalaw@gmail.com>; Richard Biener <rguenther@suse.de>;
> > Tobias Burnus <tobias@codesourcery.com>
> > Cc: richard.sandiford@arm.com; Jeff Law <law@redhat.com>; gcc-
> > patches@gcc.gnu.org; Roger Sayle <roger@nextmovesoftware.com>
> > Subject: PING**4 - [PATCH] middle-end: Support ABIs that pass FP values as
> > wider integers.
> > 
> > On 3/2/22 20:18, Jeff Law via Gcc-patches wrote:
> > >
> > >
> > > On 2/28/2022 5:54 AM, Richard Biener via Gcc-patches wrote:
> > >> On Mon, 28 Feb 2022, Tobias Burnus wrote:
> > >>
> > >>> Ping**3
> > >>>
> > >>> On 23.02.22 09:42, Tobias Burnus wrote:
> > >>>> PING**2 for the ME review or at least comments to that patch, which
> > >>>> fixes a build issue/ICE with nvptx
> > >>>>
> > >>>> Patch:
> > >>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.html
> > >>>> (for gcc/cfgexpand.cc + gcc/expr.cc)
> > >>>>
> > >>>> (There is some discussion by Tom and Roger about the BE in the
> > >>>> patch thread, which only not relate to the ME patch. But there is
> > >>>> no ME-patch comment so far.)
> > >>> The related BE patch has been already committed, but to be
> > >>> effective, it needs the ME patch.
> > >> I'm not sure I'm qualified to review this - maybe Richard is.
> > > I'd initially ignored the patch as it didn't seem a good fit for
> > > stage4, subsequent messages changed my mind about it, but I never went
> > > back to take a deeper look at Roger's patch.
> > 
> > Ping.
> > 
> > [ FWIW, I'd appreciate it if a response came before the end of stage 4, such that
> > I have some time left to deal with fallout in case the patch is not approved. ]
> > 
> > Thanks,
> > - Tom
>
  
Roger Sayle March 14, 2022, 9:46 a.m. UTC | #13
Hi Richard,
Thanks for asking.
The issue is with 16-bit floating point HFmode, where clang on nvptx passes
HFmode
values in SImode registers, and for binary compatibility GCC needs to do the
same.
Their motivation is that for compatibility with older GPUs and the x86_64
host, HFmode
parameters are treated like "unsigned int".  Hence, libgcc functions like
__sqrthf behave
as though declared (are binary compatible with) "unsigned short
__sqrthf(unsigned short)".
As you point out, this also greatly simplifies soft-float, as both ABIs
remain the same.

Alas, the subreg approach doesn't work as we'd end up with (subreg:SI
(subreg:HI (reg:HF)),
though technically that is what this patch does, inserting a pair of
conversion instructions.
Until recently (subreg:SI (reg:HF)) did work, but that support was
removed/cleaned-up,
as we need sensible subreg semantics in the RTL passes.   The proposed patch
simply
adds support back in the minimal places where changing FP/non-FP and
precision at the
same time is required: argument passing and return values.

Hopefully this answers your question.  An alternate viewpoint might be "is
there a reason for
GCC not to be able to support targets with slightly wacky parameter passing
conventions".

Thanks for thinking about this.
Roger
--

> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: 14 March 2022 09:09
> To: Roger Sayle <roger@nextmovesoftware.com>
> Cc: 'Tom de Vries' <tdevries@suse.de>; 'Jeff Law' <jeffreyalaw@gmail.com>;
> 'Tobias Burnus' <tobias@codesourcery.com>; richard.sandiford@arm.com;
'Jeff
> Law' <law@redhat.com>; gcc-patches@gcc.gnu.org
> Subject: RE: PING**4 - [PATCH] middle-end: Support ABIs that pass FP
values as
> wider integers.
> 
> On Mon, 14 Mar 2022, Roger Sayle wrote:
> 
> >
> > I thought I'd add a few comments that might help reviewers of this
patch.
> > Most importantly, this patch should be completely safe, as both
> > changes only trigger with FLOAT vs INT size mismatches which currently
> > both ICE in the compiler a few lines later.  Admittedly, this
> > indicates something odd about a target's choice of ABI, but one
> > alternative might be to issue a "sorry, unimplemented" error message
> > rather than ICE, perhaps with a TODO or FIXME comment that support for
> > mixed size FP/integer ABIs be added in future.
> >
> > The only (other?) possible point of contention is the (arbitrary)
> > decision that when passing floating point values in a larger integer
> > register, the code is hardwired to use zero-extension.  This in theory
> > could be turned into a target hook to support sign-extension, but
> > given these are padding bits, zero seems appropriate. [On x86_64, if
> > passing DFmode argument in a V2DFmode vector, say, it seems reasonable
to
> use movq and zero the high bits].
> >
> > The final point is that at the moment, the only affected target is
> > nvptx-none, as I don't believe any other backend specifies an ABI that
> > requires passing floating point values in wider integer registers.
> > Having said that, most targets don't yet support HFmode, and their ABI
> > specifications haven't yet been updated as what to do in that
> > eventuality.  If they choose to treat these like HImode, they'll run
> > into the same issues, as most ABIs pass HImode in wider word_mode
registers.
> >
> > I hope this helps.  If folks could air their concerns out loud, I can
> > try my best to address those worries.
> 
> First of all I'm not familiar with the ABI related code as all, so I
refrained from
> commenting.  But now I've looked closer (still unfamiliar code).
> 
> I suppose there's targets passing SFmode in a SImode GPR and DFmode in a
> DImode GPR (all soft-float targets?), so that already works somehow.
> Why does nvptx choose DImode for SFmode passing then?  Can't it choose
> (subreg:SI di:..) or so?  Does it require zero-extending to DImode on the
caller
> side?  It seems your expand_expr_real_1 code does not rely on that?  So,
why
> does nvptx function_arg hook (?) insist on returning a DImode reg for an
SFmode
> argument rather than an SImode subreg of that?
> 
> Richard.
> 
> >
> > Many thanks in advance (and thanks to Tobias and Tom for pushing for
this).
> > Roger
> > --
> >
> > > -----Original Message-----
> > > From: Tom de Vries <tdevries@suse.de>
> > > Sent: 14 March 2022 08:06
> > > To: Jeff Law <jeffreyalaw@gmail.com>; Richard Biener
> > > <rguenther@suse.de>; Tobias Burnus <tobias@codesourcery.com>
> > > Cc: richard.sandiford@arm.com; Jeff Law <law@redhat.com>; gcc-
> > > patches@gcc.gnu.org; Roger Sayle <roger@nextmovesoftware.com>
> > > Subject: PING**4 - [PATCH] middle-end: Support ABIs that pass FP
> > > values as wider integers.
> > >
> > > On 3/2/22 20:18, Jeff Law via Gcc-patches wrote:
> > > >
> > > >
> > > > On 2/28/2022 5:54 AM, Richard Biener via Gcc-patches wrote:
> > > >> On Mon, 28 Feb 2022, Tobias Burnus wrote:
> > > >>
> > > >>> Ping**3
> > > >>>
> > > >>> On 23.02.22 09:42, Tobias Burnus wrote:
> > > >>>> PING**2 for the ME review or at least comments to that patch,
> > > >>>> which fixes a build issue/ICE with nvptx
> > > >>>>
> > > >>>> Patch:
> > > >>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.
> > > >>>> html (for gcc/cfgexpand.cc + gcc/expr.cc)
> > > >>>>
> > > >>>> (There is some discussion by Tom and Roger about the BE in the
> > > >>>> patch thread, which only not relate to the ME patch. But there
> > > >>>> is no ME-patch comment so far.)
> > > >>> The related BE patch has been already committed, but to be
> > > >>> effective, it needs the ME patch.
> > > >> I'm not sure I'm qualified to review this - maybe Richard is.
> > > > I'd initially ignored the patch as it didn't seem a good fit for
> > > > stage4, subsequent messages changed my mind about it, but I never
> > > > went back to take a deeper look at Roger's patch.
> > >
> > > Ping.
> > >
> > > [ FWIW, I'd appreciate it if a response came before the end of stage
> > > 4, such that I have some time left to deal with fallout in case the
> > > patch is not approved. ]
> > >
> > > Thanks,
> > > - Tom
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
  
Richard Biener March 14, 2022, 10:14 a.m. UTC | #14
On Mon, 14 Mar 2022, Roger Sayle wrote:

> 
> Hi Richard,
> Thanks for asking.
> The issue is with 16-bit floating point HFmode, where clang on nvptx passes
> HFmode
> values in SImode registers, and for binary compatibility GCC needs to do the
> same.
> Their motivation is that for compatibility with older GPUs and the x86_64
> host, HFmode
> parameters are treated like "unsigned int".  Hence, libgcc functions like
> __sqrthf behave
> as though declared (are binary compatible with) "unsigned short
> __sqrthf(unsigned short)".
> As you point out, this also greatly simplifies soft-float, as both ABIs
> remain the same.
> 
> Alas, the subreg approach doesn't work as we'd end up with (subreg:SI
> (subreg:HI (reg:HF)),
> though technically that is what this patch does, inserting a pair of
> conversion instructions.

So what does FUNCTION_ARG return for the HFmode parameters?  The above
seems backwards - it should be (subreg:HF (reg:SI 0)), no?

Or are we talking about the caller side?  Or the return value case
(in which case I ask the same question wrt FUNCTION_VALUE)?

Does nvptx use the promotion hooks in those cases?

> Until recently (subreg:SI (reg:HF)) did work, but that support was
> removed/cleaned-up,
> as we need sensible subreg semantics in the RTL passes.   The proposed patch
> simply
> adds support back in the minimal places where changing FP/non-FP and
> precision at the
> same time is required: argument passing and return values.
> 
> Hopefully this answers your question.  An alternate viewpoint might be "is
> there a reason for
> GCC not to be able to support targets with slightly wacky parameter passing
> conventions".
> 
> Thanks for thinking about this.
> Roger
> --
> 
> > -----Original Message-----
> > From: Richard Biener <rguenther@suse.de>
> > Sent: 14 March 2022 09:09
> > To: Roger Sayle <roger@nextmovesoftware.com>
> > Cc: 'Tom de Vries' <tdevries@suse.de>; 'Jeff Law' <jeffreyalaw@gmail.com>;
> > 'Tobias Burnus' <tobias@codesourcery.com>; richard.sandiford@arm.com;
> 'Jeff
> > Law' <law@redhat.com>; gcc-patches@gcc.gnu.org
> > Subject: RE: PING**4 - [PATCH] middle-end: Support ABIs that pass FP
> values as
> > wider integers.
> > 
> > On Mon, 14 Mar 2022, Roger Sayle wrote:
> > 
> > >
> > > I thought I'd add a few comments that might help reviewers of this
> patch.
> > > Most importantly, this patch should be completely safe, as both
> > > changes only trigger with FLOAT vs INT size mismatches which currently
> > > both ICE in the compiler a few lines later.  Admittedly, this
> > > indicates something odd about a target's choice of ABI, but one
> > > alternative might be to issue a "sorry, unimplemented" error message
> > > rather than ICE, perhaps with a TODO or FIXME comment that support for
> > > mixed size FP/integer ABIs be added in future.
> > >
> > > The only (other?) possible point of contention is the (arbitrary)
> > > decision that when passing floating point values in a larger integer
> > > register, the code is hardwired to use zero-extension.  This in theory
> > > could be turned into a target hook to support sign-extension, but
> > > given these are padding bits, zero seems appropriate. [On x86_64, if
> > > passing DFmode argument in a V2DFmode vector, say, it seems reasonable
> to
> > use movq and zero the high bits].
> > >
> > > The final point is that at the moment, the only affected target is
> > > nvptx-none, as I don't believe any other backend specifies an ABI that
> > > requires passing floating point values in wider integer registers.
> > > Having said that, most targets don't yet support HFmode, and their ABI
> > > specifications haven't yet been updated as what to do in that
> > > eventuality.  If they choose to treat these like HImode, they'll run
> > > into the same issues, as most ABIs pass HImode in wider word_mode
> registers.
> > >
> > > I hope this helps.  If folks could air their concerns out loud, I can
> > > try my best to address those worries.
> > 
> > First of all I'm not familiar with the ABI related code as all, so I
> refrained from
> > commenting.  But now I've looked closer (still unfamiliar code).
> > 
> > I suppose there's targets passing SFmode in a SImode GPR and DFmode in a
> > DImode GPR (all soft-float targets?), so that already works somehow.
> > Why does nvptx choose DImode for SFmode passing then?  Can't it choose
> > (subreg:SI di:..) or so?  Does it require zero-extending to DImode on the
> caller
> > side?  It seems your expand_expr_real_1 code does not rely on that?  So,
> why
> > does nvptx function_arg hook (?) insist on returning a DImode reg for an
> SFmode
> > argument rather than an SImode subreg of that?
> > 
> > Richard.
> > 
> > >
> > > Many thanks in advance (and thanks to Tobias and Tom for pushing for
> this).
> > > Roger
> > > --
> > >
> > > > -----Original Message-----
> > > > From: Tom de Vries <tdevries@suse.de>
> > > > Sent: 14 March 2022 08:06
> > > > To: Jeff Law <jeffreyalaw@gmail.com>; Richard Biener
> > > > <rguenther@suse.de>; Tobias Burnus <tobias@codesourcery.com>
> > > > Cc: richard.sandiford@arm.com; Jeff Law <law@redhat.com>; gcc-
> > > > patches@gcc.gnu.org; Roger Sayle <roger@nextmovesoftware.com>
> > > > Subject: PING**4 - [PATCH] middle-end: Support ABIs that pass FP
> > > > values as wider integers.
> > > >
> > > > On 3/2/22 20:18, Jeff Law via Gcc-patches wrote:
> > > > >
> > > > >
> > > > > On 2/28/2022 5:54 AM, Richard Biener via Gcc-patches wrote:
> > > > >> On Mon, 28 Feb 2022, Tobias Burnus wrote:
> > > > >>
> > > > >>> Ping**3
> > > > >>>
> > > > >>> On 23.02.22 09:42, Tobias Burnus wrote:
> > > > >>>> PING**2 for the ME review or at least comments to that patch,
> > > > >>>> which fixes a build issue/ICE with nvptx
> > > > >>>>
> > > > >>>> Patch:
> > > > >>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.
> > > > >>>> html (for gcc/cfgexpand.cc + gcc/expr.cc)
> > > > >>>>
> > > > >>>> (There is some discussion by Tom and Roger about the BE in the
> > > > >>>> patch thread, which only not relate to the ME patch. But there
> > > > >>>> is no ME-patch comment so far.)
> > > > >>> The related BE patch has been already committed, but to be
> > > > >>> effective, it needs the ME patch.
> > > > >> I'm not sure I'm qualified to review this - maybe Richard is.
> > > > > I'd initially ignored the patch as it didn't seem a good fit for
> > > > > stage4, subsequent messages changed my mind about it, but I never
> > > > > went back to take a deeper look at Roger's patch.
> > > >
> > > > Ping.
> > > >
> > > > [ FWIW, I'd appreciate it if a response came before the end of stage
> > > > 4, such that I have some time left to deal with fallout in case the
> > > > patch is not approved. ]
> > > >
> > > > Thanks,
> > > > - Tom
> > >
> > 
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
> 
>
  
Roger Sayle March 14, 2022, 11:49 a.m. UTC | #15
Hi Richard,

> The above seems backwards - it should be (subreg:HF (reg:SI 0)), no?

Line 949 of emit-rtl.cc in the definition of "validate_subreg" contains:

  /* Subregs involving floating point modes are not allowed to
     change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
     (subreg:SI (reg:DF) 0) isn't.  */
  else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
    {
      if (! (known_eq (isize, osize)

Hence, we ICE on line 1022 of emit-rtl.cc in gen_rtx_SUBREG, that asserts
validate_subreg, when called with omode=HFmode and imode=SImode.

Cheers,
Roger
--

> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: 14 March 2022 10:15
> To: Roger Sayle <roger@nextmovesoftware.com>
> Cc: 'Tom de Vries' <tdevries@suse.de>; 'Jeff Law' <jeffreyalaw@gmail.com>;
> 'Tobias Burnus' <tobias@codesourcery.com>; richard.sandiford@arm.com;
'Jeff
> Law' <law@redhat.com>; gcc-patches@gcc.gnu.org
> Subject: RE: PING**4 - [PATCH] middle-end: Support ABIs that pass FP
values as
> wider integers.
> 
> On Mon, 14 Mar 2022, Roger Sayle wrote:
> 
> >
> > Hi Richard,
> > Thanks for asking.
> > The issue is with 16-bit floating point HFmode, where clang on nvptx
> > passes HFmode values in SImode registers, and for binary compatibility
> > GCC needs to do the same.
> > Their motivation is that for compatibility with older GPUs and the
> > x86_64 host, HFmode parameters are treated like "unsigned int".
> > Hence, libgcc functions like __sqrthf behave as though declared (are
> > binary compatible with) "unsigned short __sqrthf(unsigned short)".
> > As you point out, this also greatly simplifies soft-float, as both
> > ABIs remain the same.
> >
> > Alas, the subreg approach doesn't work as we'd end up with (subreg:SI
> > (subreg:HI (reg:HF)), though technically that is what this patch does,
> > inserting a pair of conversion instructions.
> 
> So what does FUNCTION_ARG return for the HFmode parameters?  The above
> seems backwards - it should be (subreg:HF (reg:SI 0)), no?
> 
> Or are we talking about the caller side?  Or the return value case (in
which case I
> ask the same question wrt FUNCTION_VALUE)?
> 
> Does nvptx use the promotion hooks in those cases?
> 
> > Until recently (subreg:SI (reg:HF)) did work, but that support was
> > removed/cleaned-up,
> > as we need sensible subreg semantics in the RTL passes.   The proposed
patch
> > simply
> > adds support back in the minimal places where changing FP/non-FP and
> > precision at the same time is required: argument passing and return
> > values.
> >
> > Hopefully this answers your question.  An alternate viewpoint might be
> > "is there a reason for GCC not to be able to support targets with
> > slightly wacky parameter passing conventions".
> >
> > Thanks for thinking about this.
> > Roger
> > --
> >
> > > -----Original Message-----
> > > From: Richard Biener <rguenther@suse.de>
> > > Sent: 14 March 2022 09:09
> > > To: Roger Sayle <roger@nextmovesoftware.com>
> > > Cc: 'Tom de Vries' <tdevries@suse.de>; 'Jeff Law'
> > > <jeffreyalaw@gmail.com>; 'Tobias Burnus' <tobias@codesourcery.com>;
> > > richard.sandiford@arm.com;
> > 'Jeff
> > > Law' <law@redhat.com>; gcc-patches@gcc.gnu.org
> > > Subject: RE: PING**4 - [PATCH] middle-end: Support ABIs that pass FP
> > values as
> > > wider integers.
> > >
> > > On Mon, 14 Mar 2022, Roger Sayle wrote:
> > >
> > > >
> > > > I thought I'd add a few comments that might help reviewers of this
> > patch.
> > > > Most importantly, this patch should be completely safe, as both
> > > > changes only trigger with FLOAT vs INT size mismatches which
> > > > currently both ICE in the compiler a few lines later.  Admittedly,
> > > > this indicates something odd about a target's choice of ABI, but
> > > > one alternative might be to issue a "sorry, unimplemented" error
> > > > message rather than ICE, perhaps with a TODO or FIXME comment that
> > > > support for mixed size FP/integer ABIs be added in future.
> > > >
> > > > The only (other?) possible point of contention is the (arbitrary)
> > > > decision that when passing floating point values in a larger
> > > > integer register, the code is hardwired to use zero-extension.
> > > > This in theory could be turned into a target hook to support
> > > > sign-extension, but given these are padding bits, zero seems
> > > > appropriate. [On x86_64, if passing DFmode argument in a V2DFmode
> > > > vector, say, it seems reasonable
> > to
> > > use movq and zero the high bits].
> > > >
> > > > The final point is that at the moment, the only affected target is
> > > > nvptx-none, as I don't believe any other backend specifies an ABI
> > > > that requires passing floating point values in wider integer
registers.
> > > > Having said that, most targets don't yet support HFmode, and their
> > > > ABI specifications haven't yet been updated as what to do in that
> > > > eventuality.  If they choose to treat these like HImode, they'll
> > > > run into the same issues, as most ABIs pass HImode in wider
> > > > word_mode
> > registers.
> > > >
> > > > I hope this helps.  If folks could air their concerns out loud, I
> > > > can try my best to address those worries.
> > >
> > > First of all I'm not familiar with the ABI related code as all, so I
> > refrained from
> > > commenting.  But now I've looked closer (still unfamiliar code).
> > >
> > > I suppose there's targets passing SFmode in a SImode GPR and DFmode
> > > in a DImode GPR (all soft-float targets?), so that already works
somehow.
> > > Why does nvptx choose DImode for SFmode passing then?  Can't it
> > > choose (subreg:SI di:..) or so?  Does it require zero-extending to
> > > DImode on the
> > caller
> > > side?  It seems your expand_expr_real_1 code does not rely on that?
> > > So,
> > why
> > > does nvptx function_arg hook (?) insist on returning a DImode reg
> > > for an
> > SFmode
> > > argument rather than an SImode subreg of that?
> > >
> > > Richard.
> > >
> > > >
> > > > Many thanks in advance (and thanks to Tobias and Tom for pushing
> > > > for
> > this).
> > > > Roger
> > > > --
> > > >
> > > > > -----Original Message-----
> > > > > From: Tom de Vries <tdevries@suse.de>
> > > > > Sent: 14 March 2022 08:06
> > > > > To: Jeff Law <jeffreyalaw@gmail.com>; Richard Biener
> > > > > <rguenther@suse.de>; Tobias Burnus <tobias@codesourcery.com>
> > > > > Cc: richard.sandiford@arm.com; Jeff Law <law@redhat.com>; gcc-
> > > > > patches@gcc.gnu.org; Roger Sayle <roger@nextmovesoftware.com>
> > > > > Subject: PING**4 - [PATCH] middle-end: Support ABIs that pass FP
> > > > > values as wider integers.
> > > > >
> > > > > On 3/2/22 20:18, Jeff Law via Gcc-patches wrote:
> > > > > >
> > > > > >
> > > > > > On 2/28/2022 5:54 AM, Richard Biener via Gcc-patches wrote:
> > > > > >> On Mon, 28 Feb 2022, Tobias Burnus wrote:
> > > > > >>
> > > > > >>> Ping**3
> > > > > >>>
> > > > > >>> On 23.02.22 09:42, Tobias Burnus wrote:
> > > > > >>>> PING**2 for the ME review or at least comments to that
> > > > > >>>> patch, which fixes a build issue/ICE with nvptx
> > > > > >>>>
> > > > > >>>> Patch:
> > > > > >>>>
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.
> > > > > >>>> html (for gcc/cfgexpand.cc + gcc/expr.cc)
> > > > > >>>>
> > > > > >>>> (There is some discussion by Tom and Roger about the BE in
> > > > > >>>> the patch thread, which only not relate to the ME patch.
> > > > > >>>> But there is no ME-patch comment so far.)
> > > > > >>> The related BE patch has been already committed, but to be
> > > > > >>> effective, it needs the ME patch.
> > > > > >> I'm not sure I'm qualified to review this - maybe Richard is.
> > > > > > I'd initially ignored the patch as it didn't seem a good fit
> > > > > > for stage4, subsequent messages changed my mind about it, but
> > > > > > I never went back to take a deeper look at Roger's patch.
> > > > >
> > > > > Ping.
> > > > >
> > > > > [ FWIW, I'd appreciate it if a response came before the end of
> > > > > stage 4, such that I have some time left to deal with fallout in
> > > > > case the patch is not approved. ]
> > > > >
> > > > > Thanks,
> > > > > - Tom
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> > > Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
> >
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
  
Richard Biener March 14, 2022, 1:27 p.m. UTC | #16
On Mon, 14 Mar 2022, Roger Sayle wrote:

> 
> Hi Richard,
> 
> > The above seems backwards - it should be (subreg:HF (reg:SI 0)), no?
> 
> Line 949 of emit-rtl.cc in the definition of "validate_subreg" contains:
> 
>   /* Subregs involving floating point modes are not allowed to
>      change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
>      (subreg:SI (reg:DF) 0) isn't.  */
>   else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
>     {
>       if (! (known_eq (isize, osize)
> 
> Hence, we ICE on line 1022 of emit-rtl.cc in gen_rtx_SUBREG, that asserts
> validate_subreg, when called with omode=HFmode and imode=SImode.

Yes, which is why I think the target should claim argument passing
happens in reg:HI.  But as said, I'm not very familiar with argument
passing internals.  But still the patch looks quite bolted on.

For the expand_value_return I have to guess that 'val' has HFmode,
the DECL_RESULT also has HFmode(?), promote_function_mode then
returns SImode which isn't supported.  So make it return HImode?

> Cheers,
> Roger
> --
> 
> > -----Original Message-----
> > From: Richard Biener <rguenther@suse.de>
> > Sent: 14 March 2022 10:15
> > To: Roger Sayle <roger@nextmovesoftware.com>
> > Cc: 'Tom de Vries' <tdevries@suse.de>; 'Jeff Law' <jeffreyalaw@gmail.com>;
> > 'Tobias Burnus' <tobias@codesourcery.com>; richard.sandiford@arm.com;
> 'Jeff
> > Law' <law@redhat.com>; gcc-patches@gcc.gnu.org
> > Subject: RE: PING**4 - [PATCH] middle-end: Support ABIs that pass FP
> values as
> > wider integers.
> > 
> > On Mon, 14 Mar 2022, Roger Sayle wrote:
> > 
> > >
> > > Hi Richard,
> > > Thanks for asking.
> > > The issue is with 16-bit floating point HFmode, where clang on nvptx
> > > passes HFmode values in SImode registers, and for binary compatibility
> > > GCC needs to do the same.
> > > Their motivation is that for compatibility with older GPUs and the
> > > x86_64 host, HFmode parameters are treated like "unsigned int".
> > > Hence, libgcc functions like __sqrthf behave as though declared (are
> > > binary compatible with) "unsigned short __sqrthf(unsigned short)".
> > > As you point out, this also greatly simplifies soft-float, as both
> > > ABIs remain the same.
> > >
> > > Alas, the subreg approach doesn't work as we'd end up with (subreg:SI
> > > (subreg:HI (reg:HF)), though technically that is what this patch does,
> > > inserting a pair of conversion instructions.
> > 
> > So what does FUNCTION_ARG return for the HFmode parameters?  The above
> > seems backwards - it should be (subreg:HF (reg:SI 0)), no?
> > 
> > Or are we talking about the caller side?  Or the return value case (in
> which case I
> > ask the same question wrt FUNCTION_VALUE)?
> > 
> > Does nvptx use the promotion hooks in those cases?
> > 
> > > Until recently (subreg:SI (reg:HF)) did work, but that support was
> > > removed/cleaned-up,
> > > as we need sensible subreg semantics in the RTL passes.   The proposed
> patch
> > > simply
> > > adds support back in the minimal places where changing FP/non-FP and
> > > precision at the same time is required: argument passing and return
> > > values.
> > >
> > > Hopefully this answers your question.  An alternate viewpoint might be
> > > "is there a reason for GCC not to be able to support targets with
> > > slightly wacky parameter passing conventions".
> > >
> > > Thanks for thinking about this.
> > > Roger
> > > --
> > >
> > > > -----Original Message-----
> > > > From: Richard Biener <rguenther@suse.de>
> > > > Sent: 14 March 2022 09:09
> > > > To: Roger Sayle <roger@nextmovesoftware.com>
> > > > Cc: 'Tom de Vries' <tdevries@suse.de>; 'Jeff Law'
> > > > <jeffreyalaw@gmail.com>; 'Tobias Burnus' <tobias@codesourcery.com>;
> > > > richard.sandiford@arm.com;
> > > 'Jeff
> > > > Law' <law@redhat.com>; gcc-patches@gcc.gnu.org
> > > > Subject: RE: PING**4 - [PATCH] middle-end: Support ABIs that pass FP
> > > values as
> > > > wider integers.
> > > >
> > > > On Mon, 14 Mar 2022, Roger Sayle wrote:
> > > >
> > > > >
> > > > > I thought I'd add a few comments that might help reviewers of this
> > > patch.
> > > > > Most importantly, this patch should be completely safe, as both
> > > > > changes only trigger with FLOAT vs INT size mismatches which
> > > > > currently both ICE in the compiler a few lines later.  Admittedly,
> > > > > this indicates something odd about a target's choice of ABI, but
> > > > > one alternative might be to issue a "sorry, unimplemented" error
> > > > > message rather than ICE, perhaps with a TODO or FIXME comment that
> > > > > support for mixed size FP/integer ABIs be added in future.
> > > > >
> > > > > The only (other?) possible point of contention is the (arbitrary)
> > > > > decision that when passing floating point values in a larger
> > > > > integer register, the code is hardwired to use zero-extension.
> > > > > This in theory could be turned into a target hook to support
> > > > > sign-extension, but given these are padding bits, zero seems
> > > > > appropriate. [On x86_64, if passing DFmode argument in a V2DFmode
> > > > > vector, say, it seems reasonable
> > > to
> > > > use movq and zero the high bits].
> > > > >
> > > > > The final point is that at the moment, the only affected target is
> > > > > nvptx-none, as I don't believe any other backend specifies an ABI
> > > > > that requires passing floating point values in wider integer
> registers.
> > > > > Having said that, most targets don't yet support HFmode, and their
> > > > > ABI specifications haven't yet been updated as what to do in that
> > > > > eventuality.  If they choose to treat these like HImode, they'll
> > > > > run into the same issues, as most ABIs pass HImode in wider
> > > > > word_mode
> > > registers.
> > > > >
> > > > > I hope this helps.  If folks could air their concerns out loud, I
> > > > > can try my best to address those worries.
> > > >
> > > > First of all I'm not familiar with the ABI related code as all, so I
> > > refrained from
> > > > commenting.  But now I've looked closer (still unfamiliar code).
> > > >
> > > > I suppose there's targets passing SFmode in a SImode GPR and DFmode
> > > > in a DImode GPR (all soft-float targets?), so that already works
> somehow.
> > > > Why does nvptx choose DImode for SFmode passing then?  Can't it
> > > > choose (subreg:SI di:..) or so?  Does it require zero-extending to
> > > > DImode on the
> > > caller
> > > > side?  It seems your expand_expr_real_1 code does not rely on that?
> > > > So,
> > > why
> > > > does nvptx function_arg hook (?) insist on returning a DImode reg
> > > > for an
> > > SFmode
> > > > argument rather than an SImode subreg of that?
> > > >
> > > > Richard.
> > > >
> > > > >
> > > > > Many thanks in advance (and thanks to Tobias and Tom for pushing
> > > > > for
> > > this).
> > > > > Roger
> > > > > --
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Tom de Vries <tdevries@suse.de>
> > > > > > Sent: 14 March 2022 08:06
> > > > > > To: Jeff Law <jeffreyalaw@gmail.com>; Richard Biener
> > > > > > <rguenther@suse.de>; Tobias Burnus <tobias@codesourcery.com>
> > > > > > Cc: richard.sandiford@arm.com; Jeff Law <law@redhat.com>; gcc-
> > > > > > patches@gcc.gnu.org; Roger Sayle <roger@nextmovesoftware.com>
> > > > > > Subject: PING**4 - [PATCH] middle-end: Support ABIs that pass FP
> > > > > > values as wider integers.
> > > > > >
> > > > > > On 3/2/22 20:18, Jeff Law via Gcc-patches wrote:
> > > > > > >
> > > > > > >
> > > > > > > On 2/28/2022 5:54 AM, Richard Biener via Gcc-patches wrote:
> > > > > > >> On Mon, 28 Feb 2022, Tobias Burnus wrote:
> > > > > > >>
> > > > > > >>> Ping**3
> > > > > > >>>
> > > > > > >>> On 23.02.22 09:42, Tobias Burnus wrote:
> > > > > > >>>> PING**2 for the ME review or at least comments to that
> > > > > > >>>> patch, which fixes a build issue/ICE with nvptx
> > > > > > >>>>
> > > > > > >>>> Patch:
> > > > > > >>>>
> https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.
> > > > > > >>>> html (for gcc/cfgexpand.cc + gcc/expr.cc)
> > > > > > >>>>
> > > > > > >>>> (There is some discussion by Tom and Roger about the BE in
> > > > > > >>>> the patch thread, which only not relate to the ME patch.
> > > > > > >>>> But there is no ME-patch comment so far.)
> > > > > > >>> The related BE patch has been already committed, but to be
> > > > > > >>> effective, it needs the ME patch.
> > > > > > >> I'm not sure I'm qualified to review this - maybe Richard is.
> > > > > > > I'd initially ignored the patch as it didn't seem a good fit
> > > > > > > for stage4, subsequent messages changed my mind about it, but
> > > > > > > I never went back to take a deeper look at Roger's patch.
> > > > > >
> > > > > > Ping.
> > > > > >
> > > > > > [ FWIW, I'd appreciate it if a response came before the end of
> > > > > > stage 4, such that I have some time left to deal with fallout in
> > > > > > case the patch is not approved. ]
> > > > > >
> > > > > > Thanks,
> > > > > > - Tom
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguenther@suse.de>
> > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> > > > Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
> > >
> > >
> > 
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
> 
>
  
Roger Sayle March 14, 2022, 2:30 p.m. UTC | #17
Hi Richard,
> Yes, which is why I think the target should claim argument passing happens
in reg:HI.

Unfortunately, this hits another "feature" of the nvptx backend; it's a 

/* Implement TARGET_MODES_TIEABLE_P.  */
bool nvptx_modes_tieable_p (machine_mode, machine_mode) { return false; }
/* Implement TARGET_CAN_CHANGE_MODE_CLASS.  */
bool nvptx_can_change_mode_class (machine_mode, machine_mode, reg_class_t) {
return false; }
/* Implement TARGET_TRULY_NOOP_TRUNCATION.  */
bool nvptx_truly_noop_truncation (poly_uint64, poly_uint64) { return false;
}

Basically, HImode is considered as a different register (bank) to SImode,
and requires
explicit move instructions to move data from HImode to SImode, and back,
unlike
most targets that can simply re-interpret the contents of GPRs.

Passing an argument as HImode would be incompatible with the requirement to
pass as SImode.

Cheers,
Roger
--

> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: 14 March 2022 13:28
> To: Roger Sayle <roger@nextmovesoftware.com>
> Cc: 'Tom de Vries' <tdevries@suse.de>; 'Jeff Law' <jeffreyalaw@gmail.com>;
> 'Tobias Burnus' <tobias@codesourcery.com>; richard.sandiford@arm.com;
'Jeff
> Law' <law@redhat.com>; gcc-patches@gcc.gnu.org
> Subject: RE: PING**4 - [PATCH] middle-end: Support ABIs that pass FP
values as
> wider integers.
> 
> On Mon, 14 Mar 2022, Roger Sayle wrote:
> 
> >
> > Hi Richard,
> >
> > > The above seems backwards - it should be (subreg:HF (reg:SI 0)), no?
> >
> > Line 949 of emit-rtl.cc in the definition of "validate_subreg" contains:
> >
> >   /* Subregs involving floating point modes are not allowed to
> >      change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
> >      (subreg:SI (reg:DF) 0) isn't.  */
> >   else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
> >     {
> >       if (! (known_eq (isize, osize)
> >
> > Hence, we ICE on line 1022 of emit-rtl.cc in gen_rtx_SUBREG, that
> > asserts validate_subreg, when called with omode=HFmode and
> imode=SImode.
> 
> Yes, which is why I think the target should claim argument passing happens
in
> reg:HI.  But as said, I'm not very familiar with argument passing
internals.  But
> still the patch looks quite bolted on.
> 
> For the expand_value_return I have to guess that 'val' has HFmode, the
> DECL_RESULT also has HFmode(?), promote_function_mode then returns
> SImode which isn't supported.  So make it return HImode?
> 
> > Cheers,
> > Roger
> > --
> >
> > > -----Original Message-----
> > > From: Richard Biener <rguenther@suse.de>
> > > Sent: 14 March 2022 10:15
> > > To: Roger Sayle <roger@nextmovesoftware.com>
> > > Cc: 'Tom de Vries' <tdevries@suse.de>; 'Jeff Law'
> > > <jeffreyalaw@gmail.com>; 'Tobias Burnus' <tobias@codesourcery.com>;
> > > richard.sandiford@arm.com;
> > 'Jeff
> > > Law' <law@redhat.com>; gcc-patches@gcc.gnu.org
> > > Subject: RE: PING**4 - [PATCH] middle-end: Support ABIs that pass FP
> > values as
> > > wider integers.
> > >
> > > On Mon, 14 Mar 2022, Roger Sayle wrote:
> > >
> > > >
> > > > Hi Richard,
> > > > Thanks for asking.
> > > > The issue is with 16-bit floating point HFmode, where clang on
> > > > nvptx passes HFmode values in SImode registers, and for binary
> > > > compatibility GCC needs to do the same.
> > > > Their motivation is that for compatibility with older GPUs and the
> > > > x86_64 host, HFmode parameters are treated like "unsigned int".
> > > > Hence, libgcc functions like __sqrthf behave as though declared
> > > > (are binary compatible with) "unsigned short __sqrthf(unsigned
short)".
> > > > As you point out, this also greatly simplifies soft-float, as both
> > > > ABIs remain the same.
> > > >
> > > > Alas, the subreg approach doesn't work as we'd end up with
> > > > (subreg:SI (subreg:HI (reg:HF)), though technically that is what
> > > > this patch does, inserting a pair of conversion instructions.
> > >
> > > So what does FUNCTION_ARG return for the HFmode parameters?  The
> > > above seems backwards - it should be (subreg:HF (reg:SI 0)), no?
> > >
> > > Or are we talking about the caller side?  Or the return value case
> > > (in
> > which case I
> > > ask the same question wrt FUNCTION_VALUE)?
> > >
> > > Does nvptx use the promotion hooks in those cases?
> > >
> > > > Until recently (subreg:SI (reg:HF)) did work, but that support was
> > > > removed/cleaned-up,
> > > > as we need sensible subreg semantics in the RTL passes.   The
proposed
> > patch
> > > > simply
> > > > adds support back in the minimal places where changing FP/non-FP
> > > > and precision at the same time is required: argument passing and
> > > > return values.
> > > >
> > > > Hopefully this answers your question.  An alternate viewpoint
> > > > might be "is there a reason for GCC not to be able to support
> > > > targets with slightly wacky parameter passing conventions".
> > > >
> > > > Thanks for thinking about this.
> > > > Roger
> > > > --
> > > >
> > > > > -----Original Message-----
> > > > > From: Richard Biener <rguenther@suse.de>
> > > > > Sent: 14 March 2022 09:09
> > > > > To: Roger Sayle <roger@nextmovesoftware.com>
> > > > > Cc: 'Tom de Vries' <tdevries@suse.de>; 'Jeff Law'
> > > > > <jeffreyalaw@gmail.com>; 'Tobias Burnus'
> > > > > <tobias@codesourcery.com>; richard.sandiford@arm.com;
> > > > 'Jeff
> > > > > Law' <law@redhat.com>; gcc-patches@gcc.gnu.org
> > > > > Subject: RE: PING**4 - [PATCH] middle-end: Support ABIs that
> > > > > pass FP
> > > > values as
> > > > > wider integers.
> > > > >
> > > > > On Mon, 14 Mar 2022, Roger Sayle wrote:
> > > > >
> > > > > >
> > > > > > I thought I'd add a few comments that might help reviewers of
> > > > > > this
> > > > patch.
> > > > > > Most importantly, this patch should be completely safe, as
> > > > > > both changes only trigger with FLOAT vs INT size mismatches
> > > > > > which currently both ICE in the compiler a few lines later.
> > > > > > Admittedly, this indicates something odd about a target's
> > > > > > choice of ABI, but one alternative might be to issue a "sorry,
> > > > > > unimplemented" error message rather than ICE, perhaps with a
> > > > > > TODO or FIXME comment that support for mixed size FP/integer
ABIs
> be added in future.
> > > > > >
> > > > > > The only (other?) possible point of contention is the
> > > > > > (arbitrary) decision that when passing floating point values
> > > > > > in a larger integer register, the code is hardwired to use
zero-extension.
> > > > > > This in theory could be turned into a target hook to support
> > > > > > sign-extension, but given these are padding bits, zero seems
> > > > > > appropriate. [On x86_64, if passing DFmode argument in a
> > > > > > V2DFmode vector, say, it seems reasonable
> > > > to
> > > > > use movq and zero the high bits].
> > > > > >
> > > > > > The final point is that at the moment, the only affected
> > > > > > target is nvptx-none, as I don't believe any other backend
> > > > > > specifies an ABI that requires passing floating point values
> > > > > > in wider integer
> > registers.
> > > > > > Having said that, most targets don't yet support HFmode, and
> > > > > > their ABI specifications haven't yet been updated as what to
> > > > > > do in that eventuality.  If they choose to treat these like
> > > > > > HImode, they'll run into the same issues, as most ABIs pass
> > > > > > HImode in wider word_mode
> > > > registers.
> > > > > >
> > > > > > I hope this helps.  If folks could air their concerns out
> > > > > > loud, I can try my best to address those worries.
> > > > >
> > > > > First of all I'm not familiar with the ABI related code as all,
> > > > > so I
> > > > refrained from
> > > > > commenting.  But now I've looked closer (still unfamiliar code).
> > > > >
> > > > > I suppose there's targets passing SFmode in a SImode GPR and
> > > > > DFmode in a DImode GPR (all soft-float targets?), so that
> > > > > already works
> > somehow.
> > > > > Why does nvptx choose DImode for SFmode passing then?  Can't it
> > > > > choose (subreg:SI di:..) or so?  Does it require zero-extending
> > > > > to DImode on the
> > > > caller
> > > > > side?  It seems your expand_expr_real_1 code does not rely on
that?
> > > > > So,
> > > > why
> > > > > does nvptx function_arg hook (?) insist on returning a DImode
> > > > > reg for an
> > > > SFmode
> > > > > argument rather than an SImode subreg of that?
> > > > >
> > > > > Richard.
> > > > >
> > > > > >
> > > > > > Many thanks in advance (and thanks to Tobias and Tom for
> > > > > > pushing for
> > > > this).
> > > > > > Roger
> > > > > > --
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Tom de Vries <tdevries@suse.de>
> > > > > > > Sent: 14 March 2022 08:06
> > > > > > > To: Jeff Law <jeffreyalaw@gmail.com>; Richard Biener
> > > > > > > <rguenther@suse.de>; Tobias Burnus <tobias@codesourcery.com>
> > > > > > > Cc: richard.sandiford@arm.com; Jeff Law <law@redhat.com>;
> > > > > > > gcc- patches@gcc.gnu.org; Roger Sayle
> > > > > > > <roger@nextmovesoftware.com>
> > > > > > > Subject: PING**4 - [PATCH] middle-end: Support ABIs that
> > > > > > > pass FP values as wider integers.
> > > > > > >
> > > > > > > On 3/2/22 20:18, Jeff Law via Gcc-patches wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > On 2/28/2022 5:54 AM, Richard Biener via Gcc-patches wrote:
> > > > > > > >> On Mon, 28 Feb 2022, Tobias Burnus wrote:
> > > > > > > >>
> > > > > > > >>> Ping**3
> > > > > > > >>>
> > > > > > > >>> On 23.02.22 09:42, Tobias Burnus wrote:
> > > > > > > >>>> PING**2 for the ME review or at least comments to that
> > > > > > > >>>> patch, which fixes a build issue/ICE with nvptx
> > > > > > > >>>>
> > > > > > > >>>> Patch:
> > > > > > > >>>>
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.
> > > > > > > >>>> html (for gcc/cfgexpand.cc + gcc/expr.cc)
> > > > > > > >>>>
> > > > > > > >>>> (There is some discussion by Tom and Roger about the BE
> > > > > > > >>>> in the patch thread, which only not relate to the ME
patch.
> > > > > > > >>>> But there is no ME-patch comment so far.)
> > > > > > > >>> The related BE patch has been already committed, but to
> > > > > > > >>> be effective, it needs the ME patch.
> > > > > > > >> I'm not sure I'm qualified to review this - maybe Richard
is.
> > > > > > > > I'd initially ignored the patch as it didn't seem a good
> > > > > > > > fit for stage4, subsequent messages changed my mind about
> > > > > > > > it, but I never went back to take a deeper look at Roger's
patch.
> > > > > > >
> > > > > > > Ping.
> > > > > > >
> > > > > > > [ FWIW, I'd appreciate it if a response came before the end
> > > > > > > of stage 4, such that I have some time left to deal with
> > > > > > > fallout in case the patch is not approved. ]
> > > > > > >
> > > > > > > Thanks,
> > > > > > > - Tom
> > > > > >
> > > > >
> > > > > --
> > > > > Richard Biener <rguenther@suse.de> SUSE Software Solutions
> > > > > Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF:
> > > > > Ivo Totev; HRB 36809 (AG Nuernberg)
> > > >
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> > > Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
> >
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
  
Richard Biener March 14, 2022, 2:40 p.m. UTC | #18
On Mon, 14 Mar 2022, Roger Sayle wrote:

> 
> Hi Richard,
> > Yes, which is why I think the target should claim argument passing happens
> in reg:HI.
> 
> Unfortunately, this hits another "feature" of the nvptx backend; it's a 
> 
> /* Implement TARGET_MODES_TIEABLE_P.  */
> bool nvptx_modes_tieable_p (machine_mode, machine_mode) { return false; }
> /* Implement TARGET_CAN_CHANGE_MODE_CLASS.  */
> bool nvptx_can_change_mode_class (machine_mode, machine_mode, reg_class_t) {
> return false; }
> /* Implement TARGET_TRULY_NOOP_TRUNCATION.  */
> bool nvptx_truly_noop_truncation (poly_uint64, poly_uint64) { return false;
> }
> 
> Basically, HImode is considered as a different register (bank) to SImode,
> and requires
> explicit move instructions to move data from HImode to SImode, and back,
> unlike
> most targets that can simply re-interpret the contents of GPRs.
> 
> Passing an argument as HImode would be incompatible with the requirement to
> pass as SImode.

OK - I'm out of good suggestions but it somehow feels the ugly details
should hide inside the backend somehow.  Just not sure how.

Richard.

> Cheers,
> Roger
> --
> 
> > -----Original Message-----
> > From: Richard Biener <rguenther@suse.de>
> > Sent: 14 March 2022 13:28
> > To: Roger Sayle <roger@nextmovesoftware.com>
> > Cc: 'Tom de Vries' <tdevries@suse.de>; 'Jeff Law' <jeffreyalaw@gmail.com>;
> > 'Tobias Burnus' <tobias@codesourcery.com>; richard.sandiford@arm.com;
> 'Jeff
> > Law' <law@redhat.com>; gcc-patches@gcc.gnu.org
> > Subject: RE: PING**4 - [PATCH] middle-end: Support ABIs that pass FP
> values as
> > wider integers.
> > 
> > On Mon, 14 Mar 2022, Roger Sayle wrote:
> > 
> > >
> > > Hi Richard,
> > >
> > > > The above seems backwards - it should be (subreg:HF (reg:SI 0)), no?
> > >
> > > Line 949 of emit-rtl.cc in the definition of "validate_subreg" contains:
> > >
> > >   /* Subregs involving floating point modes are not allowed to
> > >      change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
> > >      (subreg:SI (reg:DF) 0) isn't.  */
> > >   else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
> > >     {
> > >       if (! (known_eq (isize, osize)
> > >
> > > Hence, we ICE on line 1022 of emit-rtl.cc in gen_rtx_SUBREG, that
> > > asserts validate_subreg, when called with omode=HFmode and
> > imode=SImode.
> > 
> > Yes, which is why I think the target should claim argument passing happens
> in
> > reg:HI.  But as said, I'm not very familiar with argument passing
> internals.  But
> > still the patch looks quite bolted on.
> > 
> > For the expand_value_return I have to guess that 'val' has HFmode, the
> > DECL_RESULT also has HFmode(?), promote_function_mode then returns
> > SImode which isn't supported.  So make it return HImode?
> > 
> > > Cheers,
> > > Roger
> > > --
> > >
> > > > -----Original Message-----
> > > > From: Richard Biener <rguenther@suse.de>
> > > > Sent: 14 March 2022 10:15
> > > > To: Roger Sayle <roger@nextmovesoftware.com>
> > > > Cc: 'Tom de Vries' <tdevries@suse.de>; 'Jeff Law'
> > > > <jeffreyalaw@gmail.com>; 'Tobias Burnus' <tobias@codesourcery.com>;
> > > > richard.sandiford@arm.com;
> > > 'Jeff
> > > > Law' <law@redhat.com>; gcc-patches@gcc.gnu.org
> > > > Subject: RE: PING**4 - [PATCH] middle-end: Support ABIs that pass FP
> > > values as
> > > > wider integers.
> > > >
> > > > On Mon, 14 Mar 2022, Roger Sayle wrote:
> > > >
> > > > >
> > > > > Hi Richard,
> > > > > Thanks for asking.
> > > > > The issue is with 16-bit floating point HFmode, where clang on
> > > > > nvptx passes HFmode values in SImode registers, and for binary
> > > > > compatibility GCC needs to do the same.
> > > > > Their motivation is that for compatibility with older GPUs and the
> > > > > x86_64 host, HFmode parameters are treated like "unsigned int".
> > > > > Hence, libgcc functions like __sqrthf behave as though declared
> > > > > (are binary compatible with) "unsigned short __sqrthf(unsigned
> short)".
> > > > > As you point out, this also greatly simplifies soft-float, as both
> > > > > ABIs remain the same.
> > > > >
> > > > > Alas, the subreg approach doesn't work as we'd end up with
> > > > > (subreg:SI (subreg:HI (reg:HF)), though technically that is what
> > > > > this patch does, inserting a pair of conversion instructions.
> > > >
> > > > So what does FUNCTION_ARG return for the HFmode parameters?  The
> > > > above seems backwards - it should be (subreg:HF (reg:SI 0)), no?
> > > >
> > > > Or are we talking about the caller side?  Or the return value case
> > > > (in
> > > which case I
> > > > ask the same question wrt FUNCTION_VALUE)?
> > > >
> > > > Does nvptx use the promotion hooks in those cases?
> > > >
> > > > > Until recently (subreg:SI (reg:HF)) did work, but that support was
> > > > > removed/cleaned-up,
> > > > > as we need sensible subreg semantics in the RTL passes.   The
> proposed
> > > patch
> > > > > simply
> > > > > adds support back in the minimal places where changing FP/non-FP
> > > > > and precision at the same time is required: argument passing and
> > > > > return values.
> > > > >
> > > > > Hopefully this answers your question.  An alternate viewpoint
> > > > > might be "is there a reason for GCC not to be able to support
> > > > > targets with slightly wacky parameter passing conventions".
> > > > >
> > > > > Thanks for thinking about this.
> > > > > Roger
> > > > > --
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Richard Biener <rguenther@suse.de>
> > > > > > Sent: 14 March 2022 09:09
> > > > > > To: Roger Sayle <roger@nextmovesoftware.com>
> > > > > > Cc: 'Tom de Vries' <tdevries@suse.de>; 'Jeff Law'
> > > > > > <jeffreyalaw@gmail.com>; 'Tobias Burnus'
> > > > > > <tobias@codesourcery.com>; richard.sandiford@arm.com;
> > > > > 'Jeff
> > > > > > Law' <law@redhat.com>; gcc-patches@gcc.gnu.org
> > > > > > Subject: RE: PING**4 - [PATCH] middle-end: Support ABIs that
> > > > > > pass FP
> > > > > values as
> > > > > > wider integers.
> > > > > >
> > > > > > On Mon, 14 Mar 2022, Roger Sayle wrote:
> > > > > >
> > > > > > >
> > > > > > > I thought I'd add a few comments that might help reviewers of
> > > > > > > this
> > > > > patch.
> > > > > > > Most importantly, this patch should be completely safe, as
> > > > > > > both changes only trigger with FLOAT vs INT size mismatches
> > > > > > > which currently both ICE in the compiler a few lines later.
> > > > > > > Admittedly, this indicates something odd about a target's
> > > > > > > choice of ABI, but one alternative might be to issue a "sorry,
> > > > > > > unimplemented" error message rather than ICE, perhaps with a
> > > > > > > TODO or FIXME comment that support for mixed size FP/integer
> ABIs
> > be added in future.
> > > > > > >
> > > > > > > The only (other?) possible point of contention is the
> > > > > > > (arbitrary) decision that when passing floating point values
> > > > > > > in a larger integer register, the code is hardwired to use
> zero-extension.
> > > > > > > This in theory could be turned into a target hook to support
> > > > > > > sign-extension, but given these are padding bits, zero seems
> > > > > > > appropriate. [On x86_64, if passing DFmode argument in a
> > > > > > > V2DFmode vector, say, it seems reasonable
> > > > > to
> > > > > > use movq and zero the high bits].
> > > > > > >
> > > > > > > The final point is that at the moment, the only affected
> > > > > > > target is nvptx-none, as I don't believe any other backend
> > > > > > > specifies an ABI that requires passing floating point values
> > > > > > > in wider integer
> > > registers.
> > > > > > > Having said that, most targets don't yet support HFmode, and
> > > > > > > their ABI specifications haven't yet been updated as what to
> > > > > > > do in that eventuality.  If they choose to treat these like
> > > > > > > HImode, they'll run into the same issues, as most ABIs pass
> > > > > > > HImode in wider word_mode
> > > > > registers.
> > > > > > >
> > > > > > > I hope this helps.  If folks could air their concerns out
> > > > > > > loud, I can try my best to address those worries.
> > > > > >
> > > > > > First of all I'm not familiar with the ABI related code as all,
> > > > > > so I
> > > > > refrained from
> > > > > > commenting.  But now I've looked closer (still unfamiliar code).
> > > > > >
> > > > > > I suppose there's targets passing SFmode in a SImode GPR and
> > > > > > DFmode in a DImode GPR (all soft-float targets?), so that
> > > > > > already works
> > > somehow.
> > > > > > Why does nvptx choose DImode for SFmode passing then?  Can't it
> > > > > > choose (subreg:SI di:..) or so?  Does it require zero-extending
> > > > > > to DImode on the
> > > > > caller
> > > > > > side?  It seems your expand_expr_real_1 code does not rely on
> that?
> > > > > > So,
> > > > > why
> > > > > > does nvptx function_arg hook (?) insist on returning a DImode
> > > > > > reg for an
> > > > > SFmode
> > > > > > argument rather than an SImode subreg of that?
> > > > > >
> > > > > > Richard.
> > > > > >
> > > > > > >
> > > > > > > Many thanks in advance (and thanks to Tobias and Tom for
> > > > > > > pushing for
> > > > > this).
> > > > > > > Roger
> > > > > > > --
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Tom de Vries <tdevries@suse.de>
> > > > > > > > Sent: 14 March 2022 08:06
> > > > > > > > To: Jeff Law <jeffreyalaw@gmail.com>; Richard Biener
> > > > > > > > <rguenther@suse.de>; Tobias Burnus <tobias@codesourcery.com>
> > > > > > > > Cc: richard.sandiford@arm.com; Jeff Law <law@redhat.com>;
> > > > > > > > gcc- patches@gcc.gnu.org; Roger Sayle
> > > > > > > > <roger@nextmovesoftware.com>
> > > > > > > > Subject: PING**4 - [PATCH] middle-end: Support ABIs that
> > > > > > > > pass FP values as wider integers.
> > > > > > > >
> > > > > > > > On 3/2/22 20:18, Jeff Law via Gcc-patches wrote:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On 2/28/2022 5:54 AM, Richard Biener via Gcc-patches wrote:
> > > > > > > > >> On Mon, 28 Feb 2022, Tobias Burnus wrote:
> > > > > > > > >>
> > > > > > > > >>> Ping**3
> > > > > > > > >>>
> > > > > > > > >>> On 23.02.22 09:42, Tobias Burnus wrote:
> > > > > > > > >>>> PING**2 for the ME review or at least comments to that
> > > > > > > > >>>> patch, which fixes a build issue/ICE with nvptx
> > > > > > > > >>>>
> > > > > > > > >>>> Patch:
> > > > > > > > >>>>
> > > https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.
> > > > > > > > >>>> html (for gcc/cfgexpand.cc + gcc/expr.cc)
> > > > > > > > >>>>
> > > > > > > > >>>> (There is some discussion by Tom and Roger about the BE
> > > > > > > > >>>> in the patch thread, which only not relate to the ME
> patch.
> > > > > > > > >>>> But there is no ME-patch comment so far.)
> > > > > > > > >>> The related BE patch has been already committed, but to
> > > > > > > > >>> be effective, it needs the ME patch.
> > > > > > > > >> I'm not sure I'm qualified to review this - maybe Richard
> is.
> > > > > > > > > I'd initially ignored the patch as it didn't seem a good
> > > > > > > > > fit for stage4, subsequent messages changed my mind about
> > > > > > > > > it, but I never went back to take a deeper look at Roger's
> patch.
> > > > > > > >
> > > > > > > > Ping.
> > > > > > > >
> > > > > > > > [ FWIW, I'd appreciate it if a response came before the end
> > > > > > > > of stage 4, such that I have some time left to deal with
> > > > > > > > fallout in case the patch is not approved. ]
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > - Tom
> > > > > > >
> > > > > >
> > > > > > --
> > > > > > Richard Biener <rguenther@suse.de> SUSE Software Solutions
> > > > > > Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF:
> > > > > > Ivo Totev; HRB 36809 (AG Nuernberg)
> > > > >
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguenther@suse.de>
> > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> > > > Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
> > >
> > >
> > 
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
> 
>
  
Jeff Law March 14, 2022, 2:59 p.m. UTC | #19
On 3/14/2022 3:09 AM, Richard Biener wrote:
> On Mon, 14 Mar 2022, Roger Sayle wrote:
>
>> I thought I'd add a few comments that might help reviewers of this patch.
>> Most importantly, this patch should be completely safe, as both changes
>> only trigger with FLOAT vs INT size mismatches which currently both ICE in
>> the compiler a few lines later.  Admittedly, this indicates something odd
>> about a target's choice of ABI, but one alternative might be to issue a
>> "sorry, unimplemented" error message rather than ICE, perhaps with a
>> TODO or FIXME comment that support for mixed size FP/integer ABIs be
>> added in future.
>>
>> The only (other?) possible point of contention is the (arbitrary) decision that
>> when passing floating point values in a larger integer register, the code is
>> hardwired to use zero-extension.  This in theory could be turned into a target
>> hook to support sign-extension, but given these are padding bits, zero seems
>> appropriate. [On x86_64, if passing DFmode argument in a V2DFmode vector,
>> say, it seems reasonable to use movq and zero the high bits].
>>
>> The final point is that at the moment, the only affected target is nvptx-none,
>> as I don't believe any other backend specifies an ABI that requires passing
>> floating point values in wider integer registers.  Having said that, most targets
>> don't yet support HFmode, and their ABI specifications haven't yet been
>> updated as what to do in that eventuality.  If they choose to treat these like
>> HImode, they'll run into the same issues, as most ABIs pass HImode in
>> wider word_mode registers.
>>
>> I hope this helps.  If folks could air their concerns out loud, I can try my best
>> to address those worries.
> First of all I'm not familiar with the ABI related code as all, so I
> refrained from commenting.  But now I've looked closer (still unfamiliar
> code).
>
> I suppose there's targets passing SFmode in a SImode GPR and DFmode
> in a DImode GPR (all soft-float targets?), so that already works somehow.
> Why does nvptx choose DImode for SFmode passing then?  Can't it choose
> (subreg:SI di:..) or so?  Does it require zero-extending to DImode
> on the caller side?  It seems your expand_expr_real_1 code does
> not rely on that?  So, why does nvptx function_arg hook (?) insist
> on returning a DImode reg for an SFmode argument rather than
> an SImode subreg of that?
I think so.  V850 for example might follow this pattern, but those 
extensions came in after the original port, so I'm not 100% sure.

I'd hazard a guess DI is a better match than SI simply because nvptx is 
a 64bit target.  I'd be surprised if sign vs zero extension is actually 
important.  It may be following from existing conventions on the port.

jeff
  
Jeff Law March 14, 2022, 3:08 p.m. UTC | #20
On 3/14/2022 2:39 AM, Roger Sayle wrote:
>
> The only (other?) possible point of contention is the (arbitrary) decision that
> when passing floating point values in a larger integer register, the code is
> hardwired to use zero-extension.  This in theory could be turned into a target
> hook to support sign-extension, but given these are padding bits, zero seems
> appropriate. [On x86_64, if passing DFmode argument in a V2DFmode vector,
> say, it seems reasonable to use movq and zero the high bits].
I'm not terribly concerned about the zero vs sign extension.

>
> The final point is that at the moment, the only affected target is nvptx-none,
> as I don't believe any other backend specifies an ABI that requires passing
> floating point values in wider integer registers.  Having said that, most targets
> don't yet support HFmode, and their ABI specifications haven't yet been
> updated as what to do in that eventuality.  If they choose to treat these like
> HImode, they'll run into the same issues, as most ABIs pass HImode in
> wider word_mode registers.
The plan for our target has been to pass around HFmode in the GPF 
registers, but in general HFmode is new enough that not many targets 
have them handled explicitly in their ABI.

Passing HFmode objects around in GPRs is odd, but not crazy so. We've 
certainly had targets where FP values would get passed around in the 
integer registers -- for example the 32 bit PA ABI mandated this kind of 
behavior in some circumstances.  Of course they also relied on the 
linker to look at the caller & callee and add a stub in between them to 
deal with mismatches :(

jeff
  
Jeff Law March 14, 2022, 3:30 p.m. UTC | #21
On 2/9/2022 1:12 PM, Roger Sayle wrote:
> This patch adds middle-end support for target ABIs that pass/return
> floating point values in integer registers with precision wider than
> the original FP mode.  An example, is the nvptx backend where 16-bit
> HFmode registers are passed/returned as (promoted to) SImode registers.
> Unfortunately, this currently falls foul of the various (recent?) sanity
> checks that (very sensibly) prevent creating paradoxical SUBREGs of
> floating point registers.  The approach below is to explicitly perform the
> conversion/promotion in two steps, via an integer mode of same precision
> as the floating point value.  So on nvptx, 16-bit HFmode is initially
> converted to 16-bit HImode (using SUBREG), then zero-extended to SImode,
> and likewise when going the other way, parameters truncated to HImode
> then converted to HFmode (using SUBREG).  These changes are localized
> to expand_value_return and expanding DECL_RTL to support strange ABIs,
> rather than inside convert_modes or gen_lowpart, as mismatched
> precision integer/FP conversions should be explicit in the RTL,
> and these semantics not generally visible/implicit in user code.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures, and on nvptx-none, where it is
> the middle-end portion of a pair of patches to allow the default ISA to
> be advanced.  Ok for mainline?
>
> 2022-02-09  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * cfgexpand.cc (expand_value_return): Allow backends to promote
>         a scalar floating point return value to a wider integer mode.
>         * expr.cc (expand_expr_real_1) [expand_decl_rtl]: Likewise, allow
>         backends to promote scalar FP PARM_DECLs to wider integer modes.

Buried somewhere in our calling conventions code is the ability to pass 
around BLKmode objects in registers along with the ability to tune left 
vs right padding adjustments.   Much of this support grew out of the PA 
32 bit SOM ABI.

While I think we could probably make those bits do what we want, I 
suspect the result will actually be uglier than what you've done here 
and I wouldn't be surprised if there was a performance hit as the code 
to handle those cases was pretty dumb in its implementation.

What I find rather surprising is the location of your changes -- they 
feel incomplete.  For example, you fix the callee side of returns in 
expand_value_return, but I don't analogous code for the caller side.

Similarly while you fix things for arguments in expand_expr_real_1, 
that's again just the callee side.  Don't you need to so something on 
the caller side too?

Jeff
  
Richard Sandiford March 14, 2022, 3:31 p.m. UTC | #22
"Roger Sayle" <roger@nextmovesoftware.com> writes:
> Hi Richard,
>> Yes, which is why I think the target should claim argument passing happens
> in reg:HI.
>
> Unfortunately, this hits another "feature" of the nvptx backend; it's a 
>
> /* Implement TARGET_MODES_TIEABLE_P.  */
> bool nvptx_modes_tieable_p (machine_mode, machine_mode) { return false; }
> /* Implement TARGET_CAN_CHANGE_MODE_CLASS.  */
> bool nvptx_can_change_mode_class (machine_mode, machine_mode, reg_class_t) {
> return false; }
> /* Implement TARGET_TRULY_NOOP_TRUNCATION.  */
> bool nvptx_truly_noop_truncation (poly_uint64, poly_uint64) { return false;
> }
>
> Basically, HImode is considered as a different register (bank) to SImode,
> and requires
> explicit move instructions to move data from HImode to SImode, and back,
> unlike
> most targets that can simply re-interpret the contents of GPRs.
>
> Passing an argument as HImode would be incompatible with the requirement to
> pass as SImode.

Ah, OK, this was the bit I was missing from all of this.

Like Richard says, how important is the zero-extension
vs. sign-extension thing?  Are callers required to zero-extend,
and can callees rely on that?  Or does the ABI treat the upper
bits as undefined, so that callees can't rely on their contents?

If the bits are undefined, have you tried making the ABI code return:

  (parallel:HF [(expr_list (reg:SI rN) (const_int 0))])

It's admittedly unusual to have the register be *bigger* than the
value, but supporting that feels less hacky to me than the current
approach.

Thanks,
Richard
  
Roger Sayle March 14, 2022, 5:24 p.m. UTC | #23
Hi Jeff,

> What I find rather surprising is the location of your changes -- they feel
> incomplete.  For example, you fix the callee side of returns in
> expand_value_return, but I don't analogous code for the caller side.
> 
> Similarly while you fix things for arguments in expand_expr_real_1, that's again
> just the callee side.  Don't you need to do something on the caller side too?

I've taken the pragmatic approach for this fix to PR target/104489, that this
patch only needs to modify/fix the parts of the middle-end that are broken.
With this patch, gcc can compile the following with -O2 -misa=sm_80 -ffast-math

_Float16 p;
_Float16 q;
_Float16 r;

_Float16 foo(_Float16 x, _Float16 y)
{
  return x * y;
}

_Float16 mid(_Float16 x, _Float16 y)
{
  return foo(x,y) + foo(y,x);
}

void bar()
{
  p = mid(q,r);
}

which I assume covers all of the paths that I/we need to care about.
Technically, the blocker is that without this patch, GCC's build fails
in libgcc (compiling __mulhc3) when/if HFmode is enabled by default.
I'm hoping any remaining issues, not caught by the current testsuite,
can be handled as regular Bugzilla PRs to be fixed/added to the
testsuite.


Let me if there's anything I've missed or need to worry about.
I believe most PC laptops/desktops contain Nvidia graphics cards, so it's
relatively easy for GCC developers to try things out (on real hardware)
for themselves.
 
Cheers,
Roger
--

> -----Original Message-----
> From: Jeff Law <jeffreyalaw@gmail.com>
> Sent: 14 March 2022 15:30
> To: Roger Sayle <roger@nextmovesoftware.com>; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] middle-end: Support ABIs that pass FP values as wider
> integers.
> 
> 
> 
> On 2/9/2022 1:12 PM, Roger Sayle wrote:
> > This patch adds middle-end support for target ABIs that pass/return
> > floating point values in integer registers with precision wider than
> > the original FP mode.  An example, is the nvptx backend where 16-bit
> > HFmode registers are passed/returned as (promoted to) SImode registers.
> > Unfortunately, this currently falls foul of the various (recent?)
> > sanity checks that (very sensibly) prevent creating paradoxical
> > SUBREGs of floating point registers.  The approach below is to
> > explicitly perform the conversion/promotion in two steps, via an
> > integer mode of same precision as the floating point value.  So on
> > nvptx, 16-bit HFmode is initially converted to 16-bit HImode (using
> > SUBREG), then zero-extended to SImode, and likewise when going the
> > other way, parameters truncated to HImode then converted to HFmode
> > (using SUBREG).  These changes are localized to expand_value_return
> > and expanding DECL_RTL to support strange ABIs, rather than inside
> > convert_modes or gen_lowpart, as mismatched precision integer/FP
> > conversions should be explicit in the RTL, and these semantics not generally
> visible/implicit in user code.
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check with no new failures, and on nvptx-none, where it is
> > the middle-end portion of a pair of patches to allow the default ISA
> > to be advanced.  Ok for mainline?
> >
> > 2022-02-09  Roger Sayle  <roger@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> >         * cfgexpand.cc (expand_value_return): Allow backends to promote
> >         a scalar floating point return value to a wider integer mode.
> >         * expr.cc (expand_expr_real_1) [expand_decl_rtl]: Likewise, allow
> >         backends to promote scalar FP PARM_DECLs to wider integer modes.
> 
> Buried somewhere in our calling conventions code is the ability to pass around
> BLKmode objects in registers along with the ability to tune left vs right padding
> adjustments.   Much of this support grew out of the PA
> 32 bit SOM ABI.
> 
> While I think we could probably make those bits do what we want, I suspect the
> result will actually be uglier than what you've done here and I wouldn't be
> surprised if there was a performance hit as the code to handle those cases was
> pretty dumb in its implementation.
> 
> What I find rather surprising is the location of your changes -- they feel
> incomplete.  For example, you fix the callee side of returns in
> expand_value_return, but I don't analogous code for the caller side.
> 
> Similarly while you fix things for arguments in expand_expr_real_1, that's again
> just the callee side.  Don't you need to so something on the caller side too?
> 
> Jeff
>
  

Patch

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index d51af2e..c377f16 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -3715,7 +3715,22 @@  expand_value_return (rtx val)
         mode = promote_function_mode (type, old_mode, &unsignedp, funtype, 1);
 
       if (mode != old_mode)
-	val = convert_modes (mode, old_mode, val, unsignedp);
+	{
+	  /* Some ABIs require scalar floating point modes to be returned
+	     in a wider scalar integer mode.  We need to explicitly
+	     reinterpret to an integer mode of the correct precision
+	     before extending to the desired result.  */
+	  if (SCALAR_INT_MODE_P (mode)
+	      && SCALAR_FLOAT_MODE_P (old_mode)
+	      && known_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (old_mode)))
+	    {
+	      scalar_int_mode imode = int_mode_for_mode (old_mode).require ();
+	      val = force_reg (imode, gen_lowpart (imode, val));
+	      val = convert_modes (mode, imode, val, 1);
+	    }
+	  else
+	    val = convert_modes (mode, old_mode, val, unsignedp);
+	}
 
       if (GET_CODE (return_reg) == PARALLEL)
 	emit_group_load (return_reg, val, type, int_size_in_bytes (type));
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 35e4029..e4efdcd 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -10674,6 +10674,19 @@  expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	    pmode = promote_ssa_mode (ssa_name, &unsignedp);
 	  gcc_assert (GET_MODE (decl_rtl) == pmode);
 
+	  /* Some ABIs require scalar floating point modes to be passed
+	     in a wider scalar integer mode.  We need to explicitly
+	     truncate to an integer mode of the correct precision before
+	     using a SUBREG to reinterpret as a floating point value.  */
+	  if (SCALAR_FLOAT_MODE_P (mode)
+	      && SCALAR_INT_MODE_P (pmode)
+	      && known_lt (GET_MODE_SIZE (mode), GET_MODE_SIZE (pmode)))
+	    {
+	      scalar_int_mode imode = int_mode_for_mode (mode).require ();
+	      temp = force_reg (imode, gen_lowpart (imode, decl_rtl));
+	      return gen_lowpart_SUBREG (mode, temp);
+	    }
+
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
 	  SUBREG_PROMOTED_VAR_P (temp) = 1;
 	  SUBREG_PROMOTED_SET (temp, unsignedp);