testsuite: update requires for powerpc/float128-cmp2-runnable.c

Message ID 20230410020941.2440885-1-guojiufu@linux.ibm.com
State New
Headers
Series testsuite: update requires for powerpc/float128-cmp2-runnable.c |

Commit Message

Jiufu Guo April 10, 2023, 2:09 a.m. UTC
  Hi,

In this test case (float128-cmp2-runnable.c), the instruction
xscmpexpqp is used to support a few builtins e.g.
__builtin_vsx_scalar_cmp_exp_qp_eq on _Float128.
This instruction handles the whole 128bits of the vector, and
it is guarded by [ieee128-hw].
So, we may update the testcase to require ppc_float128_hw.

Tested on ppc64 both BE and LE.
Is this ok for trunk?

BR,
Jeff (Jiufu)

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/float128-cmp2-runnable.c: Update requires.

---
 gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Kewen.Lin April 10, 2023, 9:26 a.m. UTC | #1
Hi Jeff,

on 2023/4/10 10:09, Jiufu Guo via Gcc-patches wrote:
> Hi,
> 
> In this test case (float128-cmp2-runnable.c), the instruction
> xscmpexpqp is used to support a few builtins e.g.
> __builtin_vsx_scalar_cmp_exp_qp_eq on _Float128.
> This instruction handles the whole 128bits of the vector, and
> it is guarded by [ieee128-hw].

The instruction xscmpexpqp is guarded with TARGET_P9_VECTOR,

(define_insn "*xscmpexpqp"
  [(set (match_operand:CCFP 0 "cc_reg_operand" "=y")
	(compare:CCFP
	 (unspec:IEEE128 [(match_operand:IEEE128 1 "altivec_register_operand" "v")
		          (match_operand:IEEE128 2 "altivec_register_operand" "v")]
	  UNSPEC_VSX_SCMPEXPQP)
	 (match_operand:SI 3 "zero_constant" "j")))]
  "TARGET_P9_VECTOR"
  "xscmpexpqp %0,%1,%2"
  [(set_attr "type" "fpcompare")])

[ieee128-hw] is used for guarding those bifs, so the above
statement doesn't quite match the fact.

PR108758 said this case doesn't fail with gcc-10 and gcc-11,
I wonder why it changes from gcc-12?  The above define_insn
shows the underlying insns for these bifs just requires the
condition power9-vector.  Could you have a further check?
Thanks.

btw, please add a PR marker for PR108758.

BR,
Kewen

> So, we may update the testcase to require ppc_float128_hw.
> 
> Tested on ppc64 both BE and LE.
> Is this ok for trunk?
> 
> BR,
> Jeff (Jiufu)
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/powerpc/float128-cmp2-runnable.c: Update requires.
> 
> ---
>  gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c b/gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c
> index d376a3ca68e..91287c0fb7a 100644
> --- a/gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c
> @@ -1,5 +1,5 @@
>  /* { dg-do run } */
> -/* { dg-require-effective-target ppc_float128_sw } */
> +/* { dg-require-effective-target ppc_float128_hw } */
>  /* { dg-require-effective-target p9vector_hw } */
>  /* { dg-options "-O2 -mdejagnu-cpu=power9 " } */
>
  
Jiufu Guo April 11, 2023, 9:14 a.m. UTC | #2
Hi Kewen,

Thanks a lot for your very helpful comments!

On 2023-04-10 17:26, Kewen.Lin wrote:
> Hi Jeff,
> 
> on 2023/4/10 10:09, Jiufu Guo via Gcc-patches wrote:
>> Hi,
>> 
>> In this test case (float128-cmp2-runnable.c), the instruction
>> xscmpexpqp is used to support a few builtins e.g.
>> __builtin_vsx_scalar_cmp_exp_qp_eq on _Float128.
>> This instruction handles the whole 128bits of the vector, and
>> it is guarded by [ieee128-hw].
> 
> The instruction xscmpexpqp is guarded with TARGET_P9_VECTOR,
> 
> (define_insn "*xscmpexpqp"
>   [(set (match_operand:CCFP 0 "cc_reg_operand" "=y")
> 	(compare:CCFP
> 	 (unspec:IEEE128 [(match_operand:IEEE128 1 "altivec_register_operand" 
> "v")
> 		          (match_operand:IEEE128 2 "altivec_register_operand" "v")]
> 	  UNSPEC_VSX_SCMPEXPQP)
> 	 (match_operand:SI 3 "zero_constant" "j")))]
>   "TARGET_P9_VECTOR"
>   "xscmpexpqp %0,%1,%2"
>   [(set_attr "type" "fpcompare")])
> 
> [ieee128-hw] is used for guarding those bifs, so the above
> statement doesn't quite match the fact.
> 

Agree, I'm wondering if P9_VECTOR is perfect here, even if it indicates 
the ISA
which contains xscmpexpqp. Let me have more checks.

> PR108758 said this case doesn't fail with gcc-10 and gcc-11,
> I wonder why it changes from gcc-12?  The above define_insn
> shows the underlying insns for these bifs just requires the
> condition power9-vector.  Could you have a further check?
> Thanks.

Thanks for raising this concern.
The behavior to check about bif on FLOAT128_HW and emit an error message 
for
requirements on quad-precision is added in gcc12. This is why gcc12 
fails to
compile the case on -m32.

Before gcc12, altivec_resolve_overloaded_builtin will return the 
overloaded
result directly, and does not check more about the result function.

> 
> btw, please add a PR marker for PR108758.

Sure,  thanks for catching this!


BR,
Jeff (Jiufu)

> 
> BR,
> Kewen
> 
>> So, we may update the testcase to require ppc_float128_hw.
>> 
>> Tested on ppc64 both BE and LE.
>> Is this ok for trunk?
>> 
>> BR,
>> Jeff (Jiufu)
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> 	* gcc.target/powerpc/float128-cmp2-runnable.c: Update requires.
>> 
>> ---
>>  gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c 
>> b/gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c
>> index d376a3ca68e..91287c0fb7a 100644
>> --- a/gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c
>> @@ -1,5 +1,5 @@
>>  /* { dg-do run } */
>> -/* { dg-require-effective-target ppc_float128_sw } */
>> +/* { dg-require-effective-target ppc_float128_hw } */
>>  /* { dg-require-effective-target p9vector_hw } */
>>  /* { dg-options "-O2 -mdejagnu-cpu=power9 " } */
>>
  
Kewen.Lin April 11, 2023, 9:40 a.m. UTC | #3
Hi Jeff,

on 2023/4/11 17:14, guojiufu wrote:
> Hi Kewen,
> 
> Thanks a lot for your very helpful comments!
> 
> On 2023-04-10 17:26, Kewen.Lin wrote:
>> Hi Jeff,
>>
>> on 2023/4/10 10:09, Jiufu Guo via Gcc-patches wrote:
>>> Hi,
>>>
>>> In this test case (float128-cmp2-runnable.c), the instruction
>>> xscmpexpqp is used to support a few builtins e.g.
>>> __builtin_vsx_scalar_cmp_exp_qp_eq on _Float128.
>>> This instruction handles the whole 128bits of the vector, and
>>> it is guarded by [ieee128-hw].
>>
>> The instruction xscmpexpqp is guarded with TARGET_P9_VECTOR,
>>
>> (define_insn "*xscmpexpqp"
>>   [(set (match_operand:CCFP 0 "cc_reg_operand" "=y")
>>     (compare:CCFP
>>      (unspec:IEEE128 [(match_operand:IEEE128 1 "altivec_register_operand" "v")
>>                   (match_operand:IEEE128 2 "altivec_register_operand" "v")]
>>       UNSPEC_VSX_SCMPEXPQP)
>>      (match_operand:SI 3 "zero_constant" "j")))]
>>   "TARGET_P9_VECTOR"
>>   "xscmpexpqp %0,%1,%2"
>>   [(set_attr "type" "fpcompare")])
>>
>> [ieee128-hw] is used for guarding those bifs, so the above
>> statement doesn't quite match the fact.
>>
> 
> Agree, I'm wondering if P9_VECTOR is perfect here, even if it indicates the ISA
> which contains xscmpexpqp. Let me have more checks.
> 
>> PR108758 said this case doesn't fail with gcc-10 and gcc-11,
>> I wonder why it changes from gcc-12?  The above define_insn
>> shows the underlying insns for these bifs just requires the
>> condition power9-vector.  Could you have a further check?
>> Thanks.
> 
> Thanks for raising this concern.
> The behavior to check about bif on FLOAT128_HW and emit an error message for
> requirements on quad-precision is added in gcc12. This is why gcc12 fails to
> compile the case on -m32.
> 
> Before gcc12, altivec_resolve_overloaded_builtin will return the overloaded
> result directly, and does not check more about the result function.

Thanks for checking, I wonder which commit caused this behavior change and what's
the underlying justification?  I know there is one new bif handling framework
introduced in gcc12, not sure the checking condition was changed together or by
a standalone commit.  Anyway, apparently the conditions for the support of these
bifs are different on gcc-11 and gcc-12, I wonder why it changed.  As mentioned
above, PR108758's c#1 said this case (bifs) work well on gcc-11, I suspected the
condition change was an overkill, that's why I asked.

BR,
Kewen

> 
>>
>> btw, please add a PR marker for PR108758.
> 
> Sure,  thanks for catching this!
> 
> 
> BR,
> Jeff (Jiufu)
> 
>>
>> BR,
>> Kewen
>>
>>> So, we may update the testcase to require ppc_float128_hw.
>>>
>>> Tested on ppc64 both BE and LE.
>>> Is this ok for trunk?
>>>
>>> BR,
>>> Jeff (Jiufu)
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>     * gcc.target/powerpc/float128-cmp2-runnable.c: Update requires.
>>>
>>> ---
>>>  gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c b/gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c
>>> index d376a3ca68e..91287c0fb7a 100644
>>> --- a/gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c
>>> +++ b/gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c
>>> @@ -1,5 +1,5 @@
>>>  /* { dg-do run } */
>>> -/* { dg-require-effective-target ppc_float128_sw } */
>>> +/* { dg-require-effective-target ppc_float128_hw } */
>>>  /* { dg-require-effective-target p9vector_hw } */
>>>  /* { dg-options "-O2 -mdejagnu-cpu=power9 " } */
>>>
  
Segher Boessenkool April 11, 2023, 3:13 p.m. UTC | #4
On Tue, Apr 11, 2023 at 05:40:09PM +0800, Kewen.Lin wrote:
> on 2023/4/11 17:14, guojiufu wrote:
> > Thanks for raising this concern.
> > The behavior to check about bif on FLOAT128_HW and emit an error message for
> > requirements on quad-precision is added in gcc12. This is why gcc12 fails to
> > compile the case on -m32.
> > 
> > Before gcc12, altivec_resolve_overloaded_builtin will return the overloaded
> > result directly, and does not check more about the result function.
> 
> Thanks for checking, I wonder which commit caused this behavior change and what's
> the underlying justification?  I know there is one new bif handling framework
> introduced in gcc12, not sure the checking condition was changed together or by
> a standalone commit.  Anyway, apparently the conditions for the support of these
> bifs are different on gcc-11 and gcc-12, I wonder why it changed.  As mentioned
> above, PR108758's c#1 said this case (bifs) work well on gcc-11, I suspected the
> condition change was an overkill, that's why I asked.

It almost certainly was an oversight.  The new builtin framework changed
so many things, there was bound to be some breakage to go with all the
good things it brought.

So what is the actual thing going wrong?  QP insns work fine and are
valid on all systems and environments, BE or LE, 32-bit or 64-bit.  Of
course you cannot use the "long double" type for those everywhere, but
that is a very different thing.


Segher
  
Jiufu Guo April 12, 2023, 5:06 a.m. UTC | #5
Hi,

Segher Boessenkool <segher@kernel.crashing.org> writes:

> On Tue, Apr 11, 2023 at 05:40:09PM +0800, Kewen.Lin wrote:
>> on 2023/4/11 17:14, guojiufu wrote:
>> > Thanks for raising this concern.
>> > The behavior to check about bif on FLOAT128_HW and emit an error message for
>> > requirements on quad-precision is added in gcc12. This is why gcc12 fails to
>> > compile the case on -m32.
>> > 
>> > Before gcc12, altivec_resolve_overloaded_builtin will return the overloaded
>> > result directly, and does not check more about the result function.
>> 
>> Thanks for checking, I wonder which commit caused this behavior change and what's
>> the underlying justification?  I know there is one new bif handling framework
>> introduced in gcc12, not sure the checking condition was changed together or by
>> a standalone commit.  Anyway, apparently the conditions for the support of these
>> bifs are different on gcc-11 and gcc-12, I wonder why it changed.  As mentioned
>> above, PR108758's c#1 said this case (bifs) work well on gcc-11, I suspected the
>> condition change was an overkill, that's why I asked.
>
> It almost certainly was an oversight.  The new builtin framework changed
> so many things, there was bound to be some breakage to go with all the
> good things it brought.

Yes, the condition checking on gcc-12 is different from gcc-11. In
gcc-11, the condition on overloaded bif is not checked.
And, there are a few commits related to the bifs change. e.g.
r12-4977-ga28cfe49203705 introduces a new bif expand function which has
the ability to check more bif's target requirements like ieee128_hw.
And another commit changes the error message (r12-6684).

>
> So what is the actual thing going wrong?  QP insns work fine and are
> valid on all systems and environments, BE or LE, 32-bit or 64-bit.  Of
> course you cannot use the "long double" type for those everywhere, but
> that is a very different thing.

Currently, when compiling bif __builtin_vsx_scalar_cmp_exp_qp_eq,
gcc generates error message:
error: '__builtin_vsx_scalar_cmp_exp_qp_eq' requires quad-precision
floating-point arithmetic

IMHO, this error would be ok.  Because it makes sense that this bif
needs ieee128_hw.

BR,
Jeff (Jiufu)

>
>
> Segher
  
Jiufu Guo April 12, 2023, 5:31 a.m. UTC | #6
Jiufu Guo via Gcc-patches <gcc-patches@gcc.gnu.org> writes:

> Hi,
>
> Segher Boessenkool <segher@kernel.crashing.org> writes:
>
>> On Tue, Apr 11, 2023 at 05:40:09PM +0800, Kewen.Lin wrote:
>>> on 2023/4/11 17:14, guojiufu wrote:
>>> > Thanks for raising this concern.
>>> > The behavior to check about bif on FLOAT128_HW and emit an error message for
>>> > requirements on quad-precision is added in gcc12. This is why gcc12 fails to
>>> > compile the case on -m32.
>>> > 
>>> > Before gcc12, altivec_resolve_overloaded_builtin will return the overloaded
>>> > result directly, and does not check more about the result function.
>>> 
>>> Thanks for checking, I wonder which commit caused this behavior change and what's
>>> the underlying justification?  I know there is one new bif handling framework
>>> introduced in gcc12, not sure the checking condition was changed together or by
>>> a standalone commit.  Anyway, apparently the conditions for the support of these
>>> bifs are different on gcc-11 and gcc-12, I wonder why it changed.  As mentioned
>>> above, PR108758's c#1 said this case (bifs) work well on gcc-11, I suspected the
>>> condition change was an overkill, that's why I asked.
>>
>> It almost certainly was an oversight.  The new builtin framework changed
>> so many things, there was bound to be some breakage to go with all the
>> good things it brought.
>
> Yes, the condition checking on gcc-12 is different from gcc-11. In
> gcc-11, the condition on overloaded bif is not checked.
> And, there are a few commits related to the bifs change. e.g.
> r12-4977-ga28cfe49203705 introduces a new bif expand function which has
> the ability to check more bif's target requirements like ieee128_hw.
> And another commit changes the error message (r12-6684).
>
>>
>> So what is the actual thing going wrong?  QP insns work fine and are
>> valid on all systems and environments, BE or LE, 32-bit or 64-bit.
>> Of

I understand that QP insns (e.g. xscmpexpqp) is valid if the system
meets ISA3.0, no matter BE/LE, 32-bit/64-bit.
I think option -mfloat128-hardware is designed for QP insns.

While there is one issue, on BE machine, when compiling with options
"-mfloat128-hardware -m32", an error message is generated:
"error: '%<-mfloat128-hardware%>' requires '-m64'"

(I'm wondering if we need to relax this limitation.)


BR,
Jeff (Jiufu)

>> course you cannot use the "long double" type for those everywhere, but
>> that is a very different thing.
>
> Currently, when compiling bif __builtin_vsx_scalar_cmp_exp_qp_eq,
> gcc generates error message:
> error: '__builtin_vsx_scalar_cmp_exp_qp_eq' requires quad-precision
> floating-point arithmetic
>
> IMHO, this error would be ok.  Because it makes sense that this bif
> needs ieee128_hw.
>
> BR,
> Jeff (Jiufu)
>
>>
>>
>> Segher
  
Kewen.Lin April 12, 2023, 12:47 p.m. UTC | #7
Hi Segher & Jeff,

on 2023/4/11 23:13, Segher Boessenkool wrote:
> On Tue, Apr 11, 2023 at 05:40:09PM +0800, Kewen.Lin wrote:
>> on 2023/4/11 17:14, guojiufu wrote:
>>> Thanks for raising this concern.
>>> The behavior to check about bif on FLOAT128_HW and emit an error message for
>>> requirements on quad-precision is added in gcc12. This is why gcc12 fails to
>>> compile the case on -m32.
>>>
>>> Before gcc12, altivec_resolve_overloaded_builtin will return the overloaded
>>> result directly, and does not check more about the result function.
>>
>> Thanks for checking, I wonder which commit caused this behavior change and what's
>> the underlying justification?  I know there is one new bif handling framework

Answered this question by myself with some diggings, test case
float128-cmp2-runnable.c started to fail from r12-5752-gd08236359eb229 which
exactly makes new bif framework start to take effect and the reason why the
behavior changes is the condition change from **TARGET_P9_VECTOR** to
**TARGET_FLOAT128_HW**.

With r12-5751-gc9dd01314d8467 (still old bif framework):

$ grep -r scalar_cmp_exp_qp gcc/config/rs6000/rs6000-builtin.def
BU_P9V_VSX_2 (VSCEQPGT, "scalar_cmp_exp_qp_gt", CONST,  xscmpexpqp_gt_kf)
BU_P9V_VSX_2 (VSCEQPLT, "scalar_cmp_exp_qp_lt", CONST,  xscmpexpqp_lt_kf)
BU_P9V_VSX_2 (VSCEQPEQ, "scalar_cmp_exp_qp_eq", CONST,  xscmpexpqp_eq_kf)
BU_P9V_VSX_2 (VSCEQPUO, "scalar_cmp_exp_qp_unordered",  CONST,  xscmpexpqp_unordered_kf)
BU_P9V_OVERLOAD_2 (VSCEQPGT,    "scalar_cmp_exp_qp_gt")
BU_P9V_OVERLOAD_2 (VSCEQPLT,    "scalar_cmp_exp_qp_lt")
BU_P9V_OVERLOAD_2 (VSCEQPEQ,    "scalar_cmp_exp_qp_eq")
BU_P9V_OVERLOAD_2 (VSCEQPUO,    "scalar_cmp_exp_qp_unordered")

There were only 13 bifs requiring TARGET_FLOAT128_HW in old bif framework.

$ grep ^BU_FLOAT128_HW gcc/config/rs6000/rs6000-builtin.def
BU_FLOAT128_HW_VSX_1 (VSEEQP,   "scalar_extract_expq",  CONST,  xsxexpqp_kf)
BU_FLOAT128_HW_VSX_1 (VSESQP,   "scalar_extract_sigq",  CONST,  xsxsigqp_kf)
BU_FLOAT128_HW_VSX_1 (VSTDCNQP, "scalar_test_neg_qp",   CONST,  xststdcnegqp_kf)
BU_FLOAT128_HW_VSX_2 (VSIEQP,   "scalar_insert_exp_q",  CONST,  xsiexpqp_kf)
BU_FLOAT128_HW_VSX_2 (VSIEQPF,  "scalar_insert_exp_qp", CONST,  xsiexpqpf_kf)
BU_FLOAT128_HW_VSX_2 (VSTDCQP, "scalar_test_data_class_qp",     CONST,  xststdcqp_kf)
BU_FLOAT128_HW_1 (SQRTF128_ODD,  "sqrtf128_round_to_odd",  FP, sqrtkf2_odd)
BU_FLOAT128_HW_1 (TRUNCF128_ODD, "truncf128_round_to_odd", FP, trunckfdf2_odd)
BU_FLOAT128_HW_2 (ADDF128_ODD,   "addf128_round_to_odd",   FP, addkf3_odd)
BU_FLOAT128_HW_2 (SUBF128_ODD,   "subf128_round_to_odd",   FP, subkf3_odd)
BU_FLOAT128_HW_2 (MULF128_ODD,   "mulf128_round_to_odd",   FP, mulkf3_odd)
BU_FLOAT128_HW_2 (DIVF128_ODD,   "divf128_round_to_odd",   FP, divkf3_odd)
BU_FLOAT128_HW_3 (FMAF128_ODD,   "fmaf128_round_to_odd",   FP, fmakf4_odd)

Starting from r12-5752-gd08236359eb229, these scalar_cmp_exp_qp_{gt,lt,eq,unordered}
bifs were put under stanza ieee128-hw, it makes ieee128-hw to have 17 bifs,
comparing to the previous, the extra four ones were exactly these
scalar_cmp_exp_qp_{gt,lt,eq,unordered}.

>> introduced in gcc12, not sure the checking condition was changed together or by
>> a standalone commit.  Anyway, apparently the conditions for the support of these
>> bifs are different on gcc-11 and gcc-12, I wonder why it changed.  As mentioned
>> above, PR108758's c#1 said this case (bifs) work well on gcc-11, I suspected the
>> condition change was an overkill, that's why I asked.
> 
> It almost certainly was an oversight.  The new builtin framework changed
> so many things, there was bound to be some breakage to go with all the
> good things it brought.

Yeah, as the above findings, also I found that r12-3126-g2ed356a4c9af06 introduced
power9 related stanzas and r12-3167-g2f9489a1009d98 introduced ieee128-hw stanza
including these four bifs, both of them don't have any notes on why we would change
the condition for these scalar_cmp_exp_qp_{gt,lt,eq,unordered} from power9-vector to
ieee128-hw, so I think it's just an oversight (ieee128-hw is an overkill comparing
to power9-vector :)).

> 
> So what is the actual thing going wrong?  QP insns work fine and are
> valid on all systems and environments, BE or LE, 32-bit or 64-bit.  Of
> course you cannot use the "long double" type for those everywhere, but
> that is a very different thing.

The actual thing going wrong is that: the test case float128-cmp2-runnable.c
runs well on BE -m32 and -m64 with gcc-11, but meets failures on BE -m32 with
latest gcc-12 and trunk during compilation, having the error messages like:

gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c: In function 'main':
gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c:155:3: error:
  '__builtin_vsx_scalar_cmp_exp_qp_eq' requires ISA 3.0 IEEE 128-bit floating point

As scalar_cmp_exp_qp_{gt,lt,eq,unordered} requires condition TARGET_FLOAT128_HW
now (since new bif framework took effect).

(To be more exact, it started to fail from r12-5752-gd08236359eb229).

IMHO, the apparent cause seems to be the wrong effective target mismatching the
condition for those bifs, but the underlying cause is that new bif framework
unexpectedly moved these four bifs from power9-vector to ieee128-hw.

Testing a diff as below:

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 03fb194b151..67a3f5edaf2 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2797,6 +2797,19 @@
   const vsi __builtin_vsx_xxbrw_v4si (vsi);
     XXBRW_V4SI p9_xxbrw_v4si {}

+  const signed int __builtin_vsx_scalar_cmp_exp_qp_eq (_Float128, _Float128);
+    VSCEQPEQ xscmpexpqp_eq_kf {}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_gt (_Float128, _Float128);
+    VSCEQPGT xscmpexpqp_gt_kf {}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_lt (_Float128, _Float128);
+    VSCEQPLT xscmpexpqp_lt_kf {}
+
+  const signed int \
+      __builtin_vsx_scalar_cmp_exp_qp_unordered (_Float128, _Float128);
+    VSCEQPUO xscmpexpqp_unordered_kf {}
+

 ; Miscellaneous P9 functions
 [power9]
@@ -2879,19 +2892,6 @@
   fpmath _Float128 __builtin_mulf128_round_to_odd (_Float128, _Float128);
     MULF128_ODD mulkf3_odd {}

-  const signed int __builtin_vsx_scalar_cmp_exp_qp_eq (_Float128, _Float128);
-    VSCEQPEQ xscmpexpqp_eq_kf {}
-
-  const signed int __builtin_vsx_scalar_cmp_exp_qp_gt (_Float128, _Float128);
-    VSCEQPGT xscmpexpqp_gt_kf {}
-
-  const signed int __builtin_vsx_scalar_cmp_exp_qp_lt (_Float128, _Float128);
-    VSCEQPLT xscmpexpqp_lt_kf {}
-
-  const signed int \
-      __builtin_vsx_scalar_cmp_exp_qp_unordered (_Float128, _Float128);
-    VSCEQPUO xscmpexpqp_unordered_kf {}
-
   fpmath _Float128 __builtin_sqrtf128_round_to_odd (_Float128);
     SQRTF128_ODD sqrtkf2_odd {}


BR,
Kewen
  
Michael Meissner April 12, 2023, 2:46 p.m. UTC | #8
On Wed, Apr 12, 2023 at 01:31:46PM +0800, Jiufu Guo wrote:
> I understand that QP insns (e.g. xscmpexpqp) is valid if the system
> meets ISA3.0, no matter BE/LE, 32-bit/64-bit.
> I think option -mfloat128-hardware is designed for QP insns.
> 
> While there is one issue, on BE machine, when compiling with options
> "-mfloat128-hardware -m32", an error message is generated:
> "error: '%<-mfloat128-hardware%>' requires '-m64'"
> 
> (I'm wondering if we need to relax this limitation.)

In the past, the machine independent portion of the compiler demanded that for
scalar mode, there be an integer mode of the same size, since sometimes moves
are converted to using an int RTL mode.  Since we don't have TImode support in
32-bit, you would get various errors because something tried to do a TImode
move for KFmode types, and the TImode wasn't available.

If somebody wants to verify that this now works on 32-bit and/or implements
TImode on 32-bit, then we can relax the restriction.
  
Jiufu Guo April 13, 2023, 5:35 a.m. UTC | #9
Hi Mike,

On 2023-04-12 22:46, Michael Meissner wrote:
> On Wed, Apr 12, 2023 at 01:31:46PM +0800, Jiufu Guo wrote:
>> I understand that QP insns (e.g. xscmpexpqp) is valid if the system
>> meets ISA3.0, no matter BE/LE, 32-bit/64-bit.
>> I think option -mfloat128-hardware is designed for QP insns.
>> 
>> While there is one issue, on BE machine, when compiling with options
>> "-mfloat128-hardware -m32", an error message is generated:
>> "error: '%<-mfloat128-hardware%>' requires '-m64'"
>> 
>> (I'm wondering if we need to relax this limitation.)
> 
> In the past, the machine independent portion of the compiler demanded 
> that for
> scalar mode, there be an integer mode of the same size, since sometimes 
> moves
> are converted to using an int RTL mode.  Since we don't have TImode 
> support in
> 32-bit, you would get various errors because something tried to do a 
> TImode
> move for KFmode types, and the TImode wasn't available.
> 
> If somebody wants to verify that this now works on 32-bit and/or 
> implements
> TImode on 32-bit, then we can relax the restriction.

Thanks a lot for pointing out this!

BR,
Jeff (Jiufu)
  
Kewen.Lin April 13, 2023, 7:42 a.m. UTC | #10
on 2023/4/12 20:47, Kewen.Lin wrote:
> Hi Segher & Jeff,
> 
> on 2023/4/11 23:13, Segher Boessenkool wrote:
>> On Tue, Apr 11, 2023 at 05:40:09PM +0800, Kewen.Lin wrote:
>>> on 2023/4/11 17:14, guojiufu wrote:
>>>> Thanks for raising this concern.
>>>> The behavior to check about bif on FLOAT128_HW and emit an error message for
>>>> requirements on quad-precision is added in gcc12. This is why gcc12 fails to
>>>> compile the case on -m32.
>>>>
>>>> Before gcc12, altivec_resolve_overloaded_builtin will return the overloaded
>>>> result directly, and does not check more about the result function.
>>>
>>> Thanks for checking, I wonder which commit caused this behavior change and what's
>>> the underlying justification?  I know there is one new bif handling framework
> 
> Answered this question by myself with some diggings, test case
> float128-cmp2-runnable.c started to fail from r12-5752-gd08236359eb229 which
> exactly makes new bif framework start to take effect and the reason why the
> behavior changes is the condition change from **TARGET_P9_VECTOR** to
> **TARGET_FLOAT128_HW**.
> 
> With r12-5751-gc9dd01314d8467 (still old bif framework):
> 
> $ grep -r scalar_cmp_exp_qp gcc/config/rs6000/rs6000-builtin.def
> BU_P9V_VSX_2 (VSCEQPGT, "scalar_cmp_exp_qp_gt", CONST,  xscmpexpqp_gt_kf)
> BU_P9V_VSX_2 (VSCEQPLT, "scalar_cmp_exp_qp_lt", CONST,  xscmpexpqp_lt_kf)
> BU_P9V_VSX_2 (VSCEQPEQ, "scalar_cmp_exp_qp_eq", CONST,  xscmpexpqp_eq_kf)
> BU_P9V_VSX_2 (VSCEQPUO, "scalar_cmp_exp_qp_unordered",  CONST,  xscmpexpqp_unordered_kf)
> BU_P9V_OVERLOAD_2 (VSCEQPGT,    "scalar_cmp_exp_qp_gt")
> BU_P9V_OVERLOAD_2 (VSCEQPLT,    "scalar_cmp_exp_qp_lt")
> BU_P9V_OVERLOAD_2 (VSCEQPEQ,    "scalar_cmp_exp_qp_eq")
> BU_P9V_OVERLOAD_2 (VSCEQPUO,    "scalar_cmp_exp_qp_unordered")
> 
> There were only 13 bifs requiring TARGET_FLOAT128_HW in old bif framework.
> 
> $ grep ^BU_FLOAT128_HW gcc/config/rs6000/rs6000-builtin.def
> BU_FLOAT128_HW_VSX_1 (VSEEQP,   "scalar_extract_expq",  CONST,  xsxexpqp_kf)
> BU_FLOAT128_HW_VSX_1 (VSESQP,   "scalar_extract_sigq",  CONST,  xsxsigqp_kf)
> BU_FLOAT128_HW_VSX_1 (VSTDCNQP, "scalar_test_neg_qp",   CONST,  xststdcnegqp_kf)
> BU_FLOAT128_HW_VSX_2 (VSIEQP,   "scalar_insert_exp_q",  CONST,  xsiexpqp_kf)
> BU_FLOAT128_HW_VSX_2 (VSIEQPF,  "scalar_insert_exp_qp", CONST,  xsiexpqpf_kf)
> BU_FLOAT128_HW_VSX_2 (VSTDCQP, "scalar_test_data_class_qp",     CONST,  xststdcqp_kf)
> BU_FLOAT128_HW_1 (SQRTF128_ODD,  "sqrtf128_round_to_odd",  FP, sqrtkf2_odd)
> BU_FLOAT128_HW_1 (TRUNCF128_ODD, "truncf128_round_to_odd", FP, trunckfdf2_odd)
> BU_FLOAT128_HW_2 (ADDF128_ODD,   "addf128_round_to_odd",   FP, addkf3_odd)
> BU_FLOAT128_HW_2 (SUBF128_ODD,   "subf128_round_to_odd",   FP, subkf3_odd)
> BU_FLOAT128_HW_2 (MULF128_ODD,   "mulf128_round_to_odd",   FP, mulkf3_odd)
> BU_FLOAT128_HW_2 (DIVF128_ODD,   "divf128_round_to_odd",   FP, divkf3_odd)
> BU_FLOAT128_HW_3 (FMAF128_ODD,   "fmaf128_round_to_odd",   FP, fmakf4_odd)
> 
> Starting from r12-5752-gd08236359eb229, these scalar_cmp_exp_qp_{gt,lt,eq,unordered}
> bifs were put under stanza ieee128-hw, it makes ieee128-hw to have 17 bifs,
> comparing to the previous, the extra four ones were exactly these
> scalar_cmp_exp_qp_{gt,lt,eq,unordered}.
> 
>>> introduced in gcc12, not sure the checking condition was changed together or by
>>> a standalone commit.  Anyway, apparently the conditions for the support of these
>>> bifs are different on gcc-11 and gcc-12, I wonder why it changed.  As mentioned
>>> above, PR108758's c#1 said this case (bifs) work well on gcc-11, I suspected the
>>> condition change was an overkill, that's why I asked.
>>
>> It almost certainly was an oversight.  The new builtin framework changed
>> so many things, there was bound to be some breakage to go with all the
>> good things it brought.
> 
> Yeah, as the above findings, also I found that r12-3126-g2ed356a4c9af06 introduced
> power9 related stanzas and r12-3167-g2f9489a1009d98 introduced ieee128-hw stanza
> including these four bifs, both of them don't have any notes on why we would change
> the condition for these scalar_cmp_exp_qp_{gt,lt,eq,unordered} from power9-vector to
> ieee128-hw, so I think it's just an oversight (ieee128-hw is an overkill comparing
> to power9-vector :)).
> 
>>
>> So what is the actual thing going wrong?  QP insns work fine and are
>> valid on all systems and environments, BE or LE, 32-bit or 64-bit.  Of
>> course you cannot use the "long double" type for those everywhere, but
>> that is a very different thing.
> 
> The actual thing going wrong is that: the test case float128-cmp2-runnable.c
> runs well on BE -m32 and -m64 with gcc-11, but meets failures on BE -m32 with
> latest gcc-12 and trunk during compilation, having the error messages like:
> 
> gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c: In function 'main':
> gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c:155:3: error:
>   '__builtin_vsx_scalar_cmp_exp_qp_eq' requires ISA 3.0 IEEE 128-bit floating point
> 
> As scalar_cmp_exp_qp_{gt,lt,eq,unordered} requires condition TARGET_FLOAT128_HW
> now (since new bif framework took effect).
> 
> (To be more exact, it started to fail from r12-5752-gd08236359eb229).
> 
> IMHO, the apparent cause seems to be the wrong effective target mismatching the
> condition for those bifs, but the underlying cause is that new bif framework
> unexpectedly moved these four bifs from power9-vector to ieee128-hw.
> 

I'm going to push the below patch next week if no objections.

Bootstrapped and regress-tested on:
  - powerpc64le-linux-gnu Power10
  - powerpc64le-linux-gnu Power9
  - powerpc64le-linux-gnu Power8
  - powerpc64-linux-gnu Power9 {-m64,-m32}
  - powerpc64-linux-gnu Power8 {-m64,-m32}

BR,
Kewen
----------
[PATCH] rs6000: Guard power9-vector for vsx_scalar_cmp_exp_qp_* [PR108758]

__builtin_vsx_scalar_cmp_exp_qp_{eq,gt,lt,unordered} used
to be guarded with condition TARGET_P9_VECTOR before new
bif framework was introduced (r12-5752-gd08236359eb229),
since r12-5752 they are placed under stanza ieee128-hw,
that is to check condition TARGET_FLOAT128_HW, it caused
test case float128-cmp2-runnable.c to fail at -m32 as the
condition TARGET_FLOAT128_HW isn't satisfied with -m32.

By checking the commit history, I didn't see any notes on
why this condition changes on them was made, so this patch
is to move these bifs from stanza ieee128-hw to stanza
power9-vector as before.  It also matches the condition of
the corresponding define_insns.

	PR target/108758

gcc/ChangeLog:

	* config/rs6000/rs6000-builtins.def
	(__builtin_vsx_scalar_cmp_exp_qp_eq, __builtin_vsx_scalar_cmp_exp_qp_gt
	__builtin_vsx_scalar_cmp_exp_qp_lt,
	__builtin_vsx_scalar_cmp_exp_qp_unordered): Move from stanza ieee128-hw
	to power9-vector.
---
 gcc/config/rs6000/rs6000-builtins.def | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 03fb194b151..67a3f5edaf2 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2797,6 +2797,19 @@
   const vsi __builtin_vsx_xxbrw_v4si (vsi);
     XXBRW_V4SI p9_xxbrw_v4si {}

+  const signed int __builtin_vsx_scalar_cmp_exp_qp_eq (_Float128, _Float128);
+    VSCEQPEQ xscmpexpqp_eq_kf {}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_gt (_Float128, _Float128);
+    VSCEQPGT xscmpexpqp_gt_kf {}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_lt (_Float128, _Float128);
+    VSCEQPLT xscmpexpqp_lt_kf {}
+
+  const signed int \
+      __builtin_vsx_scalar_cmp_exp_qp_unordered (_Float128, _Float128);
+    VSCEQPUO xscmpexpqp_unordered_kf {}
+

 ; Miscellaneous P9 functions
 [power9]
@@ -2879,19 +2892,6 @@
   fpmath _Float128 __builtin_mulf128_round_to_odd (_Float128, _Float128);
     MULF128_ODD mulkf3_odd {}

-  const signed int __builtin_vsx_scalar_cmp_exp_qp_eq (_Float128, _Float128);
-    VSCEQPEQ xscmpexpqp_eq_kf {}
-
-  const signed int __builtin_vsx_scalar_cmp_exp_qp_gt (_Float128, _Float128);
-    VSCEQPGT xscmpexpqp_gt_kf {}
-
-  const signed int __builtin_vsx_scalar_cmp_exp_qp_lt (_Float128, _Float128);
-    VSCEQPLT xscmpexpqp_lt_kf {}
-
-  const signed int \
-      __builtin_vsx_scalar_cmp_exp_qp_unordered (_Float128, _Float128);
-    VSCEQPUO xscmpexpqp_unordered_kf {}
-
   fpmath _Float128 __builtin_sqrtf128_round_to_odd (_Float128);
     SQRTF128_ODD sqrtkf2_odd {}

--
2.39.1
  
Jiufu Guo April 13, 2023, 7:45 a.m. UTC | #11
Hi,

On 2023-04-12 20:47, Kewen.Lin wrote:
> Hi Segher & Jeff,
> 
> on 2023/4/11 23:13, Segher Boessenkool wrote:
>> On Tue, Apr 11, 2023 at 05:40:09PM +0800, Kewen.Lin wrote:
>>> on 2023/4/11 17:14, guojiufu wrote:
>>>> Thanks for raising this concern.
>>>> The behavior to check about bif on FLOAT128_HW and emit an error 
>>>> message for
>>>> requirements on quad-precision is added in gcc12. This is why gcc12 
>>>> fails to
>>>> compile the case on -m32.
>>>> 
>>>> Before gcc12, altivec_resolve_overloaded_builtin will return the 
>>>> overloaded
>>>> result directly, and does not check more about the result function.
>>> 
>>> Thanks for checking, I wonder which commit caused this behavior 
>>> change and what's
>>> the underlying justification?  I know there is one new bif handling 
>>> framework
> 
> Answered this question by myself with some diggings, test case
> float128-cmp2-runnable.c started to fail from r12-5752-gd08236359eb229 
> which
> exactly makes new bif framework start to take effect and the reason why 
> the
> behavior changes is the condition change from **TARGET_P9_VECTOR** to
> **TARGET_FLOAT128_HW**.
> 
> With r12-5751-gc9dd01314d8467 (still old bif framework):
> 
> $ grep -r scalar_cmp_exp_qp gcc/config/rs6000/rs6000-builtin.def
> BU_P9V_VSX_2 (VSCEQPGT, "scalar_cmp_exp_qp_gt", CONST,  
> xscmpexpqp_gt_kf)
> BU_P9V_VSX_2 (VSCEQPLT, "scalar_cmp_exp_qp_lt", CONST,  
> xscmpexpqp_lt_kf)
> BU_P9V_VSX_2 (VSCEQPEQ, "scalar_cmp_exp_qp_eq", CONST,  
> xscmpexpqp_eq_kf)
> BU_P9V_VSX_2 (VSCEQPUO, "scalar_cmp_exp_qp_unordered",  CONST,
> xscmpexpqp_unordered_kf)
> BU_P9V_OVERLOAD_2 (VSCEQPGT,    "scalar_cmp_exp_qp_gt")
> BU_P9V_OVERLOAD_2 (VSCEQPLT,    "scalar_cmp_exp_qp_lt")
> BU_P9V_OVERLOAD_2 (VSCEQPEQ,    "scalar_cmp_exp_qp_eq")
> BU_P9V_OVERLOAD_2 (VSCEQPUO,    "scalar_cmp_exp_qp_unordered")
> 
> There were only 13 bifs requiring TARGET_FLOAT128_HW in old bif 
> framework.
> 
> $ grep ^BU_FLOAT128_HW gcc/config/rs6000/rs6000-builtin.def
> BU_FLOAT128_HW_VSX_1 (VSEEQP,   "scalar_extract_expq",  CONST,  
> xsxexpqp_kf)
> BU_FLOAT128_HW_VSX_1 (VSESQP,   "scalar_extract_sigq",  CONST,  
> xsxsigqp_kf)
> BU_FLOAT128_HW_VSX_1 (VSTDCNQP, "scalar_test_neg_qp",   CONST,  
> xststdcnegqp_kf)
> BU_FLOAT128_HW_VSX_2 (VSIEQP,   "scalar_insert_exp_q",  CONST,  
> xsiexpqp_kf)
> BU_FLOAT128_HW_VSX_2 (VSIEQPF,  "scalar_insert_exp_qp", CONST,  
> xsiexpqpf_kf)
> BU_FLOAT128_HW_VSX_2 (VSTDCQP, "scalar_test_data_class_qp",     CONST,
>  xststdcqp_kf)
> BU_FLOAT128_HW_1 (SQRTF128_ODD,  "sqrtf128_round_to_odd",  FP, 
> sqrtkf2_odd)
> BU_FLOAT128_HW_1 (TRUNCF128_ODD, "truncf128_round_to_odd", FP, 
> trunckfdf2_odd)
> BU_FLOAT128_HW_2 (ADDF128_ODD,   "addf128_round_to_odd",   FP, 
> addkf3_odd)
> BU_FLOAT128_HW_2 (SUBF128_ODD,   "subf128_round_to_odd",   FP, 
> subkf3_odd)
> BU_FLOAT128_HW_2 (MULF128_ODD,   "mulf128_round_to_odd",   FP, 
> mulkf3_odd)
> BU_FLOAT128_HW_2 (DIVF128_ODD,   "divf128_round_to_odd",   FP, 
> divkf3_odd)
> BU_FLOAT128_HW_3 (FMAF128_ODD,   "fmaf128_round_to_odd",   FP, 
> fmakf4_odd)
> 
> Starting from r12-5752-gd08236359eb229, these
> scalar_cmp_exp_qp_{gt,lt,eq,unordered}
> bifs were put under stanza ieee128-hw, it makes ieee128-hw to have 17 
> bifs,
> comparing to the previous, the extra four ones were exactly these
> scalar_cmp_exp_qp_{gt,lt,eq,unordered}.
> 
>>> introduced in gcc12, not sure the checking condition was changed 
>>> together or by
>>> a standalone commit.  Anyway, apparently the conditions for the 
>>> support of these
>>> bifs are different on gcc-11 and gcc-12, I wonder why it changed.  As 
>>> mentioned
>>> above, PR108758's c#1 said this case (bifs) work well on gcc-11, I 
>>> suspected the
>>> condition change was an overkill, that's why I asked.
>> 
>> It almost certainly was an oversight.  The new builtin framework 
>> changed
>> so many things, there was bound to be some breakage to go with all the
>> good things it brought.
> 
> Yeah, as the above findings, also I found that
> r12-3126-g2ed356a4c9af06 introduced
> power9 related stanzas and r12-3167-g2f9489a1009d98 introduced 
> ieee128-hw stanza
> including these four bifs, both of them don't have any notes on why we
> would change
> the condition for these scalar_cmp_exp_qp_{gt,lt,eq,unordered} from
> power9-vector to
> ieee128-hw, so I think it's just an oversight (ieee128-hw is an
> overkill comparing
> to power9-vector :)).
> 
>> 
>> So what is the actual thing going wrong?  QP insns work fine and are
>> valid on all systems and environments, BE or LE, 32-bit or 64-bit.  Of
>> course you cannot use the "long double" type for those everywhere, but
>> that is a very different thing.
> 
> The actual thing going wrong is that: the test case 
> float128-cmp2-runnable.c
> runs well on BE -m32 and -m64 with gcc-11, but meets failures on BE 
> -m32 with
> latest gcc-12 and trunk during compilation, having the error messages 
> like:
> 
> gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c: In function 
> 'main':
> gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c:155:3: error:
>   '__builtin_vsx_scalar_cmp_exp_qp_eq' requires ISA 3.0 IEEE 128-bit
> floating point
> 
> As scalar_cmp_exp_qp_{gt,lt,eq,unordered} requires condition 
> TARGET_FLOAT128_HW
> now (since new bif framework took effect).
> 
> (To be more exact, it started to fail from r12-5752-gd08236359eb229).
> 
> IMHO, the apparent cause seems to be the wrong effective target 
> mismatching the
> condition for those bifs, but the underlying cause is that new bif 
> framework
> unexpectedly moved these four bifs from power9-vector to ieee128-hw.
> 
> Testing a diff as below:
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def
> b/gcc/config/rs6000/rs6000-builtins.def
> index 03fb194b151..67a3f5edaf2 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -2797,6 +2797,19 @@
>    const vsi __builtin_vsx_xxbrw_v4si (vsi);
>      XXBRW_V4SI p9_xxbrw_v4si {}
> 
> +  const signed int __builtin_vsx_scalar_cmp_exp_qp_eq (_Float128, 
> _Float128);
> +    VSCEQPEQ xscmpexpqp_eq_kf {}
> +
> +  const signed int __builtin_vsx_scalar_cmp_exp_qp_gt (_Float128, 
> _Float128);
> +    VSCEQPGT xscmpexpqp_gt_kf {}
> +
> +  const signed int __builtin_vsx_scalar_cmp_exp_qp_lt (_Float128, 
> _Float128);
> +    VSCEQPLT xscmpexpqp_lt_kf {}
> +
> +  const signed int \
> +      __builtin_vsx_scalar_cmp_exp_qp_unordered (_Float128, 
> _Float128);
> +    VSCEQPUO xscmpexpqp_unordered_kf {}
> +
> 
>  ; Miscellaneous P9 functions
>  [power9]
> @@ -2879,19 +2892,6 @@
>    fpmath _Float128 __builtin_mulf128_round_to_odd (_Float128, 
> _Float128);
>      MULF128_ODD mulkf3_odd {}
> 
> -  const signed int __builtin_vsx_scalar_cmp_exp_qp_eq (_Float128, 
> _Float128);
> -    VSCEQPEQ xscmpexpqp_eq_kf {}
> -
> -  const signed int __builtin_vsx_scalar_cmp_exp_qp_gt (_Float128, 
> _Float128);
> -    VSCEQPGT xscmpexpqp_gt_kf {}
> -
> -  const signed int __builtin_vsx_scalar_cmp_exp_qp_lt (_Float128, 
> _Float128);
> -    VSCEQPLT xscmpexpqp_lt_kf {}
> -
> -  const signed int \
> -      __builtin_vsx_scalar_cmp_exp_qp_unordered (_Float128, 
> _Float128);
> -    VSCEQPUO xscmpexpqp_unordered_kf {}
> -
>    fpmath _Float128 __builtin_sqrtf128_round_to_odd (_Float128);
>      SQRTF128_ODD sqrtkf2_odd {}
> 

Thanks a lot for your great findings and comments!

I understand your points:
1.  This is `regression`:  float128-cmp2-runnable.c pass and runs well 
on gcc-11.
     But fail to compile with gcc12 and trunk.
2.  Trunk(or gcc-12) fail to compile this case because the bifs
     (e.g. __builtin_vsx_scalar_cmp_exp_qp_eq) are put under 
[ieee128-hw].
     (This may be an unexpected/unintended change.)
and 3. xscmpexpqp (QP insn) is valid on power9 system (no matter BE/LE, 
-m32/-64):
     define_insn for xscmpexpqp's condition is "TARGET_P9_VECTOR".

So, We may fix the regression first.
     "updating those bifs (scalar_cmp_exp_qp) to make them can be 
compiled on P9/BE/-m32"
     would be a fix for this regression.


> 
> BR,
> Kewen
  
Kewen.Lin April 13, 2023, 8:09 a.m. UTC | #12
Hi Jeff,

on 2023/4/13 15:45, guojiufu wrote:
> Hi,
> 
> On 2023-04-12 20:47, Kewen.Lin wrote:
>> Hi Segher & Jeff,
>>
>> on 2023/4/11 23:13, Segher Boessenkool wrote:
>>> On Tue, Apr 11, 2023 at 05:40:09PM +0800, Kewen.Lin wrote:
>>>> on 2023/4/11 17:14, guojiufu wrote:
>>>>> Thanks for raising this concern.
>>>>> The behavior to check about bif on FLOAT128_HW and emit an error message for
>>>>> requirements on quad-precision is added in gcc12. This is why gcc12 fails to
>>>>> compile the case on -m32.
>>>>>
>>>>> Before gcc12, altivec_resolve_overloaded_builtin will return the overloaded
>>>>> result directly, and does not check more about the result function.
>>>>
>>>> Thanks for checking, I wonder which commit caused this behavior change and what's
>>>> the underlying justification?  I know there is one new bif handling framework
>>
>> Answered this question by myself with some diggings, test case
>> float128-cmp2-runnable.c started to fail from r12-5752-gd08236359eb229 which
>> exactly makes new bif framework start to take effect and the reason why the
>> behavior changes is the condition change from **TARGET_P9_VECTOR** to
>> **TARGET_FLOAT128_HW**.
>>
>> With r12-5751-gc9dd01314d8467 (still old bif framework):
>>
>> $ grep -r scalar_cmp_exp_qp gcc/config/rs6000/rs6000-builtin.def
>> BU_P9V_VSX_2 (VSCEQPGT, "scalar_cmp_exp_qp_gt", CONST,  xscmpexpqp_gt_kf)
>> BU_P9V_VSX_2 (VSCEQPLT, "scalar_cmp_exp_qp_lt", CONST,  xscmpexpqp_lt_kf)
>> BU_P9V_VSX_2 (VSCEQPEQ, "scalar_cmp_exp_qp_eq", CONST,  xscmpexpqp_eq_kf)
>> BU_P9V_VSX_2 (VSCEQPUO, "scalar_cmp_exp_qp_unordered",  CONST,
>> xscmpexpqp_unordered_kf)
>> BU_P9V_OVERLOAD_2 (VSCEQPGT,    "scalar_cmp_exp_qp_gt")
>> BU_P9V_OVERLOAD_2 (VSCEQPLT,    "scalar_cmp_exp_qp_lt")
>> BU_P9V_OVERLOAD_2 (VSCEQPEQ,    "scalar_cmp_exp_qp_eq")
>> BU_P9V_OVERLOAD_2 (VSCEQPUO,    "scalar_cmp_exp_qp_unordered")
>>
>> There were only 13 bifs requiring TARGET_FLOAT128_HW in old bif framework.
>>
>> $ grep ^BU_FLOAT128_HW gcc/config/rs6000/rs6000-builtin.def
>> BU_FLOAT128_HW_VSX_1 (VSEEQP,   "scalar_extract_expq",  CONST,  xsxexpqp_kf)
>> BU_FLOAT128_HW_VSX_1 (VSESQP,   "scalar_extract_sigq",  CONST,  xsxsigqp_kf)
>> BU_FLOAT128_HW_VSX_1 (VSTDCNQP, "scalar_test_neg_qp",   CONST,  xststdcnegqp_kf)
>> BU_FLOAT128_HW_VSX_2 (VSIEQP,   "scalar_insert_exp_q",  CONST,  xsiexpqp_kf)
>> BU_FLOAT128_HW_VSX_2 (VSIEQPF,  "scalar_insert_exp_qp", CONST,  xsiexpqpf_kf)
>> BU_FLOAT128_HW_VSX_2 (VSTDCQP, "scalar_test_data_class_qp",     CONST,
>>  xststdcqp_kf)
>> BU_FLOAT128_HW_1 (SQRTF128_ODD,  "sqrtf128_round_to_odd",  FP, sqrtkf2_odd)
>> BU_FLOAT128_HW_1 (TRUNCF128_ODD, "truncf128_round_to_odd", FP, trunckfdf2_odd)
>> BU_FLOAT128_HW_2 (ADDF128_ODD,   "addf128_round_to_odd",   FP, addkf3_odd)
>> BU_FLOAT128_HW_2 (SUBF128_ODD,   "subf128_round_to_odd",   FP, subkf3_odd)
>> BU_FLOAT128_HW_2 (MULF128_ODD,   "mulf128_round_to_odd",   FP, mulkf3_odd)
>> BU_FLOAT128_HW_2 (DIVF128_ODD,   "divf128_round_to_odd",   FP, divkf3_odd)
>> BU_FLOAT128_HW_3 (FMAF128_ODD,   "fmaf128_round_to_odd",   FP, fmakf4_odd)
>>
>> Starting from r12-5752-gd08236359eb229, these
>> scalar_cmp_exp_qp_{gt,lt,eq,unordered}
>> bifs were put under stanza ieee128-hw, it makes ieee128-hw to have 17 bifs,
>> comparing to the previous, the extra four ones were exactly these
>> scalar_cmp_exp_qp_{gt,lt,eq,unordered}.
>>
>>>> introduced in gcc12, not sure the checking condition was changed together or by
>>>> a standalone commit.  Anyway, apparently the conditions for the support of these
>>>> bifs are different on gcc-11 and gcc-12, I wonder why it changed.  As mentioned
>>>> above, PR108758's c#1 said this case (bifs) work well on gcc-11, I suspected the
>>>> condition change was an overkill, that's why I asked.
>>>
>>> It almost certainly was an oversight.  The new builtin framework changed
>>> so many things, there was bound to be some breakage to go with all the
>>> good things it brought.
>>
>> Yeah, as the above findings, also I found that
>> r12-3126-g2ed356a4c9af06 introduced
>> power9 related stanzas and r12-3167-g2f9489a1009d98 introduced ieee128-hw stanza
>> including these four bifs, both of them don't have any notes on why we
>> would change
>> the condition for these scalar_cmp_exp_qp_{gt,lt,eq,unordered} from
>> power9-vector to
>> ieee128-hw, so I think it's just an oversight (ieee128-hw is an
>> overkill comparing
>> to power9-vector :)).
>>
>>>
>>> So what is the actual thing going wrong?  QP insns work fine and are
>>> valid on all systems and environments, BE or LE, 32-bit or 64-bit.  Of
>>> course you cannot use the "long double" type for those everywhere, but
>>> that is a very different thing.
>>
>> The actual thing going wrong is that: the test case float128-cmp2-runnable.c
>> runs well on BE -m32 and -m64 with gcc-11, but meets failures on BE -m32 with
>> latest gcc-12 and trunk during compilation, having the error messages like:
>>
>> gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c: In function 'main':
>> gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c:155:3: error:
>>   '__builtin_vsx_scalar_cmp_exp_qp_eq' requires ISA 3.0 IEEE 128-bit
>> floating point
>>
>> As scalar_cmp_exp_qp_{gt,lt,eq,unordered} requires condition TARGET_FLOAT128_HW
>> now (since new bif framework took effect).
>>
>> (To be more exact, it started to fail from r12-5752-gd08236359eb229).
>>
>> IMHO, the apparent cause seems to be the wrong effective target mismatching the
>> condition for those bifs, but the underlying cause is that new bif framework
>> unexpectedly moved these four bifs from power9-vector to ieee128-hw.
>>
>> Testing a diff as below:
>>
>> diff --git a/gcc/config/rs6000/rs6000-builtins.def
>> b/gcc/config/rs6000/rs6000-builtins.def
>> index 03fb194b151..67a3f5edaf2 100644
>> --- a/gcc/config/rs6000/rs6000-builtins.def
>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>> @@ -2797,6 +2797,19 @@
>>    const vsi __builtin_vsx_xxbrw_v4si (vsi);
>>      XXBRW_V4SI p9_xxbrw_v4si {}
>>
>> +  const signed int __builtin_vsx_scalar_cmp_exp_qp_eq (_Float128, _Float128);
>> +    VSCEQPEQ xscmpexpqp_eq_kf {}
>> +
>> +  const signed int __builtin_vsx_scalar_cmp_exp_qp_gt (_Float128, _Float128);
>> +    VSCEQPGT xscmpexpqp_gt_kf {}
>> +
>> +  const signed int __builtin_vsx_scalar_cmp_exp_qp_lt (_Float128, _Float128);
>> +    VSCEQPLT xscmpexpqp_lt_kf {}
>> +
>> +  const signed int \
>> +      __builtin_vsx_scalar_cmp_exp_qp_unordered (_Float128, _Float128);
>> +    VSCEQPUO xscmpexpqp_unordered_kf {}
>> +
>>
>>  ; Miscellaneous P9 functions
>>  [power9]
>> @@ -2879,19 +2892,6 @@
>>    fpmath _Float128 __builtin_mulf128_round_to_odd (_Float128, _Float128);
>>      MULF128_ODD mulkf3_odd {}
>>
>> -  const signed int __builtin_vsx_scalar_cmp_exp_qp_eq (_Float128, _Float128);
>> -    VSCEQPEQ xscmpexpqp_eq_kf {}
>> -
>> -  const signed int __builtin_vsx_scalar_cmp_exp_qp_gt (_Float128, _Float128);
>> -    VSCEQPGT xscmpexpqp_gt_kf {}
>> -
>> -  const signed int __builtin_vsx_scalar_cmp_exp_qp_lt (_Float128, _Float128);
>> -    VSCEQPLT xscmpexpqp_lt_kf {}
>> -
>> -  const signed int \
>> -      __builtin_vsx_scalar_cmp_exp_qp_unordered (_Float128, _Float128);
>> -    VSCEQPUO xscmpexpqp_unordered_kf {}
>> -
>>    fpmath _Float128 __builtin_sqrtf128_round_to_odd (_Float128);
>>      SQRTF128_ODD sqrtkf2_odd {}
>>
> 
> Thanks a lot for your great findings and comments!
> 
> I understand your points:
> 1.  This is `regression`:  float128-cmp2-runnable.c pass and runs well on gcc-11.
>     But fail to compile with gcc12 and trunk.
> 2.  Trunk(or gcc-12) fail to compile this case because the bifs
>     (e.g. __builtin_vsx_scalar_cmp_exp_qp_eq) are put under [ieee128-hw].
>     (This may be an unexpected/unintended change.)
> and 3. xscmpexpqp (QP insn) is valid on power9 system (no matter BE/LE, -m32/-64):
>     define_insn for xscmpexpqp's condition is "TARGET_P9_VECTOR".
> 
> So, We may fix the regression first.
>     "updating those bifs (scalar_cmp_exp_qp) to make them can be compiled on P9/BE/-m32"
>     would be a fix for this regression.
> 

Thanks for the summary, yeah, this is a regression IMHO.  Those bifs worked well on
Power9 BE -m32 in the past, it's not sensible to disable them there.  Also considering
the corresponding define_insn condition and commit history etc, I believe it's an
oversight, and we should fix the sitting stanza instead of the test case.

BR,
Kewen
  

Patch

diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c b/gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c
index d376a3ca68e..91287c0fb7a 100644
--- a/gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/float128-cmp2-runnable.c
@@ -1,5 +1,5 @@ 
 /* { dg-do run } */
-/* { dg-require-effective-target ppc_float128_sw } */
+/* { dg-require-effective-target ppc_float128_hw } */
 /* { dg-require-effective-target p9vector_hw } */
 /* { dg-options "-O2 -mdejagnu-cpu=power9 " } */