[v5,rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]
Commit Message
Hi,
This patch implements f[min/max]_optab with xs[min/max]dp on rs6000.
Tests show that the outputs of xs[min/max]dp are consistent with the C99
standard for fmin/fmax.
This patch also binds __builtin_vsx_xs[min/max]dp to fmin/fmax instead
of smin/smax, so the builtins always generate xs[min/max]dp on all
platforms.
Compared with the previous version, I added a condition check for
finite_math_only in the fmin/fmax insn.
Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.
ChangeLog
2022-06-20 Haochen Gui <guihaoc@linux.ibm.com>
gcc/
PR target/105414
* match.pd (minmax): Skip constant folding for fmin/fmax when both
arguments are sNaN or one is sNaN and another is NaN.
gcc/testsuite/
PR target/105414
* gcc.dg/pr105414.c: New.
Comments
Hi!
On Mon, Jun 20, 2022 at 11:12:50AM +0800, HAO CHEN GUI wrote:
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
You don't have this in the changelog. Please fix.
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
This, too. And match.pd isn't in the patch.
> +(define_insn "f<minmax_op><mode>3"
> + [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
> + (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
> + (match_operand:SFDF 2 "vsx_register_operand" "wa")]
> + FMINMAX))]
> + "TARGET_VSX && !flag_finite_math_only"
&& !flag_trapping_math
and/or whatever else is needed as well here.
> + "xs<minmax_op>dp %x0,%x1,%x2"
> + [(set_attr "type" "fp")]
> +)
Are things like
fmin(4.0, 2.0);
(still) optimised correctly?
> new file mode 100644
> index 00000000000..e43ac40c2d1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c
> +/* { dg-options "-O1 -mvsx" } */
Please use -O2 instead. That way, it will catch it if any of the
optimisations that are normally done (and not with just -O1) sabotage
us here.
Thanks,
Segher
Hi,
On 21/6/2022 7:08 AM, Segher Boessenkool wrote:
> && !flag_trapping_math
>
> and/or whatever else is needed as well here.
>
I have a question here. fmin/fmax are folded to MIN/MAX_EXPR when
flag_finite_math_only is set. So it seems no-trapping-math is not needed
for fmin/fmax? Also, xs[min|max]dp do trap.
/* Convert fmin/fmax to MIN_EXPR/MAX_EXPR. C99 requires these
functions to return the numeric arg if the other one is NaN.
MIN and MAX don't honor that, so only transform if -ffinite-math-only
is set. C99 doesn't require -0.0 to be handled, so we don't have to
worry about it either. */
(if (flag_finite_math_only)
(simplify
(FMIN_ALL @0 @1)
(min @0 @1))
(simplify
(FMAX_ALL @0 @1)
(max @0 @1)))
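To make the NaN point in that comment concrete, here is a minimal sketch contrasting a plain comparison-based minimum with a NaN-aware one. Both helper names (naive_min, nan_aware_min) are hypothetical and are not GCC or libm code. The ternary is only one possible lowering of MIN_EXPR, whose result is not specified when an operand is NaN.

```c
#include <math.h>   /* only for the NAN macro; no libm call is made */

/* Hypothetical helper: one way a MIN_EXPR could be lowered.
   If either operand is NaN the comparison is false, so the
   result depends on operand order.  */
double naive_min (double a, double b)
{
  return a < b ? a : b;
}

/* Hypothetical helper following the C99 fmin rule quoted above:
   if exactly one operand is NaN, return the other (numeric) one.  */
double nan_aware_min (double a, double b)
{
  if (a != a)   /* a is NaN */
    return b;
  if (b != b)   /* b is NaN */
    return a;
  return a < b ? a : b;
}
```

With these definitions, naive_min (1.0, NAN) yields NaN while naive_min (NAN, 1.0) yields 1.0, whereas nan_aware_min yields 1.0 for both orders. That asymmetry is why the fold is only done under -ffinite-math-only.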
> Are things like
> fmin(4.0, 2.0);
> (still) optimised correctly?
I have tested it. fmin(4.0, 2.0) is converted to "2.0" in the front end,
so my patch doesn't touch it.
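A check along these lines could guard that behaviour. This is only a sketch of a hypothetical extra test, not part of the posted patch. The function name test_const is made up, and it assumes the constant fold happens before the new insn can be selected, so no xsmindp should appear in the output.

```c
/* Hypothetical extra test, not part of the posted patch.  */
/* { dg-do compile } */
/* { dg-require-effective-target powerpc_vsx_ok } */
/* { dg-options "-O2 -mvsx" } */
/* { dg-final { scan-assembler-not {\mxsmindp\M} } } */

#include <math.h>

/* Both arguments are constants, so the call is expected to be
   folded to 2.0 at compile time instead of emitting xsmindp.  */
double test_const (void)
{
  return fmin (4.0, 2.0);
}
```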
Thanks a lot.
Gui Haochen
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1613,10 +1613,10 @@
XSCVSPDP vsx_xscvspdp {}
const double __builtin_vsx_xsmaxdp (double, double);
- XSMAXDP smaxdf3 {}
+ XSMAXDP fmaxdf3 {}
const double __builtin_vsx_xsmindp (double, double);
- XSMINDP smindf3 {}
+ XSMINDP fmindf3 {}
const double __builtin_vsx_xsrdpi (double);
XSRDPI vsx_xsrdpi {}
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -158,6 +158,8 @@ (define_c_enum "unspec"
UNSPEC_HASHCHK
UNSPEC_XXSPLTIDP_CONST
UNSPEC_XXSPLTIW_CONST
+ UNSPEC_FMAX
+ UNSPEC_FMIN
])
;;
@@ -5341,6 +5343,22 @@ (define_insn_and_split "*s<minmax><mode>3_fpr"
DONE;
})
+
+(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN])
+
+(define_int_attr minmax_op [(UNSPEC_FMAX "max")
+ (UNSPEC_FMIN "min")])
+
+(define_insn "f<minmax_op><mode>3"
+ [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
+ (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
+ (match_operand:SFDF 2 "vsx_register_operand" "wa")]
+ FMINMAX))]
+ "TARGET_VSX && !flag_finite_math_only"
+ "xs<minmax_op>dp %x0,%x1,%x2"
+ [(set_attr "type" "fp")]
+)
+
(define_expand "mov<mode>cc"
[(set (match_operand:GPR 0 "gpc_reg_operand")
(if_then_else:GPR (match_operand 1 "comparison_operator")
new file mode 100644
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O1 -mvsx" } */
+/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 } } */
+
+#include <math.h>
+
+double test1 (double d0, double d1)
+{
+ return fmin (d0, d1);
+}
+
+float test2 (float d0, float d1)
+{
+ return fmin (d0, d1);
+}
+
+double test3 (double d0, double d1)
+{
+ return fmax (d0, d1);
+}
+
+float test4 (float d0, float d1)
+{
+ return fmax (d0, d1);
+}
+
+double test5 (double d0, double d1)
+{
+ return __builtin_vsx_xsmindp (d0, d1);
+}
+
+double test6 (double d0, double d1)
+{
+ return __builtin_vsx_xsmaxdp (d0, d1);
+}