[v4,rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]

Message ID cf9a8958-c46d-2c09-ce99-fc3136366863@linux.ibm.com
State New

Commit Message

HAO CHEN GUI June 8, 2022, 3:28 a.m. UTC
  Hi,
  This patch implements the f[min/max] optabs with xs[min/max]dp on rs6000.
Tests show that the results of xs[min/max]dp are consistent with the C99
fmin/fmax semantics.

  This patch also binds __builtin_vsx_xs[min/max]dp to f[min/max] instead
of s[min/max], so the builtins always generate xs[min/max]dp on all
platforms.

  Compared with the previous version, the main change is an indentation fix.

  Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.

ChangeLog
2022-05-31 Haochen Gui <guihaoc@linux.ibm.com>

gcc/
	PR target/103605
	* config/rs6000/rs6000.md (FMINMAX): New.
	(minmax_op): New.
	(f<minmax_op><mode>3): New pattern by UNSPEC_FMAX and UNSPEC_FMIN.
	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xsmaxdp): Set
	pattern to fmaxdf3.
	(__builtin_vsx_xsmindp): Set pattern to fmindf3.

gcc/testsuite/
	PR target/103605
	* gcc.target/powerpc/pr103605.c: New.

patch.diff
  

Comments

Kewen.Lin June 8, 2022, 7:44 a.m. UTC | #1
on 2022/6/8 11:28, HAO CHEN GUI wrote:
> Hi,
>   This patch implements the f[min/max] optabs with xs[min/max]dp on rs6000.
> Tests show that the results of xs[min/max]dp are consistent with the C99
> fmin/fmax semantics.
> 
>   This patch also binds __builtin_vsx_xs[min/max]dp to f[min/max] instead
> of s[min/max], so the builtins always generate xs[min/max]dp on all
> platforms.
> 
>   Compared with the previous version, the main change is an indentation fix.
> 
>   Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.

OK, thanks!

BR,
Kewen

> 
> ChangeLog
> 2022-05-31 Haochen Gui <guihaoc@linux.ibm.com>
> 
> gcc/
> 	PR target/103605
> 	* config/rs6000/rs6000.md (FMINMAX): New.
> 	(minmax_op): New.
> 	(f<minmax_op><mode>3): New pattern by UNSPEC_FMAX and UNSPEC_FMIN.
> 	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xsmaxdp): Set
> 	pattern to fmaxdf3.
> 	(__builtin_vsx_xsmindp): Set pattern to fmindf3.
> 
> gcc/testsuite/
> 	PR target/103605
> 	* gcc.target/powerpc/pr103605.c: New.
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index f4a9f24bcc5..8b735493b40 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1613,10 +1613,10 @@
>      XSCVSPDP vsx_xscvspdp {}
> 
>    const double __builtin_vsx_xsmaxdp (double, double);
> -    XSMAXDP smaxdf3 {}
> +    XSMAXDP fmaxdf3 {}
> 
>    const double __builtin_vsx_xsmindp (double, double);
> -    XSMINDP smindf3 {}
> +    XSMINDP fmindf3 {}
> 
>    const double __builtin_vsx_xsrdpi (double);
>      XSRDPI vsx_xsrdpi {}
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index bf85baa5370..42d3edf2eca 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -158,6 +158,8 @@ (define_c_enum "unspec"
>     UNSPEC_HASHCHK
>     UNSPEC_XXSPLTIDP_CONST
>     UNSPEC_XXSPLTIW_CONST
> +   UNSPEC_FMAX
> +   UNSPEC_FMIN
>    ])
> 
>  ;;
> @@ -5341,6 +5343,22 @@ (define_insn_and_split "*s<minmax><mode>3_fpr"
>    DONE;
>  })
> 
> +
> +(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN])
> +
> +(define_int_attr  minmax_op [(UNSPEC_FMAX "max")
> +			     (UNSPEC_FMIN "min")])
> +
> +(define_insn "f<minmax_op><mode>3"
> +  [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
> +	(unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
> +		      (match_operand:SFDF 2 "vsx_register_operand" "wa")]
> +		     FMINMAX))]
> +  "TARGET_VSX"
> +  "xs<minmax_op>dp %x0,%x1,%x2"
> +  [(set_attr "type" "fp")]
> +)
> +
>  (define_expand "mov<mode>cc"
>     [(set (match_operand:GPR 0 "gpc_reg_operand")
>  	 (if_then_else:GPR (match_operand 1 "comparison_operator")
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c
> new file mode 100644
> index 00000000000..e43ac40c2d1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_vsx_ok } */
> +/* { dg-options "-O1 -mvsx" } */
> +/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 } } */
> +/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 } } */
> +
> +#include <math.h>
> +
> +double test1 (double d0, double d1)
> +{
> +  return fmin (d0, d1);
> +}
> +
> +float test2 (float d0, float d1)
> +{
> +  return fmin (d0, d1);
> +}
> +
> +double test3 (double d0, double d1)
> +{
> +  return fmax (d0, d1);
> +}
> +
> +float test4 (float d0, float d1)
> +{
> +  return fmax (d0, d1);
> +}
> +
> +double test5 (double d0, double d1)
> +{
> +  return __builtin_vsx_xsmindp (d0, d1);
> +}
> +
> +double test6 (double d0, double d1)
> +{
> +  return __builtin_vsx_xsmaxdp (d0, d1);
> +}
  
Segher Boessenkool June 8, 2022, 1:24 p.m. UTC | #2
On Wed, Jun 08, 2022 at 11:28:11AM +0800, HAO CHEN GUI wrote:
>   This patch implements the f[min/max] optabs with xs[min/max]dp on rs6000.
> Tests show that the results of xs[min/max]dp are consistent with the C99
> fmin/fmax semantics.

But it regresses the code quality generated with -ffast-math (because
the new unspecs aren't optimised like standard rtl is).  This can be
follow-up work of course -- and the best direction is to make fmin/fmax
generic, even!  :-)


Segher
  
HAO CHEN GUI June 9, 2022, 1:24 a.m. UTC | #3
Hi,

On 8/6/2022 9:24 PM, Segher Boessenkool wrote:
> But it regresses the code quality generated with -ffast-math (because
> the new unspecs aren't optimised like standard rtl is).  This can be
> follow-up work of course -- and the best direction is to make fmin/fmax
> generic, even!  :-)

fmin/fmax are folded to MIN_EXPR/MAX_EXPR when fast-math is enabled, so
the behavior doesn't change under fast-math.
  
Segher Boessenkool June 9, 2022, 3:07 p.m. UTC | #4
On Thu, Jun 09, 2022 at 09:24:00AM +0800, HAO CHEN GUI wrote:
> On 8/6/2022 9:24 PM, Segher Boessenkool wrote:
> > But it regresses the code quality generated with -ffast-math (because
> > the new unspecs aren't optimised like standard rtl is).  This can be
> > follow-up work of course -- and the best direction is to make fmin/fmax
> > generic, even!  :-)
> 
> fmin/fmax are folded to MIN_EXPR/MAX_EXPR when fast-math is enabled, so
> the behavior doesn't change under fast-math.

Ah, good.  Should we then have an assert that there is no fast-math if
we ever get the rtl fmin/fmax stuff?


Segher
  
HAO CHEN GUI June 10, 2022, 12:47 a.m. UTC | #5
On 9/6/2022 11:07 PM, Segher Boessenkool wrote:
> Ah, good.  Should we then have an assert that there is no fast-math if
> we ever get the rtl fmin/fmax stuff?

Sure, I will add a condition for it. Thanks a lot.
Gui Haochen
  

Patch

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index f4a9f24bcc5..8b735493b40 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1613,10 +1613,10 @@ 
     XSCVSPDP vsx_xscvspdp {}

   const double __builtin_vsx_xsmaxdp (double, double);
-    XSMAXDP smaxdf3 {}
+    XSMAXDP fmaxdf3 {}

   const double __builtin_vsx_xsmindp (double, double);
-    XSMINDP smindf3 {}
+    XSMINDP fmindf3 {}

   const double __builtin_vsx_xsrdpi (double);
     XSRDPI vsx_xsrdpi {}
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index bf85baa5370..42d3edf2eca 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -158,6 +158,8 @@  (define_c_enum "unspec"
    UNSPEC_HASHCHK
    UNSPEC_XXSPLTIDP_CONST
    UNSPEC_XXSPLTIW_CONST
+   UNSPEC_FMAX
+   UNSPEC_FMIN
   ])

 ;;
@@ -5341,6 +5343,22 @@  (define_insn_and_split "*s<minmax><mode>3_fpr"
   DONE;
 })

+
+(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN])
+
+(define_int_attr  minmax_op [(UNSPEC_FMAX "max")
+			     (UNSPEC_FMIN "min")])
+
+(define_insn "f<minmax_op><mode>3"
+  [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
+	(unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
+		      (match_operand:SFDF 2 "vsx_register_operand" "wa")]
+		     FMINMAX))]
+  "TARGET_VSX"
+  "xs<minmax_op>dp %x0,%x1,%x2"
+  [(set_attr "type" "fp")]
+)
+
 (define_expand "mov<mode>cc"
    [(set (match_operand:GPR 0 "gpc_reg_operand")
 	 (if_then_else:GPR (match_operand 1 "comparison_operator")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c
new file mode 100644
index 00000000000..e43ac40c2d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c
@@ -0,0 +1,37 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O1 -mvsx" } */
+/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 } } */
+
+#include <math.h>
+
+double test1 (double d0, double d1)
+{
+  return fmin (d0, d1);
+}
+
+float test2 (float d0, float d1)
+{
+  return fmin (d0, d1);
+}
+
+double test3 (double d0, double d1)
+{
+  return fmax (d0, d1);
+}
+
+float test4 (float d0, float d1)
+{
+  return fmax (d0, d1);
+}
+
+double test5 (double d0, double d1)
+{
+  return __builtin_vsx_xsmindp (d0, d1);
+}
+
+double test6 (double d0, double d1)
+{
+  return __builtin_vsx_xsmaxdp (d0, d1);
+}