[2/8] middle-end: Recognize scalar widening reductions

Message ID Y1+4GFnUyuwSK1hy@arm.com
State Dropped
Series [1/8] middle-end: Recognize scalar reductions from bitfields and array_refs

Commit Message

Tamar Christina Oct. 31, 2022, 11:57 a.m. UTC
  Hi All,

This adds new optabs and an IFN for REDUC_PLUS_WIDEN, where the resulting
scalar reduction has twice the precision of the input elements.

At some point in a later patch I will also teach the vectorizer to recognize
this builtin once I figure out how the various bits of reductions work.

For now it's generated only by the match.pd pattern.

Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu
with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* internal-fn.def (REDUC_PLUS_WIDEN): New.
	* doc/md.texi: Document it.
	* match.pd: Recognize widening plus.
	* optabs.def (reduc_splus_widen_scal_optab,
	reduc_uplus_widen_scal_optab): New.
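
As a reading aid, the two optabs listed above can be modelled in plain C.
This is only a sketch of the semantics as Richard Biener reads them later in
this thread (extend each element, then sum). The function names
model_reduc_uplus_widen and model_reduc_splus_widen and the 8-to-16-bit
element sizes are assumptions for illustration, not GCC code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the unsigned variant: zero-extend each
   8-bit element to 16 bits, then accumulate in the wider type.  */
static uint16_t
model_reduc_uplus_widen (const uint8_t *v, size_t n)
{
  uint16_t sum = 0;
  for (size_t i = 0; i < n; i++)
    sum = (uint16_t) (sum + (uint16_t) v[i]);
  return sum;
}

/* Hypothetical scalar model of the signed variant: sign-extend each
   8-bit element to 16 bits, then accumulate in the wider type.  */
static int16_t
model_reduc_splus_widen (const int8_t *v, size_t n)
{
  int16_t sum = 0;
  for (size_t i = 0; i < n; i++)
    sum = (int16_t) (sum + (int16_t) v[i]);
  return sum;
}
```

For the elements 255, 255, 1, 128 the unsigned model returns 639. The same
bit patterns read as signed values (-1, -1, 1, -128) give -129, which is why
a signed and an unsigned optab are both defined.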

--- inline copy of patch -- 
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 34825549ed4e315b07d36dc3d63bae0cc0a3932d..c08691ab4c9a4bfe55ae81e5e228a414d6242d78 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5284,6 +5284,20 @@ Compute the sum of the elements of a vector. The vector is operand 1, and
 operand 0 is the scalar result, with mode equal to the mode of the elements of
 the input vector.
 
+@cindex @code{reduc_uplus_widen_scal_@var{m}} instruction pattern
+@item @samp{reduc_uplus_widen_scal_@var{m}}
+Compute the sum of the elements of a vector and zero-extend @var{m} to a mode
+that has twice the precision of @var{m}. The vector is operand 1, and
+operand 0 is the scalar result, with mode equal to twice the precision of the
+mode of the elements of the input vector.
+
+@cindex @code{reduc_splus_widen_scal_@var{m}} instruction pattern
+@item @samp{reduc_splus_widen_scal_@var{m}}
+Compute the sum of the elements of a vector and sign-extend @var{m} to a mode
+that has twice the precision of @var{m}. The vector is operand 1, and
+operand 0 is the scalar result, with mode equal to twice the precision of the
+mode of the elements of the input vector.
+
 @cindex @code{reduc_and_scal_@var{m}} instruction pattern
 @item @samp{reduc_and_scal_@var{m}}
 @cindex @code{reduc_ior_scal_@var{m}} instruction pattern
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 5e672183f4def9d0cdc29cf12fe17e8cff928f9f..f64a8421b1087b6c0f3602dc556876b0fd15c7ad 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -215,6 +215,9 @@ DEF_INTERNAL_OPTAB_FN (RSQRT, ECF_CONST, rsqrt, unary)
 
 DEF_INTERNAL_OPTAB_FN (REDUC_PLUS, ECF_CONST | ECF_NOTHROW,
 		       reduc_plus_scal, unary)
+DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_PLUS_WIDEN, ECF_CONST | ECF_NOTHROW,
+			      first, reduc_splus_widen_scal,
+			      reduc_uplus_widen_scal, unary)
 DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_MAX, ECF_CONST | ECF_NOTHROW, first,
 			      reduc_smax_scal, reduc_umax_scal, unary)
 DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_MIN, ECF_CONST | ECF_NOTHROW, first,
diff --git a/gcc/match.pd b/gcc/match.pd
index aecaa3520b36e770d11ea9a10eb18db23c0cd9f7..1d407414bee278c64c00d425d9f025c1c58d853d 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -7237,6 +7237,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
        (ifnf (BIT_FIELD_REF:ntype { src; } { size; } { pos; }))
        (ifni (BIT_FIELD_REF:ntype { src; } { size; } { pos; }))))))))
 
+/* Widening reduction conversions. */
+(simplify
+ (convert (IFN_REDUC_PLUS @0))
+ (if (element_precision (TREE_TYPE (@0)) * 2 == element_precision (type)
+      && TYPE_UNSIGNED (type) == TYPE_UNSIGNED (TREE_TYPE (@0))
+      && ANY_INTEGRAL_TYPE_P (type) && ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+  (IFN_REDUC_PLUS_WIDEN @0)))
+
 (simplify
  (BIT_FIELD_REF (BIT_FIELD_REF @0 @1 @2) @3 @4)
  (BIT_FIELD_REF @0 @3 { const_binop (PLUS_EXPR, bitsizetype, @2, @4); }))
diff --git a/gcc/optabs.def b/gcc/optabs.def
index a6db2342bed6baf13ecbd84112c8432c6972e6fe..9947aed67fb8a3b675cb0aab9aeb059f89644106 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -346,6 +346,8 @@ OPTAB_D (reduc_fmin_scal_optab, "reduc_fmin_scal_$a")
 OPTAB_D (reduc_smax_scal_optab, "reduc_smax_scal_$a")
 OPTAB_D (reduc_smin_scal_optab, "reduc_smin_scal_$a")
 OPTAB_D (reduc_plus_scal_optab, "reduc_plus_scal_$a")
+OPTAB_D (reduc_splus_widen_scal_optab, "reduc_splus_widen_scal_$a")
+OPTAB_D (reduc_uplus_widen_scal_optab, "reduc_uplus_widen_scal_$a")
 OPTAB_D (reduc_umax_scal_optab, "reduc_umax_scal_$a")
 OPTAB_D (reduc_umin_scal_optab, "reduc_umin_scal_$a")
 OPTAB_D (reduc_and_scal_optab,  "reduc_and_scal_$a")
  

Comments

Jeff Law Oct. 31, 2022, 9:42 p.m. UTC | #1
On 10/31/22 05:57, Tamar Christina wrote:
> Hi All,
>
> This adds a new optab and IFNs for REDUC_PLUS_WIDEN where the resulting
> scalar reduction has twice the precision of the input elements.
>
> At some point in a later patch I will also teach the vectorizer to recognize
> this builtin once I figure out how the various bits of reductions work.
>
> For now it's generated only by the match.pd pattern.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> 	* internal-fn.def (REDUC_PLUS_WIDEN): New.
> 	* doc/md.texi: Document it.
> 	* match.pd: Recognize widening plus.
> 	* optabs.def (reduc_splus_widen_scal_optab,
> 	reduc_uplus_widen_scal_optab): New.

OK

jeff
  
Richard Biener Nov. 7, 2022, 1:21 p.m. UTC | #2
On Mon, 31 Oct 2022, Tamar Christina wrote:

> Hi All,
> 
> This adds a new optab and IFNs for REDUC_PLUS_WIDEN where the resulting
> scalar reduction has twice the precision of the input elements.
> 
> At some point in a later patch I will also teach the vectorizer to recognize
> this builtin once I figure out how the various bits of reductions work.
> 
> For now it's generated only by the match.pd pattern.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
> 	* internal-fn.def (REDUC_PLUS_WIDEN): New.
> 	* doc/md.texi: Document it.
> 	* match.pd: Recognize widening plus.
> 	* optabs.def (reduc_splus_widen_scal_optab,
> 	reduc_uplus_widen_scal_optab): New.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 34825549ed4e315b07d36dc3d63bae0cc0a3932d..c08691ab4c9a4bfe55ae81e5e228a414d6242d78 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5284,6 +5284,20 @@ Compute the sum of the elements of a vector. The vector is operand 1, and
>  operand 0 is the scalar result, with mode equal to the mode of the elements of
>  the input vector.
>  
> +@cindex @code{reduc_uplus_widen_scal_@var{m}} instruction pattern
> +@item @samp{reduc_uplus_widen_scal_@var{m}}
> +Compute the sum of the elements of a vector and zero-extend @var{m} to a mode
> +that has twice the precision of @var{m}.. The vector is operand 1, and
> +operand 0 is the scalar result, with mode equal to twice the precision of the
> +mode of the elements of the input vector.
> +
> +@cindex @code{reduc_splus_widen_scal_@var{m}} instruction pattern
> +@item @samp{reduc_splus_widen_scal_@var{m}}
> +Compute the sum of the elements of a vector and sign-extend @var{m} to a mode
> +that has twice the precision of @var{m}.. The vector is operand 1, and
> +operand 0 is the scalar result, with mode equal to twice the precision of the
> +mode of the elements of the input vector.
> +
>  @cindex @code{reduc_and_scal_@var{m}} instruction pattern
>  @item @samp{reduc_and_scal_@var{m}}
>  @cindex @code{reduc_ior_scal_@var{m}} instruction pattern
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index 5e672183f4def9d0cdc29cf12fe17e8cff928f9f..f64a8421b1087b6c0f3602dc556876b0fd15c7ad 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -215,6 +215,9 @@ DEF_INTERNAL_OPTAB_FN (RSQRT, ECF_CONST, rsqrt, unary)
>  
>  DEF_INTERNAL_OPTAB_FN (REDUC_PLUS, ECF_CONST | ECF_NOTHROW,
>  		       reduc_plus_scal, unary)
> +DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_PLUS_WIDEN, ECF_CONST | ECF_NOTHROW,
> +			      first, reduc_splus_widen_scal,
> +			      reduc_uplus_widen_scal, unary)
>  DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_MAX, ECF_CONST | ECF_NOTHROW, first,
>  			      reduc_smax_scal, reduc_umax_scal, unary)
>  DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_MIN, ECF_CONST | ECF_NOTHROW, first,
> diff --git a/gcc/match.pd b/gcc/match.pd
> index aecaa3520b36e770d11ea9a10eb18db23c0cd9f7..1d407414bee278c64c00d425d9f025c1c58d853d 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -7237,6 +7237,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>         (ifnf (BIT_FIELD_REF:ntype { src; } { size; } { pos; }))
>         (ifni (BIT_FIELD_REF:ntype { src; } { size; } { pos; }))))))))
>  
> +/* Widening reduction conversions. */
> +(simplify
> + (convert (IFN_REDUC_PLUS @0))
> + (if (element_precision (TREE_TYPE (@0)) * 2 == element_precision (type)
> +      && TYPE_UNSIGNED (type) == TYPE_UNSIGNED (TREE_TYPE (@0))
> +      && ANY_INTEGRAL_TYPE_P (type) && ANY_INTEGRAL_TYPE_P (TREE_TYPE(@0)))
> +  (IFN_REDUC_PLUS_WIDEN @0)))

But that's not the same?  REDUC_PLUS_WIDEN first widens, then sums while
REDUC_PLUS on overflow "truncates", no?
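
The difference raised here can be made concrete with a small scalar model in
C. This is a sketch only: sum_then_widen and widen_then_sum are hypothetical
names, and 8-bit unsigned elements widened to 16 bits are an assumed example.
The first function models (convert (IFN_REDUC_PLUS @0)), the second models a
reduction that widens each element before summing.

```c
#include <stddef.h>
#include <stdint.h>

/* Models (convert (IFN_REDUC_PLUS v)): sum in the element type, wrapping
   modulo 2^8, then zero-extend the scalar result.  */
static uint16_t
sum_then_widen (const uint8_t *v, size_t n)
{
  uint8_t sum = 0;
  for (size_t i = 0; i < n; i++)
    sum = (uint8_t) (sum + v[i]);
  return (uint16_t) sum;
}

/* Models a widening reduction: zero-extend each element, then sum in the
   wider type, so carries out of the low 8 bits are kept.  */
static uint16_t
widen_then_sum (const uint8_t *v, size_t n)
{
  uint16_t sum = 0;
  for (size_t i = 0; i < n; i++)
    sum = (uint16_t) (sum + v[i]);
  return sum;
}
```

For the two elements 200 and 100, sum_then_widen returns 44 (300 wrapped
modulo 256) while widen_then_sum returns 300, so the two forms agree only
when the narrow sum does not overflow.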

> +
>  (simplify
>   (BIT_FIELD_REF (BIT_FIELD_REF @0 @1 @2) @3 @4)
>   (BIT_FIELD_REF @0 @3 { const_binop (PLUS_EXPR, bitsizetype, @2, @4); }))
> diff --git a/gcc/optabs.def b/gcc/optabs.def
> index a6db2342bed6baf13ecbd84112c8432c6972e6fe..9947aed67fb8a3b675cb0aab9aeb059f89644106 100644
> --- a/gcc/optabs.def
> +++ b/gcc/optabs.def
> @@ -346,6 +346,8 @@ OPTAB_D (reduc_fmin_scal_optab, "reduc_fmin_scal_$a")
>  OPTAB_D (reduc_smax_scal_optab, "reduc_smax_scal_$a")
>  OPTAB_D (reduc_smin_scal_optab, "reduc_smin_scal_$a")
>  OPTAB_D (reduc_plus_scal_optab, "reduc_plus_scal_$a")
> +OPTAB_D (reduc_splus_widen_scal_optab, "reduc_splus_widen_scal_$a")
> +OPTAB_D (reduc_uplus_widen_scal_optab, "reduc_uplus_widen_scal_$a")
>  OPTAB_D (reduc_umax_scal_optab, "reduc_umax_scal_$a")
>  OPTAB_D (reduc_umin_scal_optab, "reduc_umin_scal_$a")
>  OPTAB_D (reduc_and_scal_optab,  "reduc_and_scal_$a")
  
