[DOC] Document the VEC_PERM_EXPR tree code (and minor clean-ups).

Message ID 000f01d938d8$00cdf7d0$0269e770$@nextmovesoftware.com
State New

Commit Message

Roger Sayle Feb. 4, 2023, 8:33 p.m. UTC
  This patch (primarily) documents the VEC_PERM_EXPR tree code in
generic.texi.  For ease of review, it is provided below as a pair
of diffs.  The first contains just the new text added to describe
VEC_PERM_EXPR, the second tidies up this part of the documentation
by sorting the tree codes into alphabetical order, and providing
consistent section naming/capitalization, so changing this section
from "Vectors" to "Vector Expressions" (matching the nearby
"Unary and Binary Expressions").

Tested with make pdf and make html on x86_64-pc-linux-gnu.
The reviewer(s) can decide whether to approve just the new content,
or the content+clean-up.  Ok for mainline?


2023-02-04  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* doc/generic.texi <Expression Trees>: Standardize capitalization
	of section titles from "Expression trees".
	<Language-dependent Trees>: Likewise standardize capitalization
	from "Language-dependent trees".
	<Constant Expressions>: Capitalized from "Constant expressions".
	<Vector Expressions>: Standardized section name from "Vectors".
	Document VEC_PERM_EXPR tree code.  Sort tree codes alphabetically.


Thanks in advance,
Roger
--

diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index 3f52d30..93b2e00 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -38,10 +38,10 @@ seems inelegant.
 * Types::                       Fundamental and aggregate types.
 * Declarations::                Type declarations and variables.
 * Attributes::                  Declaration and type attributes.
-* Expressions: Expression trees.            Operating on data.
+* Expressions: Expression Trees.            Operating on data.
 * Statements::                  Control flow and related trees.
 * Functions::           	Function bodies, linkage, and other aspects.
-* Language-dependent trees::    Topics and trees specific to language front ends.
+* Language-dependent Trees::    Topics and trees specific to language front ends.
 * C and C++ Trees::     	Trees specific to C and C++.
 @end menu
 
@@ -976,7 +976,7 @@ This macro returns the attributes on the type @var{type}.
 @c Expressions
 @c ---------------------------------------------------------------------
 
-@node Expression trees
+@node Expression Trees
 @section Expressions
 @cindex expression
 @findex TREE_TYPE
@@ -1021,14 +1021,14 @@ As this example indicates, the operands are zero-indexed.
 
 
 @menu
-* Constants: Constant expressions.
+* Constants: Constant Expressions.
 * Storage References::
 * Unary and Binary Expressions::
-* Vectors::
+* Vector Expressions::
 @end menu
 
-@node Constant expressions
-@subsection Constant expressions
+@node Constant Expressions
+@subsection Constant Expressions
 @tindex INTEGER_CST
 @findex tree_int_cst_lt
 @findex tree_int_cst_equal
@@ -1803,36 +1803,119 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
 @end table
 
 
-@node Vectors
-@subsection Vectors
+@node Vector Expressions
+@subsection Vector Expressions
+@tindex SAD_EXPR
+@tindex VEC_COND_EXPR
 @tindex VEC_DUPLICATE_EXPR
-@tindex VEC_SERIES_EXPR
 @tindex VEC_LSHIFT_EXPR
 @tindex VEC_RSHIFT_EXPR
+@tindex VEC_PACK_FIX_TRUNC_EXPR
+@tindex VEC_PACK_FLOAT_EXPR
+@tindex VEC_PACK_SAT_EXPR
+@tindex VEC_PACK_TRUNC_EXPR
+@tindex VEC_PERM_EXPR
+@tindex VEC_SERIES_EXPR
+@tindex VEC_UNPACK_FIX_TRUNC_HI_EXPR
+@tindex VEC_UNPACK_FIX_TRUNC_LO_EXPR
+@tindex VEC_UNPACK_FLOAT_HI_EXPR
+@tindex VEC_UNPACK_FLOAT_LO_EXPR
+@tindex VEC_UNPACK_HI_EXPR
+@tindex VEC_UNPACK_LO_EXPR
+@tindex VEC_WIDEN_MINUS_HI_EXPR
+@tindex VEC_WIDEN_MINUS_LO_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
 @tindex VEC_WIDEN_MULT_LO_EXPR
 @tindex VEC_WIDEN_PLUS_HI_EXPR
 @tindex VEC_WIDEN_PLUS_LO_EXPR
-@tindex VEC_WIDEN_MINUS_HI_EXPR
-@tindex VEC_WIDEN_MINUS_LO_EXPR
-@tindex VEC_UNPACK_HI_EXPR
-@tindex VEC_UNPACK_LO_EXPR
-@tindex VEC_UNPACK_FLOAT_HI_EXPR
-@tindex VEC_UNPACK_FLOAT_LO_EXPR
-@tindex VEC_UNPACK_FIX_TRUNC_HI_EXPR
-@tindex VEC_UNPACK_FIX_TRUNC_LO_EXPR
-@tindex VEC_PACK_TRUNC_EXPR
-@tindex VEC_PACK_SAT_EXPR
-@tindex VEC_PACK_FIX_TRUNC_EXPR
-@tindex VEC_PACK_FLOAT_EXPR
-@tindex VEC_COND_EXPR
-@tindex SAD_EXPR
 
 @table @code
+@item SAD_EXPR
+This node represents the Sum of Absolute Differences operation.  The three
+operands must be vectors of integral types.  The first and second operand
+must have the same type.  The size of the vector element of the third
+operand must be at least twice the size of the vector element of the
+first and second one.  The SAD is calculated between the first and second
+operands, added to the third operand, and returned.
+
+@item VEC_COND_EXPR
+These nodes represent @code{?:} expressions.  The three operands must be
+vectors of the same size and number of elements.  The second and third
+operands must have the same type as the entire expression.  The first
+operand is of signed integral vector type.  If an element of the first
+operand evaluates to a zero value, the corresponding element of the
+result is taken from the third operand. If it evaluates to a minus one
+value, it is taken from the second operand. It should never evaluate to
+any other value currently, but optimizations should not rely on that
+property. In contrast with a @code{COND_EXPR}, all operands are always
+evaluated.
+
 @item VEC_DUPLICATE_EXPR
 This node has a single operand and represents a vector in which every
 element is equal to that operand.
 
+@item VEC_LSHIFT_EXPR
+@itemx VEC_RSHIFT_EXPR
+These nodes represent whole vector left and right shifts, respectively.
+The first operand is the vector to shift; it will always be of vector type.
+The second operand is an expression for the number of bits by which to
+shift.  Note that the result is undefined if the second operand is larger
+than or equal to the first operand's type size.
+
+@item VEC_PACK_FIX_TRUNC_EXPR
+This node represents packing of elements of the two input vectors into the
+output vector, where the values are converted from floating point
+to fixed point.  Input operands are vectors that contain the same number
+of elements of a floating point type.  The result is a vector that contains
+twice as many elements of an integral type whose size is half as wide.  The
+elements of the two vectors are merged (concatenated) to form the output
+vector.
+
+@item VEC_PACK_FLOAT_EXPR
+This node represents packing of elements of the two input vectors into the
+output vector, where the values are converted from fixed point to floating
+point.  Input operands are vectors that contain the same number of elements
+of an integral type.  The result is a vector that contains twice as many
+elements of floating point type whose size is half as wide.  The elements of
+the two vectors are merged (concatenated) to form the output vector.
+
+@item VEC_PACK_SAT_EXPR
+This node represents packing of elements of the two input vectors into the
+output vector using saturation.  Input operands are vectors that contain
+the same number of elements of the same integral type.  The result is a
+vector that contains twice as many elements of an integral type whose size
+is half as wide.  The elements of the two vectors are demoted and merged
+(concatenated) to form the output vector.
+
+@item VEC_PACK_TRUNC_EXPR
+This node represents packing of truncated elements of the two input vectors
+into the output vector.  Input operands are vectors that contain the same
+number of elements of the same integral or floating point type.  The result
+is a vector that contains twice as many elements of an integral or floating
+point type whose size is half as wide. The elements of the two vectors are
+demoted and merged (concatenated) to form the output vector.
+
+@item VEC_PERM_EXPR
+This node represents a vector permute/blend operation.  The three operands
+must be vectors of the same number of elements.  The first and second
+operands must be vectors of the same type as the entire expression, and
+the third operand, @dfn{selector}, must be an integral vector type.
+
+The input elements are numbered from 0 in operand 1 through
+@math{2*@var{N}-1} in operand 2.  The elements of the selector are
+interpreted modulo @math{2*@var{N}}.
+
+The expression
+@code{@var{out} = VEC_PERM_EXPR<@var{v0}, @var{v1}, @var{selector}>}, 
+where @var{v0}, @var{v1} and @var{selector} have @var{N} elements, means
+@smallexample
+  for (int i = 0; i < N; i++)
+    @{
+      int j = selector[i] % (2*N);
+      out[i] = j < N ? v0[j] : v1[j-N];
+    @}
+@end smallexample
+
 @item VEC_SERIES_EXPR
 This node represents a vector formed from a scalar base and step,
 given as the first and second operands respectively.  Element @var{i}
@@ -1841,13 +1924,54 @@ of the result is equal to @samp{@var{base} + @var{i}*@var{step}}.
 This node is restricted to integral types, in order to avoid
 specifying the rounding behavior for floating-point types.
 
-@item VEC_LSHIFT_EXPR
-@itemx VEC_RSHIFT_EXPR
-These nodes represent whole vector left and right shifts, respectively.
-The first operand is the vector to shift; it will always be of vector type.
-The second operand is an expression for the number of bits by which to
-shift.  Note that the result is undefined if the second operand is larger
-than or equal to the first operand's type size.
+@item VEC_UNPACK_FIX_TRUNC_HI_EXPR
+@itemx VEC_UNPACK_FIX_TRUNC_LO_EXPR
+These nodes represent unpacking of the high and low parts of the input vector,
+where the values are truncated from floating point to fixed point.  The
+single operand is a vector that contains @code{N} elements of the same
+floating point type.  The result is a vector that contains half as many
+elements of an integral type whose size is twice as wide.  In the case of
+@code{VEC_UNPACK_FIX_TRUNC_HI_EXPR} the high @code{N/2} elements of the
+vector are extracted and converted with truncation.  In the case of
+@code{VEC_UNPACK_FIX_TRUNC_LO_EXPR} the low @code{N/2} elements of the
+vector are extracted and converted with truncation.
+
+@item VEC_UNPACK_FLOAT_HI_EXPR
+@itemx VEC_UNPACK_FLOAT_LO_EXPR
+These nodes represent unpacking of the high and low parts of the input vector,
+where the values are converted from fixed point to floating point.  The
+single operand is a vector that contains @code{N} elements of the same
+integral type.  The result is a vector that contains half as many elements
+of a floating point type whose size is twice as wide.  In the case of
+@code{VEC_UNPACK_FLOAT_HI_EXPR} the high @code{N/2} elements of the vector are
+extracted, converted and widened.  In the case of @code{VEC_UNPACK_FLOAT_LO_EXPR}
+the low @code{N/2} elements of the vector are extracted, converted and widened.
+
+@item VEC_UNPACK_HI_EXPR
+@itemx VEC_UNPACK_LO_EXPR
+These nodes represent unpacking of the high and low parts of the input vector,
+respectively.  The single operand is a vector that contains @code{N} elements
+of the same integral or floating point type.  The result is a vector
+that contains half as many elements, of an integral or floating point type
+whose size is twice as wide.  In the case of @code{VEC_UNPACK_HI_EXPR} the
+high @code{N/2} elements of the vector are extracted and widened (promoted).
+In the case of @code{VEC_UNPACK_LO_EXPR} the low @code{N/2} elements of the
+vector are extracted and widened (promoted).
+
+@item VEC_WIDEN_MINUS_HI_EXPR
+@itemx VEC_WIDEN_MINUS_LO_EXPR
+These nodes represent widening vector subtraction of the high and low parts of
+the two input vectors, respectively.  Their operands are vectors that contain
+the same number of elements (@code{N}) of the same integral type. The high/low
+elements of the second vector are subtracted from the high/low elements of the
+first. The result is a vector that contains half as many elements, of an
+integral type whose size is twice as wide.  In the case of
+@code{VEC_WIDEN_MINUS_HI_EXPR} the high @code{N/2} elements of the second
+vector are subtracted from the high @code{N/2} of the first to produce the
+vector of @code{N/2} products.  In the case of
+@code{VEC_WIDEN_MINUS_LO_EXPR} the low @code{N/2} elements of the second
+vector are subtracted from the low @code{N/2} of the first to produce the
+vector of @code{N/2} products.
 
 @item VEC_WIDEN_MULT_HI_EXPR
 @itemx VEC_WIDEN_MULT_LO_EXPR
@@ -1873,108 +1997,6 @@ is twice as wide.  In the case of @code{VEC_WIDEN_PLUS_HI_EXPR} the high
 @code{N/2} elements of the two vectors are added to produce the vector of
 @code{N/2} products.
 
-@item VEC_WIDEN_MINUS_HI_EXPR
-@itemx VEC_WIDEN_MINUS_LO_EXPR
-These nodes represent widening vector subtraction of the high and low parts of
-the two input vectors, respectively.  Their operands are vectors that contain
-the same number of elements (@code{N}) of the same integral type. The high/low
-elements of the second vector are subtracted from the high/low elements of the
-first. The result is a vector that contains half as many elements, of an
-integral type whose size is twice as wide.  In the case of
-@code{VEC_WIDEN_MINUS_HI_EXPR} the high @code{N/2} elements of the second
-vector are subtracted from the high @code{N/2} of the first to produce the
-vector of @code{N/2} products.  In the case of
-@code{VEC_WIDEN_MINUS_LO_EXPR} the low @code{N/2} elements of the second
-vector are subtracted from the low @code{N/2} of the first to produce the
-vector of @code{N/2} products.
-
-@item VEC_UNPACK_HI_EXPR
-@itemx VEC_UNPACK_LO_EXPR
-These nodes represent unpacking of the high and low parts of the input vector,
-respectively.  The single operand is a vector that contains @code{N} elements
-of the same integral or floating point type.  The result is a vector
-that contains half as many elements, of an integral or floating point type
-whose size is twice as wide.  In the case of @code{VEC_UNPACK_HI_EXPR} the
-high @code{N/2} elements of the vector are extracted and widened (promoted).
-In the case of @code{VEC_UNPACK_LO_EXPR} the low @code{N/2} elements of the
-vector are extracted and widened (promoted).
-
-@item VEC_UNPACK_FLOAT_HI_EXPR
-@itemx VEC_UNPACK_FLOAT_LO_EXPR
-These nodes represent unpacking of the high and low parts of the input vector,
-where the values are converted from fixed point to floating point.  The
-single operand is a vector that contains @code{N} elements of the same
-integral type.  The result is a vector that contains half as many elements
-of a floating point type whose size is twice as wide.  In the case of
-@code{VEC_UNPACK_FLOAT_HI_EXPR} the high @code{N/2} elements of the vector are
-extracted, converted and widened.  In the case of @code{VEC_UNPACK_FLOAT_LO_EXPR}
-the low @code{N/2} elements of the vector are extracted, converted and widened.
-
-@item VEC_UNPACK_FIX_TRUNC_HI_EXPR
-@itemx VEC_UNPACK_FIX_TRUNC_LO_EXPR
-These nodes represent unpacking of the high and low parts of the input vector,
-where the values are truncated from floating point to fixed point.  The
-single operand is a vector that contains @code{N} elements of the same
-floating point type.  The result is a vector that contains half as many
-elements of an integral type whose size is twice as wide.  In the case of
-@code{VEC_UNPACK_FIX_TRUNC_HI_EXPR} the high @code{N/2} elements of the
-vector are extracted and converted with truncation.  In the case of
-@code{VEC_UNPACK_FIX_TRUNC_LO_EXPR} the low @code{N/2} elements of the
-vector are extracted and converted with truncation.
-
-@item VEC_PACK_TRUNC_EXPR
-This node represents packing of truncated elements of the two input vectors
-into the output vector.  Input operands are vectors that contain the same
-number of elements of the same integral or floating point type.  The result
-is a vector that contains twice as many elements of an integral or floating
-point type whose size is half as wide. The elements of the two vectors are
-demoted and merged (concatenated) to form the output vector.
-
-@item VEC_PACK_SAT_EXPR
-This node represents packing of elements of the two input vectors into the
-output vector using saturation.  Input operands are vectors that contain
-the same number of elements of the same integral type.  The result is a
-vector that contains twice as many elements of an integral type whose size
-is half as wide.  The elements of the two vectors are demoted and merged
-(concatenated) to form the output vector.
-
-@item VEC_PACK_FIX_TRUNC_EXPR
-This node represents packing of elements of the two input vectors into the
-output vector, where the values are converted from floating point
-to fixed point.  Input operands are vectors that contain the same number
-of elements of a floating point type.  The result is a vector that contains
-twice as many elements of an integral type whose size is half as wide.  The
-elements of the two vectors are merged (concatenated) to form the output
-vector.
-
-@item VEC_PACK_FLOAT_EXPR
-This node represents packing of elements of the two input vectors into the
-output vector, where the values are converted from fixed point to floating
-point.  Input operands are vectors that contain the same number of elements
-of an integral type.  The result is a vector that contains twice as many
-elements of floating point type whose size is half as wide.  The elements of
-the two vectors are merged (concatenated) to form the output vector.
-
-@item VEC_COND_EXPR
-These nodes represent @code{?:} expressions.  The three operands must be
-vectors of the same size and number of elements.  The second and third
-operands must have the same type as the entire expression.  The first
-operand is of signed integral vector type.  If an element of the first
-operand evaluates to a zero value, the corresponding element of the
-result is taken from the third operand. If it evaluates to a minus one
-value, it is taken from the second operand. It should never evaluate to
-any other value currently, but optimizations should not rely on that
-property. In contrast with a @code{COND_EXPR}, all operands are always
-evaluated.
-
-@item SAD_EXPR
-This node represents the Sum of Absolute Differences operation.  The three
-operands must be vectors of integral types.  The first and second operand
-must have the same type.  The size of the vector element of the third
-operand must be at lease twice of the size of the vector element of the
-first and second one.  The SAD is calculated between the first and second
-operands, added to the third operand, and returned.
-
 @end table
 
 
@@ -2698,12 +2720,12 @@ optimization options specified on the command line.
 @end ftable
 
 @c ---------------------------------------------------------------------
-@c Language-dependent trees
+@c Language-dependent Trees
 @c ---------------------------------------------------------------------
 
-@node Language-dependent trees
-@section Language-dependent trees
-@cindex language-dependent trees
+@node Language-dependent Trees
+@section Language-dependent Trees
+@cindex language-dependent Trees
 
 Front ends may wish to keep some state associated with various GENERIC
 trees while parsing.  To support this, trees provide a set of flags
  

Comments

Richard Biener Feb. 6, 2023, 7:10 a.m. UTC | #1
On Sat, Feb 4, 2023 at 9:35 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> This patch (primarily) documents the VEC_PERM_EXPR tree code in
> generic.texi.  For ease of review, it is provided below as a pair
> of diffs.  The first contains just the new text added to describe
> VEC_PERM_EXPR, the second tidies up this part of the documentation
> by sorting the tree codes into alphabetical order, and providing
> consistent section naming/capitalization, so changing this section
> from "Vectors" to "Vector Expressions" (matching the nearby
> "Unary and Binary Expressions").
>
> Tested with make pdf and make html on x86_64-pc-linux-gnu.
> The reviewer(s) can decide whether to approve just the new content,
> or the content+clean-up.  Ok for mainline?

+@item VEC_PERM_EXPR
+This node represents a vector permute/blend operation.  The three operands
+must be vectors of the same number of elements.  The first and second
+operands must be vectors of the same type as the entire expression,

This was recently relaxed for constant permutes: in that case the first and
second operands only have to have the same element type as the result.
See tree-cfg.cc:verify_gimple_assign_ternary.

The description that follows becomes a bit more awkward in that case, and
for rhs1/rhs2 with a different number of elements the modulo interpretation
doesn't hold - I believe we require in-bounds elements for constant
permutes.  Richard can probably clarify things here.

Thanks,
Richard.
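
To make the relaxed constant-permute case concrete, here is a minimal C
sketch on plain arrays.  It is an illustration only, not GCC code: the name
vec_perm_mixed_model and its parameters are hypothetical, and it assumes
(following the hedged comment above) that both inputs have m elements, that
the selector and result have n elements, and that every selector element
must be in bounds, i.e. less than 2*m.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a constant permute whose inputs (m elements each)
   may differ in length from the selector and result (n elements each).
   Input elements are numbered 0..m-1 in v0 and m..2*m-1 in v1.
   Returns false, leaving OUT untouched, if any selector element is out of
   bounds; no modulo reduction is applied in this model.  */
static bool
vec_perm_mixed_model (const int *v0, const int *v1, size_t m,
                      const unsigned *sel, int *out, size_t n)
{
  assert (m > 0 && n > 0);

  /* Assumption: out-of-bounds selector elements are rejected.  */
  for (size_t i = 0; i < n; i++)
    if (sel[i] >= 2 * m)
      return false;

  for (size_t i = 0; i < n; i++)
    out[i] = sel[i] < m ? v0[sel[i]] : v1[sel[i] - m];
  return true;
}
```

Under these assumptions, two 4-element inputs and the 8-element selector
{0, 1, 2, 3, 0, 1, 2, 3} yield an 8-element result that repeats the first
input twice, while a selector containing 8 is rejected.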

>
>
> 2023-02-04  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * doc/generic.texi <Expression Trees>: Standardize capitalization
>         of section titles from "Expression trees".
>         <Language-dependent Trees>: Likewise standardize capitalization
>         from "Language-dependent trees".
>         <Constant Expressions>: Capitalized from "Constant expressions".
>         <Vector Expressions>: Standardized section name from "Vectors".
>         Document VEC_PERM_EXPR tree code.  Sort tree codes alphabetically.
>
>
> Thanks in advance,
> Roger
> --
>
  
Richard Sandiford Feb. 6, 2023, 12:22 p.m. UTC | #2
Richard Biener <richard.guenther@gmail.com> writes:
> On Sat, Feb 4, 2023 at 9:35 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>>
>>
>> This patch (primarily) documents the VEC_PERM_EXPR tree code in
>> generic.texi.  For ease of review, it is provided below as a pair
>> of diffs.  The first contains just the new text added to describe
>> VEC_PERM_EXPR, the second tidies up this part of the documentation
>> by sorting the tree codes into alphabetical order, and providing
>> consistent section naming/capitalization, so changing this section
>> from "Vectors" to "Vector Expressions" (matching the nearby
>> "Unary and Binary Expressions").
>>
>> Tested with make pdf and make html on x86_64-pc-linux-gnu.
>> The reviewer(s) can decide whether to approve just the new content,
>> or the content+clean-up.  Ok for mainline?
>
> +@item VEC_PERM_EXPR
> +This node represents a vector permute/blend operation.  The three operands
> +must be vectors of the same number of elements.  The first and second
> +operands must be vectors of the same type as the entire expression,
>
> This was recently relaxed for constant permutes: in that case the first and
> second operands only have to have the same element type as the result.
> See tree-cfg.cc:verify_gimple_assign_ternary.
>
> The description that follows becomes a bit more awkward in that case, and
> for rhs1/rhs2 with a different number of elements the modulo interpretation
> doesn't hold - I believe we require in-bounds elements for constant
> permutes.  Richard can probably clarify things here.

I thought that the modulo behaviour still applies when the node has a
constant selector, it's just that the in-range form is the canonical one.

With variable-length vectors, I think it's in principle possible to have
a stepped constant selector whose start elements are in-range but whose
final elements aren't (and instead wrap around when applied).
E.g. the selector could zip the last quarter of the inputs followed
by the first quarter.

Thanks,
Richard
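
The modulo behaviour described above can be sketched in C on plain arrays.
This mirrors the pseudocode from the patch; the name vec_perm_model is
hypothetical, selector values are assumed to be non-negative, and
fixed-length arrays stand in for real (possibly variable-length) vector
types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of VEC_PERM_EXPR <v0, v1, sel> where all three
   operands and the result have n elements.  Input elements are numbered
   0..n-1 in v0 and n..2*n-1 in v1; each selector element is reduced
   modulo 2*n before use, so out-of-range values wrap around.  */
static void
vec_perm_model (const int *v0, const int *v1, const unsigned *sel,
                int *out, size_t n)
{
  assert (n > 0);
  for (size_t i = 0; i < n; i++)
    {
      size_t j = sel[i] % (2 * n);
      out[i] = j < n ? v0[j] : v1[j - n];
    }
}
```

For example, with n = 4 the stepped selector {6, 7, 8, 9} starts in range
and wraps on its last two elements, so under this model it selects the same
elements as the in-range selector {6, 7, 0, 1}.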
  
Roger Sayle Feb. 6, 2023, 2:44 p.m. UTC | #3
Perhaps I'm missing something (I'm not too familiar with SVE semantics), but
is there a reason that the solution for PR96463 uses a VEC_PERM_EXPR and not
just a VEC_DUPLICATE_EXPR?  The folding of svld1rq (svptrue_..., ...) doesn't
seem to require either the blending or the permutation functionality of a
VEC_PERM_EXPR.  Instead, it seems to be misusing (the modified) VEC_PERM_EXPR
as a form of VIEW_CONVERT_EXPR that allows us to convert/mismatch the type of
the operands to the type of the result.

Conceptually (as in Richard's original motivation for the PR),
svint32_t foo (int32x4_t x) { return svld1rq (svptrue_b8 (), &x[0]); }
can be optimized to (something like)
svint32_t foo (int32x4_t x) { return svdup_32 (x[0]); }  // or dup z0.q, z0.q[0] equivalent
hence it makes sense for fold to transform the gimple form of the first
into the gimple form of the second(?)

Just curious.
Roger
--

  
Prathamesh Kulkarni Feb. 7, 2023, 8:07 a.m. UTC | #4
On Mon, 6 Feb 2023 at 20:14, Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> Perhaps I'm missing something (I'm not too familiar with SVE semantics), but
> is there a reason that the solution for PR96463 uses a VEC_PERM_EXPR and not
> just a VEC_DUPLICATE_EXPR?  The folding of svld1rq (svptrue_..., ...) doesn't
> seem to require either the blending or the permutation functionality of a
> VEC_PERM_EXPR.  Instead, it seems to be misusing (the modified) VEC_PERM_EXPR
> as a form of VIEW_CONVERT_EXPR that allows us to convert/mismatch the type of
> the operands to the type of the result.
Hi,
I am not sure if we could use VEC_DUPLICATE_EXPR for the PR96463 case as-is.
Perhaps we could extend VEC_DUPLICATE_EXPR to take N operands,
so the resulting vector has npatterns = N, nelts_per_pattern = 1?
AFAIU, extending VEC_PERM_EXPR to handle vectors with different lengths
would allow for more optimization opportunities besides PR96463.
>
> Conceptually (as in Richard's original motivation for the PR),
> svint32_t foo (int32x4_t x) { return svld1rq (svptrue_b8 (), &x[0]); }
> can be optimized to (something like)
> svint32_t foo (int32x4_t x) { return svdup_32 (x[0]); }  // or dup z0.q, z0.q[0] equivalent
I guess that should be equivalent to svdupq_s32 (x[0], x[1], x[2], x[3])?

Thanks,
Prathamesh
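
The distinction being drawn here can be sketched in plain C.  This is a
model under stated assumptions, not ACLE code: replicate_quad_model and
dup_scalar_model are hypothetical names, lanes stands in for the runtime
number of 32-bit lanes, and the model takes svld1rq to replicate a 128-bit
block of four 32-bit elements across the whole vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a quadword replicate: the four-element block x is
   repeated across all lanes, i.e. out[i] = x[i % 4].  This is the same as
   a permute of x with the selector {0, 1, 2, 3, 0, 1, 2, 3, ...}.  */
static void
replicate_quad_model (const int32_t x[4], int32_t *out, size_t lanes)
{
  assert (lanes > 0 && lanes % 4 == 0);
  for (size_t i = 0; i < lanes; i++)
    out[i] = x[i % 4];
}

/* Hypothetical model of a scalar broadcast: every lane receives s.  */
static void
dup_scalar_model (int32_t s, int32_t *out, size_t lanes)
{
  assert (lanes > 0);
  for (size_t i = 0; i < lanes; i++)
    out[i] = s;
}
```

In this model the two operations agree only when all four elements of x are
equal, which is why a broadcast of x[0] alone does not capture the
replicated-quadword result.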



> hence it makes sense for fold to transform the gimple form of the first
> into the gimple form of the second(?)
>
> Just curious.
> Roger
> --
>
  
Sandra Loosemore March 12, 2023, 1:37 a.m. UTC | #5
On 2/4/23 13:33, Roger Sayle wrote:
> 
> This patch (primarily) documents the VEC_PERM_EXPR tree code in
> generic.texi.  For ease of review, it is provided below as a pair
> of diffs.  The first contains just the new text added to describe
> VEC_PERM_EXPR, the second tidies up this part of the documentation
> by sorting the tree codes into alphabetical order, and providing
> consistent section naming/capitalization, so changing this section
> from "Vectors" to "Vector Expressions" (matching the nearby
> "Unary and Binary Expressions").
> 
> Tested with make pdf and make html on x86_64-pc-linux-gnu.
> The reviewer(s) can decide whether to approve just the new content,
> or the content+clean-up.  Ok for mainline?
> 
> 
> 2023-02-04  Roger Sayle  <roger@nextmovesoftware.com>
> 
> gcc/ChangeLog
> 	* doc/generic.texi <Expression Trees>: Standardize capitalization
> 	of section titles from "Expression trees".
> 	<Language-dependent Trees>: Likewise standardize capitalization
> 	from "Language-dependent trees".
> 	<Constant Expressions>: Capitalized from "Constant expressions".
> 	<Vector Expressions>: Standardized section name from "Vectors".
> 	Document VEC_PERM_EXPR tree code.  Sort tree codes alphabetically.

Trying to catch up on old mail here....

IIUC the proposed VEC_PERM_EXPR wording was rejected on technical 
grounds.  I confess I know nothing about this, so I can't usefully 
suggest alternate wording myself.  :-(

The other changes look OK except that the correct capitalization would 
be "Language-Dependent Trees".

-Sandra
  

Patch

diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index 3f52d30..4e8f131 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1826,6 +1826,7 @@  a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
 @tindex VEC_PACK_FIX_TRUNC_EXPR
 @tindex VEC_PACK_FLOAT_EXPR
 @tindex VEC_COND_EXPR
+@tindex VEC_PERM_EXPR
 @tindex SAD_EXPR
 
 @table @code
@@ -1967,6 +1968,27 @@  any other value currently, but optimizations should not rely on that
 property. In contrast with a @code{COND_EXPR}, all operands are always
 evaluated.
 
+@item VEC_PERM_EXPR
+This node represents a vector permute/blend operation.  The three operands
+must be vectors of the same number of elements.  The first and second
+operands must be vectors of the same type as the entire expression, and
+the third operand, @dfn{selector}, must be an integral vector type.
+
+The input elements are numbered from 0 in operand 1 through
+@math{2*@var{N}-1} in operand 2.  The elements of the selector are
+interpreted modulo @math{2*@var{N}}.
+
+The expression
+@code{@var{out} = VEC_PERM_EXPR<@var{v0}, @var{v1}, @var{selector}>}, 
+where @var{v0}, @var{v1} and @var{selector} have @var{N} elements, means
+@smallexample
+  for (int i = 0; i < N; i++)
+    @{
+      int j = selector[i] % (2*N);
+      out[i] = j < N ? v0[j] : v1[j-N];
+    @}
+@end smallexample
+
 @item SAD_EXPR
 This node represents the Sum of Absolute Differences operation.  The three
 operands must be vectors of integral types.  The first and second operand