On Fri, Feb 18, 2022 at 7:20 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> This patch improves GCC's scalar evolution and final value replacement
> optimizations by supporting triangular/quadratic/trapezoid chrecs, which
> resolves both PR middle-end/65855 and PR c/80852, but alas not (yet)
> PR tree-optimization/46186.
>
> I've listed Richard Biener as co-author, as this solution is based
> heavily on his proposed patch in comment #4 of PR 65855 from 2015,
> but with several significant changes. The most important change is
> a correction to the formula used. For the scalar evolution {a, +, {b, +, c}},
> there was an off-by-one error, so chrec_apply should not return
> a + b*x + c*x*(x+1)/2, but a + b*x + c*x*(x-1)/2, which explains
> why the original patch didn't produce the expected results.
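>
> For the record, the corrected formula follows by summing the
> increments: after x iterations the additions applied are b, b+c,
> b+2c, ..., b+(x-1)*c, which total b*x + c*(0+1+...+(x-1))
> = b*x + c*x*(x-1)/2, hence the x-1 rather than x+1.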
>
> Another significant set of changes involves handling the various
> type mismatches and overflows. In chrec_apply, the evolution is
> evaluated after x iterations (where x has an unsigned integer type,
> called orig_type in this patch), but a, b and c may be of any type,
> including floating point (see PR tree-opt/23391) and including
> types that trap on overflow with -ftrapv (see PR tree-opt/79721),
> and in the case of pointer arithmetic, b and c may have a
> different type (sizetype) from a! Additionally, the division by
> two in "c*x*(x-1)/2" requires the correct top bit in modulo
> arithmetic, which means that the multiplication typically needs
> to be performed with more precision (in a wider mode) than
> orig_type [unless c is an even integer constant, or x (the
> number of iterations) is a known constant at compile-time].
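>
> As a concrete example of the lost top bit: with a 32-bit unsigned
> x of 65537, x*(x-1) is 2^32 + 2^16, which wraps to 2^16, so halving
> yields 2^15 rather than the correct 2^31 + 2^15.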
>
> Finally, final value replacement is only an effective optimization
> if the expression used to compute the result of the loop is cheaper
> than the loop itself, hence chrec_apply needs to return an optimized
> folded tree with the minimal number of operators. For example, when
> b == c, this patch calculates "a + c*x*(x+1)/2" (since b*x +
> b*x*(x-1)/2 folds to b*x*(x+1)/2); when c is even, we can perform
> the division at compile-time; and when x is a non-trivial expression,
> we wrap it in a SAVE_EXPR so that the lowering to gimple can reuse
> the common subexpression.
>
> Correctly handling all of the corner cases results in a patch
> significantly larger than the original "proof-of-concept".
> There's one remaining refinement, marked as TODO in the patch,
> which is to also support unsigned 64-bit to 128-bit widening
> multiplications (umulditi3) on targets that provide them, such
> as x86_64 (the approach used by LLVM); this patch provides the
> core target-independent functionality. [That last piece would
> resolve the testcase in PR tree-opt/46186, which requires 128-bit
> TImode operations that are not available on all backend targets.]
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures. Ok for mainline?
+/* Fold the (exact) division of chrec by two. */
+
+static tree
+chrec_fold_divide_by_2 (tree type, tree op)
+{
+ if (SCALAR_FLOAT_TYPE_P (type))
+ return fold_build2 (MULT_EXPR, type, op, build_real (type, dconsthalf));
+ return fold_build2 (RSHIFT_EXPR, type, op, build_int_cst (type, 1));
any reason to not use EXACT_DIV_EXPR?
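For the integer case that would be something like (untested):

  return fold_build2 (EXACT_DIV_EXPR, type, op,
                      build_int_cst (type, 2));

which also preserves the exact-division property for later folding.
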
+/* Indicate that a chrec subexpression is used multiple times. */
+
+static tree
+chrec_save_expr (tree t)
+{
+ if (is_gimple_min_invariant (t))
+ return t;
+ t = save_expr (t);
+ if (TREE_CODE (t) == SAVE_EXPR)
+ TREE_SIDE_EFFECTS (t) = 0;
+ return t;
why clear TREE_SIDE_EFFECTS here?  IIRC SCEV analysis simply
re-uses subtrees (thus generates a graph) without wrapping the
shared leaves in SAVE_EXPR (and then unshare_expr before
gimplification blows it all up again, of course...).
+ /* Determine whether c is even. */
+ tree half_c = NULL_TREE;
+ if (TREE_CODE (c) == INTEGER_CST
+ && (wi::to_widest (c) & 1) == 0)
wi::to_wide should work as well and is cheaper
+ /* Try to narrow the original type, by stripping zero extensions. */
+ tree orig_type = TREE_TYPE (x);
+ if (TREE_CODE (x) == NOP_EXPR)
+ {
so we do have get_narrower (), can we use that?
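Untested sketch of what I have in mind, assuming get_narrower's
unsignedp output is enough to tell us the stripped conversions
were zero extensions:

  int unsignedp;
  tree nx = get_narrower (x, &unsignedp);
  if (unsignedp && TREE_TYPE (nx) != TREE_TYPE (x))
    {
      orig_type = TREE_TYPE (nx);
      x = nx;
    }
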
+ tree wide_type;
+ /* Prefer to perform multiplications in type CTYPE. */
+ if (INTEGRAL_TYPE_P (ctype)
+ && TYPE_UNSIGNED (ctype)
+ && TYPE_PRECISION (ctype) > prec)
+ wide_type = ctype;
+ else if (TYPE_PRECISION (unsigned_type_node) > prec)
+ wide_type = unsigned_type_node;
+ else if (TYPE_PRECISION (long_unsigned_type_node) > prec)
+ wide_type = long_unsigned_type_node;
+ else if (TYPE_PRECISION (long_long_unsigned_type_node) > prec)
+ wide_type = long_long_unsigned_type_node;
+ else
+ /* TODO: Try TImode on targets that support umul_widen_optab. */
+ return chrec_dont_know;
can we use mode_for_size () and type_for_mode () instead of hard-coding
these? You can look for a umul_optab if we do not want to risk
libgcc calls.
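Untested, but I was thinking of something along these lines (using
smul_optab for the low-part multiply, which is sign-agnostic):

  scalar_int_mode wide_mode;
  if (INTEGRAL_TYPE_P (ctype)
      && TYPE_UNSIGNED (ctype)
      && TYPE_PRECISION (ctype) > prec)
    wide_type = ctype;
  else if (int_mode_for_size (prec + 1, 0).exists (&wide_mode)
           && optab_handler (smul_optab, wide_mode) != CODE_FOR_nothing)
    wide_type = lang_hooks.types.type_for_mode (wide_mode, 1);
  else
    /* TODO: Try TImode on targets that support umul_widen_optab.  */
    return chrec_dont_know;
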
There's some code that looks like it can be factored. Maybe my eyes
are mis-matching things of course:
+ res = chrec_fold_plus (wide_type, save_x, one);
+ res = chrec_fold_multiply (wide_type, res, save_x);
+ if (half_c)
+ {
+ res = chrec_convert (ctype, res, NULL);
+ res = chrec_fold_multiply (ctype, half_c, res);
+ }
+ else
+ {
+ res = chrec_fold_divide_by_2 (wide_type, res);
+ res = chrec_convert (ctype, res, NULL);
+ res = chrec_fold_multiply (ctype, b, res);
+ }
+ /* + a */
+ if (ctype != type)
+ res = chrec_convert (type, res, NULL);
+ res = chrec_fold_plus (TREE_TYPE (a), a, res);
+ return res;
looks like a common tail with one optional + b*x?
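Completely untested, but a helper along these lines (the name is
just a placeholder) looks like it could serve all three copies:

  /* Finish A [+ B*X] + M*T/2 in TYPE, where T was computed in
     WIDE_TYPE and HALF_M, if non-NULL, is M/2.  B may be NULL_TREE
     when no linear term is wanted.  */
  static tree
  chrec_apply_tail (tree type, tree ctype, tree wide_type,
                    tree a, tree b, tree m, tree half_m,
                    tree t, tree x)
  {
    tree res;
    if (half_m)
      {
        res = chrec_convert (ctype, t, NULL);
        res = chrec_fold_multiply (ctype, half_m, res);
      }
    else
      {
        res = chrec_fold_divide_by_2 (wide_type, t);
        res = chrec_convert (ctype, res, NULL);
        res = chrec_fold_multiply (ctype, m, res);
      }
    if (b)
      {
        /* + b*x */
        tree tmp = chrec_convert (ctype, x, NULL);
        tmp = chrec_fold_multiply (ctype, b, tmp);
        res = chrec_fold_plus (ctype, res, tmp);
      }
    /* + a */
    if (ctype != type)
      res = chrec_convert (type, res, NULL);
    return chrec_fold_plus (TREE_TYPE (a), a, res);
  }
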
@@ -3469,6 +3472,19 @@ expression_expensive_p (tree expr,
hash_map<tree, uint64_t> &cache,
cost += op_cost + 1;
return false;
+ case tcc_expression:
+ if (code == SAVE_EXPR)
+ {
+ if (expression_expensive_p (TREE_OPERAND (expr, 0), cache, op_cost))
+ return true;
+ /* Assume the SAVE_EXPR is instantiated twice. */
+ op_cost /= 2;
+ *cache.get (expr) += op_cost;
+ cost += op_cost;
+ return false;
+ }
+ return true;
+
default:
return true;
}
I think expression_expensive_p assumes we unshare_expr () (correctly,
because we do...) and thus counts each use of a multi-use expression.
With SAVE_EXPRs it should instead have operated in a
walk_tree_without_duplicates mode.  Maybe you should bite the bullet
and wrap expression_expensive_p, allocating a pointer-set so we
simply do not visit SAVE_EXPRs multiple times (and do not record
them in 'cache')?  I added the cache explicitly to keep
expression_expensive_p from being exponential while still making it
compute the cost that unshare_expr will produce.  Maybe the
SAVE_EXPR behavior can be handled more precisely?
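I.e., something like the following (untested; 'visited' would be a
hash_set<tree> threaded through the recursion alongside 'cache'):

      case tcc_expression:
        if (code == SAVE_EXPR)
          {
            /* Charge the operand only the first time we reach this
               SAVE_EXPR; subsequent uses are free.  */
            if (visited.add (expr))
              return false;
            return expression_expensive_p (TREE_OPERAND (expr, 0),
                                           cache, visited, cost);
          }
        return true;
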
Looks like stage1 material btw.
Thanks,
Richard.
>
> 2022-02-18 Roger Sayle <roger@nextmovesoftware.com>
> Richard Biener <rguenther@suse.de>
>
> gcc/ChangeLog
> PR target/65855
> PR c/80852
> * tree-chrec.cc (chrec_fold_divide_by_2): New function to
> divide a chrec by two, honoring the type of the chrec.
> (chrec_save_expr): New function to wrap a chrec in a
> SAVE_EXPR, but without TREE_SIDE_EFFECTS.
> (chrec_apply_triangular): New helper function of chrec_apply
> to evaluate a quadratic/triangular chrec.
> (chrec_apply): Expand/clarify comment before function.
> Handle application of any chrec after zero iterations, i.e. A.
> Recursively handle cases where the iteration count is
> conditional. Handle quadratic/triangular chrecs by calling
> the new chrec_apply_triangular function.
> (chrec_convert_rhs): Handle conversion of integer constants
> to scalar floating point types here (moved from chrec_apply).
> * tree-scalar-evolution.cc (interpret_expr): Handle SAVE_EXPR
> by (tail call) recursion.
> (expression_expensive_p): Calculate the cost of a SAVE_EXPR
> as half the cost of its operand, i.e. assume it is reused.
>
> gcc/testsuite/ChangeLog
> PR target/65855
> PR c/80852
> * gcc.dg/tree-ssa/pr65855.c: New test case.
> * gcc.dg/tree-ssa/pr80852.c: New test case.
> * gcc.dg/vect/vect-iv-11.c: Update to reflect that the loop is
> no longer vectorized, but calculated by final value replacement.
>
> Roger
> --
>
@@ -573,6 +573,246 @@ chrec_evaluate (unsigned var, tree chrec, tree n, unsigned int k)
chrec_convert (ctype, chrec, NULL), binomial_n_k);
}
+
+/* Fold the (exact) division of chrec by two. */
+
+static tree
+chrec_fold_divide_by_2 (tree type, tree op)
+{
+ if (SCALAR_FLOAT_TYPE_P (type))
+ return fold_build2 (MULT_EXPR, type, op, build_real (type, dconsthalf));
+ return fold_build2 (RSHIFT_EXPR, type, op, build_int_cst (type, 1));
+}
+
+
+/* Indicate that a chrec subexpression is used multiple times. */
+
+static tree
+chrec_save_expr (tree t)
+{
+ if (is_gimple_min_invariant (t))
+ return t;
+ t = save_expr (t);
+ if (TREE_CODE (t) == SAVE_EXPR)
+ TREE_SIDE_EFFECTS (t) = 0;
+ return t;
+}
+
+
+/* Helper function of chrec_apply. Evaluate a CHREC of the
+ form {A, +, {B, +, C}} at X to A + B*X + C*X*(X-1)/2.
+ The tricky part is avoiding information loss on overflow
+ and the potential mismatch of types between A, B and C
+ (of type TYPE) and X which is an unsigned integral type. */
+
+static tree
+chrec_apply_triangular (tree type, tree a, tree b, tree c, tree x)
+{
+ tree res, tmp;
+
+ /* Determine whether c is even. */
+ tree half_c = NULL_TREE;
+ if (TREE_CODE (c) == INTEGER_CST
+ && (wi::to_widest (c) & 1) == 0)
+ half_c = chrec_fold_divide_by_2 (type, c);
+
+ /* Avoid introducing trapping arithmetic with -ftrapv, cf. PR 79721. */
+ tree ctype = type;
+ if (INTEGRAL_TYPE_P (type) && !TYPE_OVERFLOW_WRAPS (type))
+ {
+ ctype = unsigned_type_for (type);
+ b = chrec_convert (ctype, b, NULL);
+ c = chrec_convert (ctype, c, NULL);
+ if (half_c)
+ half_c = chrec_convert (ctype, half_c, NULL);
+ }
+
+ /* First, handle the easy case where X is an integer constant,
+ so we can evaluate X*(X-1)/2 using widest_int precision at
+ compile-time. */
+ if (TREE_CODE (x) == INTEGER_CST)
+ {
+ wi::tree_to_widest_ref wix = wi::to_widest (x);
+ widest_int wit = wix*(wix-1)/2;
+ if (operand_equal_p (b, c))
+ {
+ wit += wix;
+ res = SCALAR_FLOAT_TYPE_P (ctype)
+ ? build_real_from_wide (ctype, wit, UNSIGNED)
+ : wide_int_to_tree (ctype, wit);
+ res = chrec_fold_multiply (ctype, b, res);
+ }
+ else
+ {
+ if (SCALAR_FLOAT_TYPE_P (ctype))
+ {
+ res = build_real_from_wide (ctype, wit, UNSIGNED);
+ tmp = build_real_from_wide (ctype, wix, UNSIGNED);
+ }
+ else
+ {
+ res = wide_int_to_tree (ctype, wit);
+ tmp = wide_int_to_tree (ctype, wix);
+ }
+ res = chrec_fold_multiply (ctype, c, res);
+ tmp = chrec_fold_multiply (ctype, b, tmp);
+ res = chrec_fold_plus (ctype, res, tmp);
+ }
+ if (ctype != type)
+ res = chrec_convert (type, res, NULL);
+ res = chrec_fold_plus (TREE_TYPE (a), a, res);
+ return res;
+ }
+
+ /* Alas, calculating C*X*(X-1)/2 can wrap/overflow incorrectly, unless
+ (i) C is an even integer constant, or (ii) the multiplications are
+ performed in a wider mode. */
+
+ /* Try to narrow the original type, by stripping zero extensions. */
+ tree orig_type = TREE_TYPE (x);
+ if (TREE_CODE (x) == NOP_EXPR)
+ {
+ tree nx = TREE_OPERAND (x, 0);
+ tree nt = TREE_TYPE (nx);
+ if (INTEGRAL_TYPE_P (nt)
+ && TYPE_UNSIGNED (nt)
+ && TYPE_PRECISION (nt) < TYPE_PRECISION (orig_type))
+ {
+ orig_type = nt;
+ x = nx;
+ }
+ }
+ else if (TREE_CODE (x) == PLUS_EXPR
+ && integer_all_onesp (TREE_OPERAND (x, 1))
+ && TREE_CODE (TREE_OPERAND (x, 0)) == NOP_EXPR)
+ {
+ /* We know the number of iterations can't be negative. */
+ tree nt = TREE_TYPE (TREE_OPERAND (TREE_OPERAND (x, 0), 0));
+ if (INTEGRAL_TYPE_P (nt)
+ && TYPE_UNSIGNED (nt)
+ && TYPE_PRECISION (nt) < TYPE_PRECISION (orig_type))
+ {
+ orig_type = nt;
+ x = fold_convert (nt, x);
+ }
+ }
+
+ /* We require an unsigned type with more precision than PREC. */
+ int prec = TYPE_PRECISION (orig_type);
+ if (half_c)
+ prec--;
+
+ tree wide_type;
+ /* Prefer to perform multiplications in type CTYPE. */
+ if (INTEGRAL_TYPE_P (ctype)
+ && TYPE_UNSIGNED (ctype)
+ && TYPE_PRECISION (ctype) > prec)
+ wide_type = ctype;
+ else if (TYPE_PRECISION (unsigned_type_node) > prec)
+ wide_type = unsigned_type_node;
+ else if (TYPE_PRECISION (long_unsigned_type_node) > prec)
+ wide_type = long_unsigned_type_node;
+ else if (TYPE_PRECISION (long_long_unsigned_type_node) > prec)
+ wide_type = long_long_unsigned_type_node;
+ else
+ /* TODO: Try TImode on targets that support umul_widen_optab. */
+ return chrec_dont_know;
+
+ /* X is used multiple times, so wrap it in a SAVE_EXPR. */
+ x = chrec_convert_rhs (wide_type, x, NULL);
+ tree save_x = chrec_save_expr (x);
+
+ /* When B == C, "{A, +, {B, +, B}} (x)" -> "A + B*X*(X+1)/2". */
+ if (operand_equal_p (b, c))
+ {
+ /* b*x*(x+1)/2 */
+ tree one = build_int_cst (wide_type, 1);
+ res = chrec_fold_plus (wide_type, save_x, one);
+ res = chrec_fold_multiply (wide_type, res, save_x);
+ if (half_c)
+ {
+ res = chrec_convert (ctype, res, NULL);
+ res = chrec_fold_multiply (ctype, half_c, res);
+ }
+ else
+ {
+ res = chrec_fold_divide_by_2 (wide_type, res);
+ res = chrec_convert (ctype, res, NULL);
+ res = chrec_fold_multiply (ctype, b, res);
+ }
+ /* + a */
+ if (ctype != type)
+ res = chrec_convert (type, res, NULL);
+ res = chrec_fold_plus (TREE_TYPE (a), a, res);
+ return res;
+ }
+
+ /* When 2B is a multiple of C, "{A, +, {B, +, C}} (x)" may be
+ optimized to "A + C*X*(X+TMP)/2" where TMP is "2B/C-1". */
+ if (TREE_CODE (b) == INTEGER_CST && TREE_CODE (c) == INTEGER_CST)
+ {
+ wi::tree_to_widest_ref wic = wi::to_widest (c);
+ wi::tree_to_widest_ref wib = wi::to_widest (b);
+ signop sign = TYPE_SIGN (type);
+ wi::overflow_type overflow = wi::OVF_NONE;
+ widest_int wib2 = wi::add (wib, wib, sign, &overflow);
+ if (wi::multiple_of_p (wib2, wic, sign))
+ {
+ widest_int tmp = wi::div_trunc (wib2, wic, sign, &overflow);
+ tmp = wi::sub (tmp, 1, sign, &overflow);
+ if (!overflow && wi::fits_to_tree_p (tmp, orig_type))
+ {
+ /* c*x*(x+tmp)/2 */
+ res = chrec_fold_plus (wide_type, save_x,
+ wide_int_to_tree (wide_type, tmp));
+ res = chrec_fold_multiply (wide_type, res, save_x);
+ if (half_c)
+ {
+ res = chrec_convert (ctype, res, NULL);
+ res = chrec_fold_multiply (ctype, res, half_c);
+ }
+ else
+ {
+ res = chrec_fold_divide_by_2 (wide_type, res);
+ res = chrec_convert (ctype, res, NULL);
+ res = chrec_fold_multiply (ctype, res, c);
+ }
+ /* + a */
+ if (ctype != type)
+ res = chrec_convert (type, res, NULL);
+ res = chrec_fold_plus (TREE_TYPE (a), a, res);
+ return res;
+ }
+ }
+ }
+
+ /* c*x*(x-1)/2 */
+ tree one = build_int_cst (wide_type, 1);
+ res = chrec_fold_minus (wide_type, save_x, one);
+ res = chrec_fold_multiply (wide_type, res, save_x);
+ if (half_c)
+ {
+ res = chrec_convert (ctype, res, NULL);
+ res = chrec_fold_multiply (ctype, half_c, res);
+ }
+ else
+ {
+ res = chrec_fold_divide_by_2 (wide_type, res);
+ res = chrec_convert (ctype, res, NULL);
+ res = chrec_fold_multiply (ctype, c, res);
+ }
+ /* + b*x */
+ tmp = chrec_convert (ctype, save_x, NULL);
+ tmp = chrec_fold_multiply (ctype, b, tmp);
+ res = chrec_fold_plus (ctype, res, tmp);
+ /* + a */
+ if (ctype != type)
+ res = chrec_convert (type, res, NULL);
+ res = chrec_fold_plus (TREE_TYPE (a), a, res);
+ return res;
+}
+
+
/* Evaluates "CHREC (X)" when the varying variable is VAR.
Example: Given the following parameters,
@@ -581,7 +821,8 @@ chrec_evaluate (unsigned var, tree chrec, tree n, unsigned int k)
x = 10
The result is given by the Newton's interpolating formula:
- 3 * \binom{10}{0} + 4 * \binom{10}{1}.
+ 3 * \binom{10}{0} + 4 * \binom{10}{1}, which is 3 + 4 * 10 = 43.
+ In general, {a, +, b} (x) = a + b*x.
*/
tree
@@ -604,13 +845,12 @@ chrec_apply (unsigned var,
if (dump_file && (dump_flags & TDF_SCEV))
fprintf (dump_file, "(chrec_apply \n");
- if (TREE_CODE (x) == INTEGER_CST && SCALAR_FLOAT_TYPE_P (type))
- x = build_real_from_int_cst (type, x);
-
switch (TREE_CODE (chrec))
{
case POLYNOMIAL_CHREC:
- if (evolution_function_is_affine_p (chrec))
+ if (integer_zerop (x) && CHREC_VARIABLE (chrec) == var)
+ res = CHREC_LEFT (chrec);
+ else if (evolution_function_is_affine_p (chrec))
{
if (CHREC_VARIABLE (chrec) != var)
return build_polynomial_chrec
@@ -623,6 +863,29 @@ chrec_apply (unsigned var,
res = chrec_fold_multiply (TREE_TYPE (x), CHREC_RIGHT (chrec), x);
res = chrec_fold_plus (type, CHREC_LEFT (chrec), res);
}
+ else if (TREE_CODE (x) == COND_EXPR
+ && integer_zerop (TREE_OPERAND (x, 2))
+ && CHREC_VARIABLE (chrec) == var
+ && TREE_CODE (CHREC_LEFT (chrec)) == INTEGER_CST)
+ {
+ /* Optimize "{a, +, ...} (cond ? x : 0)" by hoisting the ternary
+ operator as "cond ? ({ a, +, ...} (x)) : a" for simple "a". */
+ res = chrec_apply (var, chrec, TREE_OPERAND (x, 1));
+ if (res != chrec_dont_know)
+ res = fold_build3 (COND_EXPR, type, TREE_OPERAND (x, 0),
+ res, CHREC_LEFT (chrec));
+ }
+ else if (CHREC_VARIABLE (chrec) == var
+ && evolution_function_is_affine_p (CHREC_RIGHT (chrec))
+ && CHREC_VARIABLE (CHREC_RIGHT (chrec)) == var)
+ {
+ /* "{a, +, {b, +, c} } (x)" -> "a + b*x + c*x*(x-1)/2". */
+ tree chrecr = CHREC_RIGHT (chrec);
+ res = chrec_apply_triangular (TREE_TYPE (chrecr),
+ CHREC_LEFT (chrec),
+ CHREC_LEFT (chrecr),
+ CHREC_RIGHT (chrecr), x);
+ }
else if (TREE_CODE (x) == INTEGER_CST
&& tree_int_cst_sgn (x) == 1)
/* testsuite/.../ssa-chrec-38.c. */
@@ -632,7 +895,7 @@ chrec_apply (unsigned var,
break;
CASE_CONVERT:
- res = chrec_convert (TREE_TYPE (chrec),
+ res = chrec_convert (type,
chrec_apply (var, TREE_OPERAND (chrec, 0), x),
NULL);
break;
@@ -1401,6 +1664,10 @@ chrec_convert_rhs (tree type, tree chrec, gimple *at_stmt)
if (POINTER_TYPE_P (type))
type = sizetype;
+ if (TREE_CODE (chrec) == INTEGER_CST
+ && SCALAR_FLOAT_TYPE_P (type))
+ return build_real_from_int_cst (type, chrec);
+
return chrec_convert (type, chrec, at_stmt);
}
@@ -1901,6 +1901,9 @@ interpret_expr (class loop *loop, gimple *at_stmt, tree expr)
|| get_gimple_rhs_class (TREE_CODE (expr)) == GIMPLE_TERNARY_RHS)
return chrec_dont_know;
+ if (TREE_CODE (expr) == SAVE_EXPR)
+ return interpret_expr (loop, at_stmt, TREE_OPERAND (expr, 0));
+
extract_ops_from_tree (expr, &code, &op0, &op1);
return interpret_rhs_expr (loop, at_stmt, type,
@@ -3469,6 +3472,19 @@ expression_expensive_p (tree expr, hash_map<tree, uint64_t> &cache,
cost += op_cost + 1;
return false;
+ case tcc_expression:
+ if (code == SAVE_EXPR)
+ {
+ if (expression_expensive_p (TREE_OPERAND (expr, 0), cache, op_cost))
+ return true;
+ /* Assume the SAVE_EXPR is instantiated twice. */
+ op_cost /= 2;
+ *cache.get (expr) += op_cost;
+ cost += op_cost;
+ return false;
+ }
+ return true;
+
default:
return true;
}
new file mode 100644
@@ -0,0 +1,14 @@
+/* PR middle-end/65855 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+unsigned long foo(unsigned int n)
+{
+ unsigned long t = 0;
+ for (unsigned long i=1; i<=n; i++)
+ t += i;
+ return t;
+}
+
+/* { dg-final { scan-tree-dump-times " \\* " 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times " goto " 2 "optimized" } } */
new file mode 100644
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+int foo(int num) {
+ int a = 0;
+ for (int x = 0; x < num; x+=2) {
+ if (!(x % 2)) {
+ a += x;
+ }
+ }
+ return a;
+}
+
+/* { dg-final { scan-tree-dump-times " \\* " 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times " goto " 2 "optimized" } } */
@@ -9,7 +9,8 @@ main1 (int len)
int s = 0;
int i = len;
- /* vectorization of induction with reduction. */
+ /* This used to test vectorization of induction with reduction.
+ * It now checks correctness of scev's final value replacement. */
for ( ; i > 1; i -=2)
s += i;
@@ -28,4 +29,3 @@ int main (void)
return 0;
}
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */