tree-optimization/105219 - bogus max iters for vectorized epilogue
The following makes sure to take into account prologue peeling
when trying to narrow down the maximum number of iterations
computed for the epilogue of a vectorized epilogue.
Bootstrap & regtest running on x86_64-unknown-linux-gnu.
I have not yet verified that this fixes the original aarch64 testcase,
but it looks like a simpler fix and explains why I don't see the issue
on the 11 branch, which otherwise performs the same transforms.
Richard.
2022-04-27 Richard Biener <rguenther@suse.de>
PR tree-optimization/105219
* tree-vect-loop.cc (vect_transform_loop): Disable
special code narrowing the vectorized epilogue epilogue
max iterations when peeling for alignment was in effect.
* gcc.dg/vect/pr105219.c: New testcase.
---
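For trying this outside the testsuite, here is a minimal standalone
sketch of the new testcase that does not need tree-vect.h.  The names
fill, buf, count_mismatches and KEEP_OUT_OF_LINE are made up for
illustration and are not part of the patch.  The check that elements
outside the written range stay zero is an addition over pr105219.c.
Build with -O3, plus -mtune=intel on x86 as in the testcase.

```c
#include <string.h>

/* noipa is GCC-specific; fall back to noinline elsewhere (assumption:
   either attribute is enough to keep the loop out of line).  */
#if defined(__GNUC__) && !defined(__clang__)
# define KEEP_OUT_OF_LINE __attribute__((noipa))
#else
# define KEEP_OUT_OF_LINE __attribute__((noinline))
#endif

static int buf[128];

/* Same loop as foo in pr105219.c.  */
KEEP_OUT_OF_LINE void
fill (int *p, int n)
{
  for (int i = 0; i < n; ++i)
    p[i] = i;
}

/* Hypothetical helper: run every (start, n) pair used by the testcase
   and return how many elements of buf hold an unexpected value.  */
int
count_mismatches (void)
{
  int bad = 0;
  for (int start = 0; start < 16; ++start)
    for (int n = 1; n < 3 * 16; ++n)
      {
        memset (buf, 0, sizeof (buf));
        fill (&buf[start], n);
        for (int k = 0; k < 128; ++k)
          {
            /* Inside [start, start + n) expect the loop index,
               everywhere else the zero left by memset.  */
            int inside = k >= start && k < start + n;
            int expect = inside ? k - start : 0;
            if (buf[k] != expect)
              ++bad;
          }
      }
  return bad;
}
```

Calling count_mismatches () from main and aborting on a non-zero
result should give the same pass/fail behavior as the testcase.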
gcc/testsuite/gcc.dg/vect/pr105219.c | 29 ++++++++++++++++++++++++++++
gcc/tree-vect-loop.cc | 2 +-
2 files changed, 30 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.dg/vect/pr105219.c
new file mode 100644
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+/* { dg-additional-options "-O3" } */
+/* { dg-additional-options "-mtune=intel" { target x86_64-*-* i?86-*-* } } */
+
+#include "tree-vect.h"
+
+int data[128];
+
+void __attribute((noipa))
+foo (int *data, int n)
+{
+  for (int i = 0; i < n; ++i)
+    data[i] = i;
+}
+
+int main()
+{
+  check_vect ();
+  for (int start = 0; start < 16; ++start)
+    for (int n = 1; n < 3*16; ++n)
+      {
+        __builtin_memset (data, 0, sizeof (data));
+        foo (&data[start], n);
+        for (int j = 0; j < n; ++j)
+          if (data[start + j] != j)
+            __builtin_abort ();
+      }
+  return 0;
+}
@@ -9977,7 +9977,7 @@ vect_transform_loop (loop_vec_info loop_vinfo, gimple *loop_vectorized_call)
lowest_vf) - 1
: wi::udiv_floor (loop->nb_iterations_upper_bound + bias_for_lowest,
lowest_vf) - 1);
- if (main_vinfo)
+ if (main_vinfo && !main_vinfo->peeling_for_alignment)
{
unsigned int bound;
poly_uint64 main_iters