First refactor of vect_analyze_loop


Commit Message

Richard Biener Oct. 27, 2021, 12:10 p.m. UTC
  This refactors the main loop analysis part in vect_analyze_loop,
re-purposing the existing vect_reanalyze_as_main_loop for this
to reduce code duplication.  Failure flow is a bit tricky since
we want to extract info from the analyzed loop but I wanted to
share the destruction part.  Thus I add some std::function and
lambda to funnel post-analysis for the case we want that
(when analyzing from the main iteration but not when re-analyzing
an epilogue as main).
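
The callback arrangement described above can be sketched in isolation.  This is a minimal standalone illustration, not the code from the patch: analysis_state and analyze_once are hypothetical stand-ins for loop_vec_info and vect_analyze_loop_1, and the "analysis" is a toy that fails only for the mode named "bad".

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>

// Hypothetical stand-in for loop_vec_info; not a GCC type.
struct analysis_state
{
  std::string mode;
  bool ok;
};

// Analyze with MODE, let PROCESS inspect the state even when analysis
// failed, then destroy the state on failure and return null.
std::unique_ptr<analysis_state>
analyze_once (const std::string &mode,
	      std::function<void (const analysis_state &)> process = nullptr)
{
  assert (!mode.empty ());
  // Toy analysis: only the mode called "bad" fails.
  std::unique_ptr<analysis_state> state
    (new analysis_state {mode, mode != "bad"});

  // Funnel post-analysis work before the state may be destroyed.
  if (process)
    process (*state);

  if (!state->ok)
    return nullptr;		// the shared destruction path
  return state;
}
```

A caller that wants to look at a failed analysis passes a lambda; a caller that only cares about the result omits it, which corresponds to the re-analysis-as-main case.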

I realize this probably doesn't help the unroll case yet, but it
looked like an improvement.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

2021-10-27  Richard Biener  <rguenther@suse.de>

	* tree-vect-loop.c: Include <functional>.
	(vect_reanalyze_as_main_loop): Rename to...
	(vect_analyze_loop_1): ... this and generalize to be
	able to use it twice ...
	(vect_analyze_loop): ... here.
---
 gcc/tree-vect-loop.c | 202 ++++++++++++++++++++++---------------------
 1 file changed, 102 insertions(+), 100 deletions(-)
  

Comments

Richard Sandiford Oct. 27, 2021, 4:27 p.m. UTC | #1
Richard Biener <rguenther@suse.de> writes:
> This refactors the main loop analysis part in vect_analyze_loop,
> re-purposing the existing vect_reanalyze_as_main_loop for this
> to reduce code duplication.  Failure flow is a bit tricky since
> we want to extract info from the analyzed loop but I wanted to
> share the destruction part.  Thus I add some std::function and
> lambda to funnel post-analysis for the case we want that
> (when analyzing from the main iteration but not when re-analyzing
> an epilogue as main).

Thanks for cleaning this up.

FWIW, as I mentioned on irc, I think the loop could be simplified quite
a bit if we were prepared to analyse loops both as an epilogue and
(independently) as a main loop.

I think the geology of the code is something like this:

layer 1:
  Original loop that tries fallback vector modes if the autodetected
  one fails.

layer 2:
  Add support for simdlen.  This required continuing after finding
  a match in case a later mode corresponded with the simdlen.

layer 3:
  Add epilogue vinfos.

layer 4:
  Restructure to support layers 5 and 6.

layer 5:
  Add support for multiple vector sizes in a loop.  This needed extra
  code to avoid redundant analysis attempts.

layer 6:
  Add VECT_COMPARE_COSTS (first cut).  At the time this was relatively
  simple [bcc7e346bf9b5dc77797ea949d6adc740deb30ca] since it just meant
  tweaking the “continuing” condition from (2).

  However, a (deliberate) wart was that it only tried treating each
  mode as a replacement for the loop_vinfo at the end of the current
  list (if the main loop is the head of the list and epilogues follow).

  This was supposed to be a compile-time improvement, since it meant
  we still only analysed with each mode once.

layer 7:
  Reanalyze a replacement epilogue loop as a main loop before comparing
  it with the existing main loop.  This prevented a wrong code bug but
  defeated part of the compile-time optimisation from (6).

So it's already necessary to analyse a loop as both an epilogue loop
and a main loop in some cases.

The requirement to analyse loops only once also prevents us from being
able to vectorise the epilogue of an omp simdlen loop, because for
something like -mpreferred-vector-width=256, we'd try AVX256 before
AVX512, even if the simdlen forced AVX512.
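
As a rough illustration of layers 1 and 2 only, the mode loop can be reduced to the following sketch.  It assumes a hypothetical mode_candidate record and a pick_mode helper that do not exist in GCC, and it ignores epilogues and cost comparison entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one candidate vector mode; not a GCC type.
struct mode_candidate
{
  unsigned lanes;	// vectorization factor the mode would give
  bool analyzable;	// whether the toy "analysis" succeeds
};

// Return the index of the chosen mode, or -1 if none works.  With
// SIMDLEN == 0 the first success wins (layer 1).  Otherwise keep
// scanning after a success in case a later mode matches SIMDLEN
// (layer 2), falling back to the first success.
int
pick_mode (const std::vector<mode_candidate> &modes, unsigned simdlen)
{
  int first_ok = -1;
  for (std::size_t i = 0; i < modes.size (); ++i)
    {
      if (!modes[i].analyzable)
	continue;		// try the next fallback mode
      if (first_ok < 0)
	first_ok = static_cast<int> (i);
      if (simdlen == 0)
	break;			// no simdlen: stop at the first match
      if (modes[i].lanes == simdlen)
	return static_cast<int> (i);
    }
  assert (first_ok < 0 || modes[first_ok].analyzable);
  return first_ok;
}
```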

> I realize this probably doesn't help the unroll case yet, but it
> looked like an improvement.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> OK?
>
> Thanks,
> Richard.
>
> 2021-10-27  Richard Biener  <rguenther@suse.de>
>
> 	* tree-vect-loop.c: Include <functional>.
> 	(vect_reanalyze_as_main_loop): Rename to...
> 	(vect_analyze_loop_1): ... this and generalize to be
> 	able to use it twice ...
> 	(vect_analyze_loop): ... here.
> ---
>  gcc/tree-vect-loop.c | 202 ++++++++++++++++++++++---------------------
>  1 file changed, 102 insertions(+), 100 deletions(-)
>
> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> index 961c1623f81..9a62475a69f 100644
> --- a/gcc/tree-vect-loop.c
> +++ b/gcc/tree-vect-loop.c
> @@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.  If not see
>  <http://www.gnu.org/licenses/>.  */
>  
>  #define INCLUDE_ALGORITHM
> +#define INCLUDE_FUNCTIONAL
>  #include "config.h"
>  #include "system.h"
>  #include "coretypes.h"
> @@ -2898,43 +2899,63 @@ vect_joust_loop_vinfos (loop_vec_info new_loop_vinfo,
>    return true;
>  }
>  
> -/* If LOOP_VINFO is already a main loop, return it unmodified.  Otherwise
> -   try to reanalyze it as a main loop.  Return the loop_vinfo on success
> -   and null on failure.  */
> +/* Analyze LOOP with VECTOR_MODE and as epilogue if MAIN_LOOP_VINFO is
> +   not NULL.  Process the analyzed loop with PROCESS even if analysis
> +   failed.  Sets *N_STMTS and FATAL according to the analysis.
> +   Return the loop_vinfo on success and wrapped null on failure.  */
>  
> -static loop_vec_info
> -vect_reanalyze_as_main_loop (loop_vec_info loop_vinfo, unsigned int *n_stmts)
> +static opt_loop_vec_info
> +vect_analyze_loop_1 (class loop *loop, vec_info_shared *shared,
> +		     machine_mode vector_mode, loop_vec_info main_loop_vinfo,
> +		     unsigned int *n_stmts, bool &fatal,
> +		     std::function<void(loop_vec_info)> process = nullptr)
>  {
> -  if (!LOOP_VINFO_EPILOGUE_P (loop_vinfo))
> -    return loop_vinfo;
> +  /* Check the CFG characteristics of the loop (nesting, entry/exit).  */
> +  opt_loop_vec_info loop_vinfo = vect_analyze_loop_form (loop, shared);
> +  if (!loop_vinfo)
> +    {
> +      if (dump_enabled_p ())
> +	dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> +			 "bad loop form.\n");
> +      gcc_checking_assert (main_loop_vinfo == NULL);
> +      return loop_vinfo;
> +    }
> +  loop_vinfo->vector_mode = vector_mode;
>  
> -  if (dump_enabled_p ())
> -    dump_printf_loc (MSG_NOTE, vect_location,
> -		     "***** Reanalyzing as a main loop with vector mode %s\n",
> -		     GET_MODE_NAME (loop_vinfo->vector_mode));
> +  if (main_loop_vinfo)
> +    LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo) = main_loop_vinfo;
>  
> -  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> -  vec_info_shared *shared = loop_vinfo->shared;
> -  opt_loop_vec_info main_loop_vinfo = vect_analyze_loop_form (loop, shared);
> -  gcc_assert (main_loop_vinfo);
> +  /* Run the main analysis.  */
> +  fatal = false;

Think this should be at the top, since we have an early return above.
The early return should be fatal.
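
A small sketch of the ordering being asked for, under the assumption of a toy analyze function with hypothetical form_ok and body_ok inputs in place of the real checks:

```cpp
#include <cassert>

// Toy analysis mirroring the *N_STMTS and FATAL out-parameters.
// FORM_OK and BODY_OK are hypothetical inputs, not GCC names.
bool
analyze (bool form_ok, bool body_ok, unsigned *n_stmts, bool &fatal)
{
  assert (n_stmts != nullptr);

  // Initialise first so every return path leaves the outputs defined.
  fatal = false;
  *n_stmts = 0;

  if (!form_ok)
    {
      fatal = true;	// no other vector mode can fix a bad loop form
      return false;
    }

  *n_stmts = 1;
  return body_ok;	// a non-fatal failure: another mode may still work
}
```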

> +  opt_result res = vect_analyze_loop_2 (loop_vinfo, fatal, n_stmts);
> +  loop->aux = NULL;
>  
> -  main_loop_vinfo->vector_mode = loop_vinfo->vector_mode;
> +  /* Process info before we destroy loop_vinfo upon analysis failure.  */
> +  if (process)
> +    process (loop_vinfo);
>  
> -  bool fatal = false;
> -  bool res = vect_analyze_loop_2 (main_loop_vinfo, fatal, n_stmts);
> -  loop->aux = NULL;
> -  if (!res)
> +  if (dump_enabled_p ())
>      {
> -      if (dump_enabled_p ())
> +      if (res)
>  	dump_printf_loc (MSG_NOTE, vect_location,
> -			 "***** Failed to analyze main loop with vector"
> -			 " mode %s\n",
> +			 "***** Analysis succeeded with vector mode %s\n",
>  			 GET_MODE_NAME (loop_vinfo->vector_mode));
> -      delete main_loop_vinfo;
> -      return NULL;
> +      else
> +	dump_printf_loc (MSG_NOTE, vect_location,
> +			 "***** Analysis failed with vector mode %s\n",
> +			 GET_MODE_NAME (loop_vinfo->vector_mode));
> +    }
> +
> +  if (!res)
> +    {
> +      delete loop_vinfo;
> +      if (fatal)
> +	gcc_checking_assert (main_loop_vinfo == NULL);
> +      return opt_loop_vec_info::propagate_failure (res);
>      }
> -  LOOP_VINFO_VECTORIZABLE_P (main_loop_vinfo) = 1;
> -  return main_loop_vinfo;
> +
> +  LOOP_VINFO_VECTORIZABLE_P (loop_vinfo) = 1;
> +  return loop_vinfo;
>  }
>  
>  /* Function vect_analyze_loop.
> @@ -2981,20 +3002,6 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
>    unsigned HOST_WIDE_INT simdlen = loop->simdlen;
>    while (1)
>      {
> -      /* Check the CFG characteristics of the loop (nesting, entry/exit).  */
> -      opt_loop_vec_info loop_vinfo = vect_analyze_loop_form (loop, shared);
> -      if (!loop_vinfo)
> -	{
> -	  if (dump_enabled_p ())
> -	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> -			     "bad loop form.\n");
> -	  gcc_checking_assert (first_loop_vinfo == NULL);
> -	  return loop_vinfo;
> -	}
> -      loop_vinfo->vector_mode = next_vector_mode;
> -
> -      bool fatal = false;
> -
>        /* When pick_lowest_cost_p is true, we should in principle iterate
>  	 over all the loop_vec_infos that LOOP_VINFO could replace and
>  	 try to vectorize LOOP_VINFO under the same conditions.
> @@ -3023,41 +3030,35 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
>  	 LOOP_VINFO fails when treated as an epilogue loop, succeeds when
>  	 treated as a standalone loop, and ends up being genuinely cheaper
>  	 than FIRST_LOOP_VINFO.  */
> -      if (vect_epilogues)
> -	LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo) = first_loop_vinfo;
>  
> -      res = vect_analyze_loop_2 (loop_vinfo, fatal, &n_stmts);
> -      if (mode_i == 0)
> -	autodetected_vector_mode = loop_vinfo->vector_mode;
> -      if (dump_enabled_p ())
> +      bool fatal;
> +      auto cb = [&] (loop_vec_info loop_vinfo)
>  	{
> -	  if (res)
> -	    dump_printf_loc (MSG_NOTE, vect_location,
> -			     "***** Analysis succeeded with vector mode %s\n",
> -			     GET_MODE_NAME (loop_vinfo->vector_mode));
> -	  else
> -	    dump_printf_loc (MSG_NOTE, vect_location,
> -			     "***** Analysis failed with vector mode %s\n",
> -			     GET_MODE_NAME (loop_vinfo->vector_mode));
> -	}
> -
> -      loop->aux = NULL;
> -
> -      if (!fatal)
> -	while (mode_i < vector_modes.length ()
> -	       && vect_chooses_same_modes_p (loop_vinfo, vector_modes[mode_i]))
> -	  {
> -	    if (dump_enabled_p ())
> -	      dump_printf_loc (MSG_NOTE, vect_location,
> -			       "***** The result for vector mode %s would"
> -			       " be the same\n",
> -			       GET_MODE_NAME (vector_modes[mode_i]));
> -	    mode_i += 1;
> -	  }
> +	  if (mode_i ==0)
> +	    autodetected_vector_mode = loop_vinfo->vector_mode;
> +	  if (!fatal)
> +	    while (mode_i < vector_modes.length ()
> +		   && vect_chooses_same_modes_p (loop_vinfo,
> +						 vector_modes[mode_i]))
> +	      {
> +		if (dump_enabled_p ())
> +		  dump_printf_loc (MSG_NOTE, vect_location,
> +				   "***** The result for vector mode %s would"
> +				   " be the same\n",
> +				   GET_MODE_NAME (vector_modes[mode_i]));
> +		mode_i += 1;
> +	      }
> +	};
> +      opt_loop_vec_info loop_vinfo
> +	= vect_analyze_loop_1 (loop, shared, next_vector_mode,
> +			       vect_epilogues
> +			       ? (loop_vec_info)first_loop_vinfo : NULL,
> +			       &n_stmts, fatal, cb);
> +      if (fatal)
> +	break;
>  
> -      if (res)
> +      if (loop_vinfo)
>  	{
> -	  LOOP_VINFO_VECTORIZABLE_P (loop_vinfo) = 1;
>  	  vectorized_loops++;
>  
>  	  /* Once we hit the desired simdlen for the first time,
> @@ -3084,33 +3085,44 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
>  	      if (vinfos.is_empty ()
>  		  && vect_joust_loop_vinfos (loop_vinfo, first_loop_vinfo))
>  		{
> -		  loop_vec_info main_loop_vinfo
> -		    = vect_reanalyze_as_main_loop (loop_vinfo, &n_stmts);
> -		  if (main_loop_vinfo == loop_vinfo)
> -		    {
> -		      delete first_loop_vinfo;
> -		      first_loop_vinfo = opt_loop_vec_info::success (NULL);
> -		    }
> -		  else if (main_loop_vinfo
> -			   && vect_joust_loop_vinfos (main_loop_vinfo,
> -						      first_loop_vinfo))
> +		  if (!vect_epilogues)

!vect_epilogues is correct for current uses, but I think the original
!LOOP_VINFO_EPILOGUE_P (loop_vinfo) was more general.  As mentioned above,
in principle there's no reason why we couldn't reanalyse a loop as a
main loop if we fail to analyse it as an epilogue.

Richard

>  		    {
>  		      delete first_loop_vinfo;
>  		      first_loop_vinfo = opt_loop_vec_info::success (NULL);
> -		      delete loop_vinfo;
> -		      loop_vinfo
> -			= opt_loop_vec_info::success (main_loop_vinfo);
>  		    }
>  		  else
>  		    {
>  		      if (dump_enabled_p ())
>  			dump_printf_loc (MSG_NOTE, vect_location,
> -					 "***** No longer preferring vector"
> -					 " mode %s after reanalyzing the loop"
> -					 " as a main loop\n",
> +					 "***** Reanalyzing as a main loop "
> +					 "with vector mode %s\n",
>  					 GET_MODE_NAME
> -					   (main_loop_vinfo->vector_mode));
> -		      delete main_loop_vinfo;
> +					   (loop_vinfo->vector_mode));
> +		      opt_loop_vec_info main_loop_vinfo
> +			= vect_analyze_loop_1 (loop, shared,
> +					       loop_vinfo->vector_mode,
> +					       NULL, &n_stmts, fatal);
> +		      if (main_loop_vinfo
> +			  && vect_joust_loop_vinfos (main_loop_vinfo,
> +						     first_loop_vinfo))
> +			{
> +			  delete first_loop_vinfo;
> +			  first_loop_vinfo = opt_loop_vec_info::success (NULL);
> +			  delete loop_vinfo;
> +			  loop_vinfo
> +			    = opt_loop_vec_info::success (main_loop_vinfo);
> +			}
> +		      else
> +			{
> +			  if (dump_enabled_p ())
> +			    dump_printf_loc (MSG_NOTE, vect_location,
> +					     "***** No longer preferring vector"
> +					     " mode %s after reanalyzing the "
> +					     " loop as a main loop\n",
> +					     GET_MODE_NAME
> +					       (loop_vinfo->vector_mode));
> +			  delete main_loop_vinfo;
> +			}
>  		    }
>  		}
>  	    }
> @@ -3159,16 +3171,6 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
>  	  if (!simdlen && !vect_epilogues && !pick_lowest_cost_p)
>  	    break;
>  	}
> -      else
> -	{
> -	  delete loop_vinfo;
> -	  loop_vinfo = opt_loop_vec_info::success (NULL);
> -	  if (fatal)
> -	    {
> -	      gcc_checking_assert (first_loop_vinfo == NULL);
> -	      break;
> -	    }
> -	}
>  
>        /* Handle the case that the original loop can use partial
>  	 vectorization, but want to only adopt it for the epilogue.
  
Richard Biener Nov. 4, 2021, 12:12 p.m. UTC | #2
On Wed, 27 Oct 2021, Richard Sandiford wrote:

> Richard Biener <rguenther@suse.de> writes:
> > This refactors the main loop analysis part in vect_analyze_loop,
> > re-purposing the existing vect_reanalyze_as_main_loop for this
> > to reduce code duplication.  Failure flow is a bit tricky since
> > we want to extract info from the analyzed loop but I wanted to
> > share the destruction part.  Thus I add some std::function and
> > lambda to funnel post-analysis for the case we want that
> > (when analyzing from the main iteration but not when re-analyzing
> > an epilogue as main).
> 
> Thanks for cleaning this up.
> 
> FWIW, as I mentioned on irc, I think the loop could be simplified quite
> a bit if we were prepared to analyse loops both as an epilogue and
> (independently) as a main loop.
> 
> I think the geology of the code is something like this:
> 
> layer 1:
>   Original loop that tries fallback vector modes if the autodetected
>   one fails.
> 
> layer 2:
>   Add support for simdlen.  This required continuing after finding
>   a match in case a later mode corresponded with the simdlen.
> 
> layer 3:
>   Add epilogue vinfos.
> 
> layer 4:
>   Restructure to support layers 5 and 6.
> 
> layer 5:
>   Add support for multiple vector sizes in a loop.  This needed extra
>   code to avoid redundant analysis attempts.
> 
> layer 6:
>   Add VECT_COMPARE_COSTS (first cut).  At the time this was relatively
>   simple [bcc7e346bf9b5dc77797ea949d6adc740deb30ca] since it just meant
>   tweaking the “continuing” condition from (2).
> 
>   However, a (deliberate) wart was that it only tried treating each
>   mode as a replacement for the loop_vinfo at the end of the current
>   list (if the main loop is the head of the list and epilogues follow).
> 
>   This was supposed to be a compile-time improvement, since it meant
>   we still only analysed with each mode once.
> 
> layer 7:
>   Reanalyze a replacement epilogue loop as a main loop before comparing
>   it with the existing main loop.  This prevented a wrong code bug but
>   defeated part of the compile-time optimisation from (6).
> 
> So it's already necessary to analyse a loop as both an epilogue loop
> and a main loop in some cases.
> 
> The requirement to analyse loops only once also prevents us from being
> able to vectorise the epilogue of an omp simdlen loop, because for
> something like -mpreferred-vector-width=256, we'd try AVX256 before
> AVX512, even if the simdlen forced AVX512.
> 
> > I realize this probably doesn't help the unroll case yet, but it
> > looked like an improvement.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >
> > OK?
> >
> > Thanks,
> > Richard.
> >
> > 2021-10-27  Richard Biener  <rguenther@suse.de>
> >
> > 	* tree-vect-loop.c: Include <functional>.
> > 	(vect_reanalyze_as_main_loop): Rename to...
> > 	(vect_analyze_loop_1): ... this and generalize to be
> > 	able to use it twice ...
> > 	(vect_analyze_loop): ... here.
> > ---
> >  gcc/tree-vect-loop.c | 202 ++++++++++++++++++++++---------------------
> >  1 file changed, 102 insertions(+), 100 deletions(-)
> >
> > diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> > index 961c1623f81..9a62475a69f 100644
> > --- a/gcc/tree-vect-loop.c
> > +++ b/gcc/tree-vect-loop.c
> > @@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.  If not see
> >  <http://www.gnu.org/licenses/>.  */
> >  
> >  #define INCLUDE_ALGORITHM
> > +#define INCLUDE_FUNCTIONAL
> >  #include "config.h"
> >  #include "system.h"
> >  #include "coretypes.h"
> > @@ -2898,43 +2899,63 @@ vect_joust_loop_vinfos (loop_vec_info new_loop_vinfo,
> >    return true;
> >  }
> >  
> > -/* If LOOP_VINFO is already a main loop, return it unmodified.  Otherwise
> > -   try to reanalyze it as a main loop.  Return the loop_vinfo on success
> > -   and null on failure.  */
> > +/* Analyze LOOP with VECTOR_MODE and as epilogue if MAIN_LOOP_VINFO is
> > +   not NULL.  Process the analyzed loop with PROCESS even if analysis
> > +   failed.  Sets *N_STMTS and FATAL according to the analysis.
> > +   Return the loop_vinfo on success and wrapped null on failure.  */
> >  
> > -static loop_vec_info
> > -vect_reanalyze_as_main_loop (loop_vec_info loop_vinfo, unsigned int *n_stmts)
> > +static opt_loop_vec_info
> > +vect_analyze_loop_1 (class loop *loop, vec_info_shared *shared,
> > +		     machine_mode vector_mode, loop_vec_info main_loop_vinfo,
> > +		     unsigned int *n_stmts, bool &fatal,
> > +		     std::function<void(loop_vec_info)> process = nullptr)
> >  {
> > -  if (!LOOP_VINFO_EPILOGUE_P (loop_vinfo))
> > -    return loop_vinfo;
> > +  /* Check the CFG characteristics of the loop (nesting, entry/exit).  */
> > +  opt_loop_vec_info loop_vinfo = vect_analyze_loop_form (loop, shared);
> > +  if (!loop_vinfo)
> > +    {
> > +      if (dump_enabled_p ())
> > +	dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > +			 "bad loop form.\n");
> > +      gcc_checking_assert (main_loop_vinfo == NULL);
> > +      return loop_vinfo;
> > +    }
> > +  loop_vinfo->vector_mode = vector_mode;
> >  
> > -  if (dump_enabled_p ())
> > -    dump_printf_loc (MSG_NOTE, vect_location,
> > -		     "***** Reanalyzing as a main loop with vector mode %s\n",
> > -		     GET_MODE_NAME (loop_vinfo->vector_mode));
> > +  if (main_loop_vinfo)
> > +    LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo) = main_loop_vinfo;
> >  
> > -  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> > -  vec_info_shared *shared = loop_vinfo->shared;
> > -  opt_loop_vec_info main_loop_vinfo = vect_analyze_loop_form (loop, shared);
> > -  gcc_assert (main_loop_vinfo);
> > +  /* Run the main analysis.  */
> > +  fatal = false;
> 
> Think this should be at the top, since we have an early return above.
> The early return should be fatal.

Indeed.  I've instead split vect_analyze_loop_form and moved the call
out; the part that can fail should only be required once for each loop.

[...]

> > +		  if (!vect_epilogues)
> 
> !vect_epilogues is correct for current uses, but I think the original
> !LOOP_VINFO_EPILOGUE_P (loop_vinfo) was more general.  As mentioned above,
> in principle there's no reason why we couldn't reanalyse a loop as a
> main loop if we fail to analyse it as an epilogue.

OK, restored that.

The following is mainly the original reorg with the additional
refactoring of vect_analyze_loop_form.  I think I'll put this in
before rewriting the main iteration to first only analyze main
loops (and then possibly unrolled main loops) and only after
settling for the cheapest main loop consider epilogue
vectorization.

As you said, the original approach of saving extra analyses by
using the epilogue analysis as the main loop analysis became moot,
and with partial vectors the re-analysis as epilogue wasn't
re-usable anyway.  What we could eventually remember is the
modes that fail vectorization; those will likely not succeed
when analyzed in epilogue context either.
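
Remembering failed modes could look roughly like the sketch below.  The failed_modes class is purely hypothetical and keys on mode names as strings for simplicity; nothing like it is part of this patch.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical cache of modes whose main-loop analysis failed.
class failed_modes
{
public:
  // Remember that analysis with MODE failed.
  void record (const std::string &mode)
  {
    assert (!mode.empty ());
    m_failed.insert (mode);
  }

  // True if MODE already failed, so an epilogue attempt can be skipped.
  bool skip_p (const std::string &mode) const
  {
    return m_failed.count (mode) != 0;
  }

private:
  std::set<std::string> m_failed;
};
```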

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Since this is a refactoring that should not change behavior,
whereas re-organizing the analysis loop might, I'd like to put
this onto trunk as an intermediate step.  Is that OK?

Thanks,
Richard.

From 3a4eeb74ee13506a0014d6cc3a3a5de7e6f49532 Mon Sep 17 00:00:00 2001
From: Richard Biener <rguenther@suse.de>
Date: Wed, 27 Oct 2021 13:14:41 +0200
Subject: [PATCH] First refactor of vect_analyze_loop
To: gcc-patches@gcc.gnu.org

This refactors the main loop analysis part in vect_analyze_loop,
re-purposing the existing vect_reanalyze_as_main_loop for this
to reduce code duplication.  Failure flow is a bit tricky since
we want to extract info from the analyzed loop but I wanted to
share the destruction part.  Thus I add some std::function and
lambda to funnel post-analysis for the case we want that
(when analyzing from the main iteration but not when re-analyzing
an epilogue as main).

In addition I split vect_analyze_loop_form into analysis and
vinfo creation so we can do the analysis only once, simplifying
the new vect_analyze_loop_1.
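
The shape of that split, reduced to a standalone sketch: form_info, attempt, analyze_form and create_attempt are hypothetical names that only loosely mirror vect_loop_form_info, loop_vec_info, vect_analyze_loop_form and vect_create_loop_vinfo, and the trip-count check is a toy stand-in for the real form checks.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical result of the one-off form analysis.
struct form_info
{
  long niters = 0;
};

// Hypothetical per-attempt state, cheap to create from a form_info.
struct attempt
{
  long niters;
  std::string mode;
};

// The part that can fail: run once per loop.
bool
analyze_form (long trip_count, form_info *info)
{
  if (trip_count <= 0)
    return false;		// e.g. iteration count not computable
  info->niters = trip_count;
  return true;
}

// The part that cannot fail: run once per candidate mode.
std::unique_ptr<attempt>
create_attempt (const form_info &info, const std::string &mode)
{
  assert (info.niters > 0);	// analyze_form must have succeeded
  return std::unique_ptr<attempt> (new attempt {info.niters, mode});
}
```

With this arrangement the per-mode function no longer needs a failure path for the form check, which is what simplifies the caller.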

As discussed we probably want to change the loop over vector
modes to first only analyze things as the main loop, picking
the best (or simd VF) mode for the main loop and then analyze
for a vectorized epilogue.  The unroll would then integrate
with the main loop vectorization.  I think that currently
we may fail to analyze the epilogue with the same mode as
the main loop when using partial vectors since we increment
mode_i before doing that.
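
One possible reading of that two-phase scheme, as a toy sketch: mode_result, plan and choose_modes are hypothetical, the cost is a single number, and "first analysable mode wins" for the epilogue is an assumed policy rather than anything decided in this thread.  The point it shows is that the epilogue phase revisits every mode, including the one chosen for the main loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical summary of analysing one vector mode; not a GCC type.
struct mode_result
{
  unsigned cost;	// toy cost when used as the main loop
  bool main_ok;		// analysis as a main loop succeeded
  bool epilogue_ok;	// analysis as an epilogue succeeded
};

struct plan
{
  int main_mode = -1;		// index into the mode list, -1 if none
  int epilogue_mode = -1;
};

// Phase 1 settles on the cheapest main loop.  Phase 2 then considers
// every mode for the epilogue, including the main loop's own mode.
plan
choose_modes (const std::vector<mode_result> &modes)
{
  plan p;
  for (std::size_t i = 0; i < modes.size (); ++i)
    if (modes[i].main_ok
	&& (p.main_mode < 0 || modes[i].cost < modes[p.main_mode].cost))
      p.main_mode = static_cast<int> (i);

  if (p.main_mode < 0)
    return p;			// nothing to vectorise, so no epilogue
  assert (modes[p.main_mode].main_ok);

  for (std::size_t i = 0; i < modes.size (); ++i)
    if (modes[i].epilogue_ok)
      {
	p.epilogue_mode = static_cast<int> (i);
	break;			// toy policy: first analysable mode wins
      }
  return p;
}
```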

2021-11-04  Richard Biener  <rguenther@suse.de>

	* tree-vectorizer.h (struct vect_loop_form_info): New.
	(vect_analyze_loop_form): Adjust.
	(vect_create_loop_vinfo): New.
	* tree-parloops.c (gather_scalar_reductions): Adjust for
	vect_analyze_loop_form API change.
	* tree-vect-loop.c: Include <functional>.
	(vect_analyze_loop_form_1): Rename to vect_analyze_loop_form,
	take struct vect_loop_form_info as output parameter and adjust.
	(vect_analyze_loop_form): Rename to vect_create_loop_vinfo and
	split out call to the original vect_analyze_loop_form_1.
	(vect_reanalyze_as_main_loop): Rename to...
	(vect_analyze_loop_1): ... this, factor out the call to
	vect_analyze_loop_form and generalize to be able to use it twice ...
	(vect_analyze_loop): ... here.  Perform vect_analyze_loop_form
	once only and here.
---
 gcc/tree-parloops.c   |  11 +-
 gcc/tree-vect-loop.c  | 290 +++++++++++++++++++++---------------------
 gcc/tree-vectorizer.h |  13 +-
 3 files changed, 160 insertions(+), 154 deletions(-)

diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 5e64d5ed7a3..96932c70336 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -3298,10 +3298,11 @@ gather_scalar_reductions (loop_p loop, reduction_info_table_type *reduction_list
   auto_vec<gimple *, 4> double_reduc_stmts;
 
   vec_info_shared shared;
-  simple_loop_info = vect_analyze_loop_form (loop, &shared);
-  if (simple_loop_info == NULL)
+  vect_loop_form_info info;
+  if (!vect_analyze_loop_form (loop, &info))
     goto gather_done;
 
+  simple_loop_info = vect_create_loop_vinfo (loop, &shared, &info);
   for (gsi = gsi_start_phis (loop->header); !gsi_end_p (gsi); gsi_next (&gsi))
     {
       gphi *phi = gsi.phi ();
@@ -3339,9 +3340,11 @@ gather_scalar_reductions (loop_p loop, reduction_info_table_type *reduction_list
   if (!double_reduc_phis.is_empty ())
     {
       vec_info_shared shared;
-      simple_loop_info = vect_analyze_loop_form (loop->inner, &shared);
-      if (simple_loop_info)
+      vect_loop_form_info info;
+      if (vect_analyze_loop_form (loop->inner, &info))
 	{
+	  simple_loop_info
+	    = vect_create_loop_vinfo (loop->inner, &shared, &info);
 	  gphi *phi;
 	  unsigned int i;
 
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 961c1623f81..83826d4a6b8 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #define INCLUDE_ALGORITHM
+#define INCLUDE_FUNCTIONAL
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -1309,7 +1310,7 @@ vect_compute_single_scalar_iteration_cost (loop_vec_info loop_vinfo)
 }
 
 
-/* Function vect_analyze_loop_form_1.
+/* Function vect_analyze_loop_form.
 
    Verify that certain CFG restrictions hold, including:
    - the loop has a pre-header
@@ -1319,9 +1320,7 @@ vect_compute_single_scalar_iteration_cost (loop_vec_info loop_vinfo)
      niter could be analyzed under some assumptions.  */
 
 opt_result
-vect_analyze_loop_form_1 (class loop *loop, gcond **loop_cond,
-			  tree *assumptions, tree *number_of_iterationsm1,
-			  tree *number_of_iterations, gcond **inner_loop_cond)
+vect_analyze_loop_form (class loop *loop, vect_loop_form_info *info)
 {
   DUMP_VECT_SCOPE ("vect_analyze_loop_form");
 
@@ -1329,6 +1328,7 @@ vect_analyze_loop_form_1 (class loop *loop, gcond **loop_cond,
      vs. an outer (nested) loop.
      (FORNOW. May want to relax some of these restrictions in the future).  */
 
+  info->inner_loop_cond = NULL;
   if (!loop->inner)
     {
       /* Inner-most loop.  We currently require that the number of BBs is
@@ -1393,11 +1393,8 @@ vect_analyze_loop_form_1 (class loop *loop, gcond **loop_cond,
 				       " unsupported outerloop form.\n");
 
       /* Analyze the inner-loop.  */
-      tree inner_niterm1, inner_niter, inner_assumptions;
-      opt_result res
-	= vect_analyze_loop_form_1 (loop->inner, inner_loop_cond,
-				    &inner_assumptions, &inner_niterm1,
-				    &inner_niter, NULL);
+      vect_loop_form_info inner;
+      opt_result res = vect_analyze_loop_form (loop->inner, &inner);
       if (!res)
 	{
 	  if (dump_enabled_p ())
@@ -1408,11 +1405,11 @@ vect_analyze_loop_form_1 (class loop *loop, gcond **loop_cond,
 
       /* Don't support analyzing niter under assumptions for inner
 	 loop.  */
-      if (!integer_onep (inner_assumptions))
+      if (!integer_onep (inner.assumptions))
 	return opt_result::failure_at (vect_location,
 				       "not vectorized: Bad inner loop.\n");
 
-      if (!expr_invariant_in_loop_p (loop, inner_niter))
+      if (!expr_invariant_in_loop_p (loop, inner.number_of_iterations))
 	return opt_result::failure_at (vect_location,
 				       "not vectorized: inner-loop count not"
 				       " invariant.\n");
@@ -1420,6 +1417,7 @@ vect_analyze_loop_form_1 (class loop *loop, gcond **loop_cond,
       if (dump_enabled_p ())
         dump_printf_loc (MSG_NOTE, vect_location,
 			 "Considering outer-loop vectorization.\n");
+      info->inner_loop_cond = inner.loop_cond;
     }
 
   if (!single_exit (loop))
@@ -1446,48 +1444,42 @@ vect_analyze_loop_form_1 (class loop *loop, gcond **loop_cond,
 				   "not vectorized:"
 				   " abnormal loop exit edge.\n");
 
-  *loop_cond = vect_get_loop_niters (loop, assumptions, number_of_iterations,
-				     number_of_iterationsm1);
-  if (!*loop_cond)
+  info->loop_cond
+    = vect_get_loop_niters (loop, &info->assumptions,
+			    &info->number_of_iterations,
+			    &info->number_of_iterationsm1);
+  if (!info->loop_cond)
     return opt_result::failure_at
       (vect_location,
        "not vectorized: complicated exit condition.\n");
 
-  if (integer_zerop (*assumptions)
-      || !*number_of_iterations
-      || chrec_contains_undetermined (*number_of_iterations))
+  if (integer_zerop (info->assumptions)
+      || !info->number_of_iterations
+      || chrec_contains_undetermined (info->number_of_iterations))
     return opt_result::failure_at
-      (*loop_cond,
+      (info->loop_cond,
        "not vectorized: number of iterations cannot be computed.\n");
 
-  if (integer_zerop (*number_of_iterations))
+  if (integer_zerop (info->number_of_iterations))
     return opt_result::failure_at
-      (*loop_cond,
+      (info->loop_cond,
        "not vectorized: number of iterations = 0.\n");
 
   return opt_result::success ();
 }
 
-/* Analyze LOOP form and return a loop_vec_info if it is of suitable form.  */
+/* Create a loop_vec_info for LOOP with SHARED and the
+   vect_analyze_loop_form result.  */
 
-opt_loop_vec_info
-vect_analyze_loop_form (class loop *loop, vec_info_shared *shared)
+loop_vec_info
+vect_create_loop_vinfo (class loop *loop, vec_info_shared *shared,
+			const vect_loop_form_info *info)
 {
-  tree assumptions, number_of_iterations, number_of_iterationsm1;
-  gcond *loop_cond, *inner_loop_cond = NULL;
-
-  opt_result res
-    = vect_analyze_loop_form_1 (loop, &loop_cond,
-				&assumptions, &number_of_iterationsm1,
-				&number_of_iterations, &inner_loop_cond);
-  if (!res)
-    return opt_loop_vec_info::propagate_failure (res);
-
   loop_vec_info loop_vinfo = new _loop_vec_info (loop, shared);
-  LOOP_VINFO_NITERSM1 (loop_vinfo) = number_of_iterationsm1;
-  LOOP_VINFO_NITERS (loop_vinfo) = number_of_iterations;
-  LOOP_VINFO_NITERS_UNCHANGED (loop_vinfo) = number_of_iterations;
-  if (!integer_onep (assumptions))
+  LOOP_VINFO_NITERSM1 (loop_vinfo) = info->number_of_iterationsm1;
+  LOOP_VINFO_NITERS (loop_vinfo) = info->number_of_iterations;
+  LOOP_VINFO_NITERS_UNCHANGED (loop_vinfo) = info->number_of_iterations;
+  if (!integer_onep (info->assumptions))
     {
       /* We consider to vectorize this loop by versioning it under
 	 some assumptions.  In order to do this, we need to clear
@@ -1498,7 +1490,7 @@ vect_analyze_loop_form (class loop *loop, vec_info_shared *shared)
 	 analysis are done under the assumptions.  */
       loop_constraint_set (loop, LOOP_C_FINITE);
       /* Also record the assumptions for versioning.  */
-      LOOP_VINFO_NITERS_ASSUMPTIONS (loop_vinfo) = assumptions;
+      LOOP_VINFO_NITERS_ASSUMPTIONS (loop_vinfo) = info->assumptions;
     }
 
   if (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
@@ -1507,17 +1499,17 @@ vect_analyze_loop_form (class loop *loop, vec_info_shared *shared)
         {
           dump_printf_loc (MSG_NOTE, vect_location,
 			   "Symbolic number of iterations is ");
-	  dump_generic_expr (MSG_NOTE, TDF_DETAILS, number_of_iterations);
+	  dump_generic_expr (MSG_NOTE, TDF_DETAILS, info->number_of_iterations);
           dump_printf (MSG_NOTE, "\n");
         }
     }
 
-  stmt_vec_info loop_cond_info = loop_vinfo->lookup_stmt (loop_cond);
+  stmt_vec_info loop_cond_info = loop_vinfo->lookup_stmt (info->loop_cond);
   STMT_VINFO_TYPE (loop_cond_info) = loop_exit_ctrl_vec_info_type;
-  if (inner_loop_cond)
+  if (info->inner_loop_cond)
     {
       stmt_vec_info inner_loop_cond_info
-	= loop_vinfo->lookup_stmt (inner_loop_cond);
+	= loop_vinfo->lookup_stmt (info->inner_loop_cond);
       STMT_VINFO_TYPE (inner_loop_cond_info) = loop_exit_ctrl_vec_info_type;
       /* If we have an estimate on the number of iterations of the inner
 	 loop use that to limit the scale for costing, otherwise use
@@ -1530,7 +1522,7 @@ vect_analyze_loop_form (class loop *loop, vec_info_shared *shared)
 
   gcc_assert (!loop->aux);
   loop->aux = loop_vinfo;
-  return opt_loop_vec_info::success (loop_vinfo);
+  return loop_vinfo;
 }
 
 
@@ -2898,43 +2890,56 @@ vect_joust_loop_vinfos (loop_vec_info new_loop_vinfo,
   return true;
 }
 
-/* If LOOP_VINFO is already a main loop, return it unmodified.  Otherwise
-   try to reanalyze it as a main loop.  Return the loop_vinfo on success
-   and null on failure.  */
-
-static loop_vec_info
-vect_reanalyze_as_main_loop (loop_vec_info loop_vinfo, unsigned int *n_stmts)
+/* Analyze LOOP with VECTOR_MODE and as epilogue if MAIN_LOOP_VINFO is
+   not NULL.  Process the analyzed loop with PROCESS even if analysis
+   failed.  Sets *N_STMTS and FATAL according to the analysis.
+   Return the loop_vinfo on success and wrapped null on failure.  */
+
+static opt_loop_vec_info
+vect_analyze_loop_1 (class loop *loop, vec_info_shared *shared,
+		     const vect_loop_form_info *loop_form_info,
+		     machine_mode vector_mode, loop_vec_info main_loop_vinfo,
+		     unsigned int *n_stmts, bool &fatal,
+		     std::function<void(loop_vec_info)> process = nullptr)
 {
-  if (!LOOP_VINFO_EPILOGUE_P (loop_vinfo))
-    return loop_vinfo;
+  loop_vec_info loop_vinfo
+    = vect_create_loop_vinfo (loop, shared, loop_form_info);
+  loop_vinfo->vector_mode = vector_mode;
 
-  if (dump_enabled_p ())
-    dump_printf_loc (MSG_NOTE, vect_location,
-		     "***** Reanalyzing as a main loop with vector mode %s\n",
-		     GET_MODE_NAME (loop_vinfo->vector_mode));
+  if (main_loop_vinfo)
+    LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo) = main_loop_vinfo;
 
-  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-  vec_info_shared *shared = loop_vinfo->shared;
-  opt_loop_vec_info main_loop_vinfo = vect_analyze_loop_form (loop, shared);
-  gcc_assert (main_loop_vinfo);
+  /* Run the main analysis.  */
+  fatal = false;
+  opt_result res = vect_analyze_loop_2 (loop_vinfo, fatal, n_stmts);
+  loop->aux = NULL;
 
-  main_loop_vinfo->vector_mode = loop_vinfo->vector_mode;
+  /* Process info before we destroy loop_vinfo upon analysis failure.  */
+  if (process)
+    process (loop_vinfo);
 
-  bool fatal = false;
-  bool res = vect_analyze_loop_2 (main_loop_vinfo, fatal, n_stmts);
-  loop->aux = NULL;
-  if (!res)
+  if (dump_enabled_p ())
     {
-      if (dump_enabled_p ())
+      if (res)
 	dump_printf_loc (MSG_NOTE, vect_location,
-			 "***** Failed to analyze main loop with vector"
-			 " mode %s\n",
+			 "***** Analysis succeeded with vector mode %s\n",
+			 GET_MODE_NAME (loop_vinfo->vector_mode));
+      else
+	dump_printf_loc (MSG_NOTE, vect_location,
+			 "***** Analysis failed with vector mode %s\n",
 			 GET_MODE_NAME (loop_vinfo->vector_mode));
-      delete main_loop_vinfo;
-      return NULL;
     }
-  LOOP_VINFO_VECTORIZABLE_P (main_loop_vinfo) = 1;
-  return main_loop_vinfo;
+
+  if (!res)
+    {
+      delete loop_vinfo;
+      if (fatal)
+	gcc_checking_assert (main_loop_vinfo == NULL);
+      return opt_loop_vec_info::propagate_failure (res);
+    }
+
+  LOOP_VINFO_VECTORIZABLE_P (loop_vinfo) = 1;
+  return opt_loop_vec_info::success (loop_vinfo);
 }
 
 /* Function vect_analyze_loop.
@@ -2967,34 +2972,29 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
        "not vectorized: loop nest containing two or more consecutive inner"
        " loops cannot be vectorized\n");
 
+  /* Analyze the loop form.  */
+  vect_loop_form_info loop_form_info;
+  opt_result res = vect_analyze_loop_form (loop, &loop_form_info);
+  if (!res)
+    {
+      if (dump_enabled_p ())
+	dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+			 "bad loop form.\n");
+      return opt_loop_vec_info::propagate_failure (res);
+    }
+
   unsigned n_stmts = 0;
   machine_mode autodetected_vector_mode = VOIDmode;
   opt_loop_vec_info first_loop_vinfo = opt_loop_vec_info::success (NULL);
   machine_mode next_vector_mode = VOIDmode;
   poly_uint64 lowest_th = 0;
-  unsigned vectorized_loops = 0;
   bool pick_lowest_cost_p = ((autovec_flags & VECT_COMPARE_COSTS)
 			     && !unlimited_cost_model (loop));
 
   bool vect_epilogues = false;
-  opt_result res = opt_result::success ();
   unsigned HOST_WIDE_INT simdlen = loop->simdlen;
   while (1)
     {
-      /* Check the CFG characteristics of the loop (nesting, entry/exit).  */
-      opt_loop_vec_info loop_vinfo = vect_analyze_loop_form (loop, shared);
-      if (!loop_vinfo)
-	{
-	  if (dump_enabled_p ())
-	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-			     "bad loop form.\n");
-	  gcc_checking_assert (first_loop_vinfo == NULL);
-	  return loop_vinfo;
-	}
-      loop_vinfo->vector_mode = next_vector_mode;
-
-      bool fatal = false;
-
       /* When pick_lowest_cost_p is true, we should in principle iterate
 	 over all the loop_vec_infos that LOOP_VINFO could replace and
 	 try to vectorize LOOP_VINFO under the same conditions.
@@ -3023,43 +3023,36 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
 	 LOOP_VINFO fails when treated as an epilogue loop, succeeds when
 	 treated as a standalone loop, and ends up being genuinely cheaper
 	 than FIRST_LOOP_VINFO.  */
-      if (vect_epilogues)
-	LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo) = first_loop_vinfo;
 
-      res = vect_analyze_loop_2 (loop_vinfo, fatal, &n_stmts);
-      if (mode_i == 0)
-	autodetected_vector_mode = loop_vinfo->vector_mode;
-      if (dump_enabled_p ())
+      bool fatal;
+      auto cb = [&] (loop_vec_info loop_vinfo)
 	{
-	  if (res)
-	    dump_printf_loc (MSG_NOTE, vect_location,
-			     "***** Analysis succeeded with vector mode %s\n",
-			     GET_MODE_NAME (loop_vinfo->vector_mode));
-	  else
-	    dump_printf_loc (MSG_NOTE, vect_location,
-			     "***** Analysis failed with vector mode %s\n",
-			     GET_MODE_NAME (loop_vinfo->vector_mode));
-	}
-
-      loop->aux = NULL;
-
-      if (!fatal)
-	while (mode_i < vector_modes.length ()
-	       && vect_chooses_same_modes_p (loop_vinfo, vector_modes[mode_i]))
-	  {
-	    if (dump_enabled_p ())
-	      dump_printf_loc (MSG_NOTE, vect_location,
-			       "***** The result for vector mode %s would"
-			       " be the same\n",
-			       GET_MODE_NAME (vector_modes[mode_i]));
-	    mode_i += 1;
-	  }
+	  if (mode_i ==0)
+	    autodetected_vector_mode = loop_vinfo->vector_mode;
+	  if (!fatal)
+	    while (mode_i < vector_modes.length ()
+		   && vect_chooses_same_modes_p (loop_vinfo,
+						 vector_modes[mode_i]))
+	      {
+		if (dump_enabled_p ())
+		  dump_printf_loc (MSG_NOTE, vect_location,
+				   "***** The result for vector mode %s would"
+				   " be the same\n",
+				   GET_MODE_NAME (vector_modes[mode_i]));
+		mode_i += 1;
+	      }
+	};
+      opt_loop_vec_info loop_vinfo
+	= vect_analyze_loop_1 (loop, shared, &loop_form_info,
+			       next_vector_mode,
+			       vect_epilogues
+			       ? (loop_vec_info)first_loop_vinfo : NULL,
+			       &n_stmts, fatal, cb);
+      if (fatal)
+	break;
 
-      if (res)
+      if (loop_vinfo)
 	{
-	  LOOP_VINFO_VECTORIZABLE_P (loop_vinfo) = 1;
-	  vectorized_loops++;
-
 	  /* Once we hit the desired simdlen for the first time,
 	     discard any previous attempts.  */
 	  if (simdlen
@@ -3084,33 +3077,44 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
 	      if (vinfos.is_empty ()
 		  && vect_joust_loop_vinfos (loop_vinfo, first_loop_vinfo))
 		{
-		  loop_vec_info main_loop_vinfo
-		    = vect_reanalyze_as_main_loop (loop_vinfo, &n_stmts);
-		  if (main_loop_vinfo == loop_vinfo)
+		  if (!LOOP_VINFO_EPILOGUE_P (loop_vinfo))
 		    {
 		      delete first_loop_vinfo;
 		      first_loop_vinfo = opt_loop_vec_info::success (NULL);
 		    }
-		  else if (main_loop_vinfo
-			   && vect_joust_loop_vinfos (main_loop_vinfo,
-						      first_loop_vinfo))
-		    {
-		      delete first_loop_vinfo;
-		      first_loop_vinfo = opt_loop_vec_info::success (NULL);
-		      delete loop_vinfo;
-		      loop_vinfo
-			= opt_loop_vec_info::success (main_loop_vinfo);
-		    }
 		  else
 		    {
 		      if (dump_enabled_p ())
 			dump_printf_loc (MSG_NOTE, vect_location,
-					 "***** No longer preferring vector"
-					 " mode %s after reanalyzing the loop"
-					 " as a main loop\n",
+					 "***** Reanalyzing as a main loop "
+					 "with vector mode %s\n",
 					 GET_MODE_NAME
-					   (main_loop_vinfo->vector_mode));
-		      delete main_loop_vinfo;
+					   (loop_vinfo->vector_mode));
+		      opt_loop_vec_info main_loop_vinfo
+			= vect_analyze_loop_1 (loop, shared, &loop_form_info,
+					       loop_vinfo->vector_mode,
+					       NULL, &n_stmts, fatal);
+		      if (main_loop_vinfo
+			  && vect_joust_loop_vinfos (main_loop_vinfo,
+						     first_loop_vinfo))
+			{
+			  delete first_loop_vinfo;
+			  first_loop_vinfo = opt_loop_vec_info::success (NULL);
+			  delete loop_vinfo;
+			  loop_vinfo
+			    = opt_loop_vec_info::success (main_loop_vinfo);
+			}
+		      else
+			{
+			  if (dump_enabled_p ())
+			    dump_printf_loc (MSG_NOTE, vect_location,
+					     "***** No longer preferring vector"
+					     " mode %s after reanalyzing the "
+					     " loop as a main loop\n",
+					     GET_MODE_NAME
+					       (loop_vinfo->vector_mode));
+			  delete main_loop_vinfo;
+			}
 		    }
 		}
 	    }
@@ -3159,16 +3163,6 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
 	  if (!simdlen && !vect_epilogues && !pick_lowest_cost_p)
 	    break;
 	}
-      else
-	{
-	  delete loop_vinfo;
-	  loop_vinfo = opt_loop_vec_info::success (NULL);
-	  if (fatal)
-	    {
-	      gcc_checking_assert (first_loop_vinfo == NULL);
-	      break;
-	    }
-	}
 
       /* Handle the case that the original loop can use partial
 	 vectorization, but want to only adopt it for the epilogue.
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 73347ce1f4e..20eda72f829 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -2061,8 +2061,17 @@ extern bool reduction_fn_for_scalar_code (enum tree_code, internal_fn *);
 
 /* Drive for loop transformation stage.  */
 extern class loop *vect_transform_loop (loop_vec_info, gimple *);
-extern opt_loop_vec_info vect_analyze_loop_form (class loop *,
-						 vec_info_shared *);
+struct vect_loop_form_info
+{
+  tree number_of_iterations;
+  tree number_of_iterationsm1;
+  tree assumptions;
+  gcond *loop_cond;
+  gcond *inner_loop_cond;
+};
+extern opt_result vect_analyze_loop_form (class loop *, vect_loop_form_info *);
+extern loop_vec_info vect_create_loop_vinfo (class loop *, vec_info_shared *,
+					     const vect_loop_form_info *);
 extern bool vectorizable_live_operation (vec_info *,
 					 stmt_vec_info, gimple_stmt_iterator *,
 					 slp_tree, slp_instance, int,
  
Richard Sandiford Nov. 4, 2021, 5:59 p.m. UTC | #3
Richard Biener <rguenther@suse.de> writes:
>> > [...]
>> > @@ -2898,43 +2899,63 @@ vect_joust_loop_vinfos (loop_vec_info new_loop_vinfo,
>> >    return true;
>> >  }
>> >  
>> > -/* If LOOP_VINFO is already a main loop, return it unmodified.  Otherwise
>> > -   try to reanalyze it as a main loop.  Return the loop_vinfo on success
>> > -   and null on failure.  */
>> > +/* Analyze LOOP with VECTOR_MODE and as epilogue if MAIN_LOOP_VINFO is
>> > +   not NULL.  Process the analyzed loop with PROCESS even if analysis
>> > +   failed.  Sets *N_STMTS and FATAL according to the analysis.
>> > +   Return the loop_vinfo on success and wrapped null on failure.  */
>> >  
>> > -static loop_vec_info
>> > -vect_reanalyze_as_main_loop (loop_vec_info loop_vinfo, unsigned int *n_stmts)
>> > +static opt_loop_vec_info
>> > +vect_analyze_loop_1 (class loop *loop, vec_info_shared *shared,
>> > +		     machine_mode vector_mode, loop_vec_info main_loop_vinfo,
>> > +		     unsigned int *n_stmts, bool &fatal,
>> > +		     std::function<void(loop_vec_info)> process = nullptr)
>> >  {
>> > -  if (!LOOP_VINFO_EPILOGUE_P (loop_vinfo))
>> > -    return loop_vinfo;
>> > +  /* Check the CFG characteristics of the loop (nesting, entry/exit).  */
>> > +  opt_loop_vec_info loop_vinfo = vect_analyze_loop_form (loop, shared);
>> > +  if (!loop_vinfo)
>> > +    {
>> > +      if (dump_enabled_p ())
>> > +	dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>> > +			 "bad loop form.\n");
>> > +      gcc_checking_assert (main_loop_vinfo == NULL);
>> > +      return loop_vinfo;
>> > +    }
>> > +  loop_vinfo->vector_mode = vector_mode;
>> >  
>> > -  if (dump_enabled_p ())
>> > -    dump_printf_loc (MSG_NOTE, vect_location,
>> > -		     "***** Reanalyzing as a main loop with vector mode %s\n",
>> > -		     GET_MODE_NAME (loop_vinfo->vector_mode));
>> > +  if (main_loop_vinfo)
>> > +    LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo) = main_loop_vinfo;
>> >  
>> > -  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>> > -  vec_info_shared *shared = loop_vinfo->shared;
>> > -  opt_loop_vec_info main_loop_vinfo = vect_analyze_loop_form (loop, shared);
>> > -  gcc_assert (main_loop_vinfo);
>> > +  /* Run the main analysis.  */
>> > +  fatal = false;
>> 
>> Think this should be at the top, since we have an early return above.
>> The early return should be fatal.
>
> Indeed.  I've split up vect_analyze_loop_form and moved the call out
> instead; the part that can fail should only be required once for each loop.

Ah, yeah, agree that's nicer.
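
A minimal, self-contained sketch of the split discussed above: the
mode-independent form check runs once per loop, and its result is
reused for every per-mode attempt.  `Loop`, `FormInfo`, `analyze_form`
and `analyze_with_mode` are hypothetical stand-ins invented for this
illustration; they are not GCC's `class loop`, `vect_loop_form_info`
or `vect_analyze_loop_form`, and the divisibility rule is a toy.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the loop being analyzed.
struct Loop
{
  int num_exits;   // the toy form check accepts only single-exit loops
  long niters;     // iteration count recorded by the form check
};

// Hypothetical stand-in for the mode-independent form information.
struct FormInfo
{
  long niters;
};

// Counts form checks, to show the check happens once per loop.
static int form_checks = 0;

// Mode-independent analysis.  Failure here rules out every mode.
bool
analyze_form (const Loop &loop, FormInfo *info)
{
  ++form_checks;
  if (loop.num_exits != 1)
    return false;
  info->niters = loop.niters;
  return true;
}

// Mode-dependent analysis, reusing the precomputed form information.
// Toy rule: a lane count is usable if it divides the iteration count.
bool
analyze_with_mode (const FormInfo &form, int lanes)
{
  assert (lanes > 0);
  return form.niters % lanes == 0;
}

// Try each candidate lane count and return the ones that succeed.
std::vector<int>
analyze_loop (const Loop &loop, const std::vector<int> &candidate_lanes)
{
  std::vector<int> ok;
  FormInfo form;
  if (!analyze_form (loop, &form))
    return ok;     // bad loop form: no point in trying any mode
  for (int lanes : candidate_lanes)
    if (analyze_with_mode (form, lanes))
      ok.push_back (lanes);
  return ok;
}
```

Calling `analyze_loop (Loop {1, 12}, {8, 4, 2})` in this sketch runs the
form check once and then three per-mode checks.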

> [...]
>
>> > +		  if (!vect_epilogues)
>> 
>> !vect_epilogues is correct for current uses, but I think the original
>> !LOOP_VINFO_EPILOGUE_P (loop_vinfo) was more general.  As mentioned above,
>> in principle there's no reason why we couldn't reanalyse a loop as a
>> main loop if we fail to analyse it as an epilogue.
>
> OK, restored that.
>
> The following is mainly the original reorg with the additional
> refactoring of vect_analyze_loop_form.  I think I'll put this in
> before rewriting the main iteration to first only analyze main
> loops (and then possibly unrolled main loops) and only after
> settling for the cheapest main loop consider epilogue
> vectorization.
>
> As you said, the original approach of saving extra analyses by
> using epilogue analysis as main loop analysis became moot, and
> with partial vectors the re-analysis as epilogue wasn't
> re-usable anyway.  What we could eventually remember is the
> modes that fail vectorization; those will likely not succeed
> when analyzed in epilogue context either.
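
The idea of remembering failing modes could look roughly like the
sketch below.  It is purely illustrative and not part of the patch:
`Mode`, `FailedModes`, `analyze_main_loops` and `epilogue_candidates`
are hypothetical names, and skipping a mode for the epilogue because it
failed as a main loop is only the heuristic mentioned above ("likely"),
not a guarantee.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <vector>

// Hypothetical: a vector mode identified by a plain integer.
using Mode = int;

// Remembers the modes whose main-loop analysis failed.
struct FailedModes
{
  std::set<Mode> modes;

  void record (Mode m) { modes.insert (m); }
  bool known_bad (Mode m) const { return modes.count (m) != 0; }
};

// Analyze every candidate as a main loop and record the failures.
// ANALYZE is a caller-supplied predicate standing in for the real
// analysis.
std::vector<Mode>
analyze_main_loops (const std::vector<Mode> &candidates,
		    const std::function<bool (Mode)> &analyze,
		    FailedModes &failed)
{
  assert (analyze != nullptr);
  std::vector<Mode> ok;
  for (Mode m : candidates)
    {
      if (analyze (m))
	ok.push_back (m);
      else
	failed.record (m);
    }
  return ok;
}

// Filter the epilogue candidates, dropping modes already known to fail.
std::vector<Mode>
epilogue_candidates (const std::vector<Mode> &candidates,
		     const FailedModes &failed)
{
  std::vector<Mode> out;
  for (Mode m : candidates)
    if (!failed.known_bad (m))
      out.push_back (m);
  return out;
}
```
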
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> Since this is a refactoring that should not change behavior,
> whereas re-organizing the analysis loop might, I'd like to put
> this onto trunk as an intermediate step.  Is that OK?

Yeah, looks good to me FWIW.  Just a couple of small comments:

> @@ -3023,43 +3023,36 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
>  	 LOOP_VINFO fails when treated as an epilogue loop, succeeds when
>  	 treated as a standalone loop, and ends up being genuinely cheaper
>  	 than FIRST_LOOP_VINFO.  */
> -      if (vect_epilogues)
> -	LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo) = first_loop_vinfo;
>  
> -      res = vect_analyze_loop_2 (loop_vinfo, fatal, &n_stmts);
> -      if (mode_i == 0)
> -	autodetected_vector_mode = loop_vinfo->vector_mode;
> -      if (dump_enabled_p ())
> +      bool fatal;
> +      auto cb = [&] (loop_vec_info loop_vinfo)
>  	{
> -	  if (res)
> -	    dump_printf_loc (MSG_NOTE, vect_location,
> -			     "***** Analysis succeeded with vector mode %s\n",
> -			     GET_MODE_NAME (loop_vinfo->vector_mode));
> -	  else
> -	    dump_printf_loc (MSG_NOTE, vect_location,
> -			     "***** Analysis failed with vector mode %s\n",
> -			     GET_MODE_NAME (loop_vinfo->vector_mode));
> -	}
> -
> -      loop->aux = NULL;
> -
> -      if (!fatal)
> -	while (mode_i < vector_modes.length ()
> -	       && vect_chooses_same_modes_p (loop_vinfo, vector_modes[mode_i]))
> -	  {
> -	    if (dump_enabled_p ())
> -	      dump_printf_loc (MSG_NOTE, vect_location,
> -			       "***** The result for vector mode %s would"
> -			       " be the same\n",
> -			       GET_MODE_NAME (vector_modes[mode_i]));
> -	    mode_i += 1;
> -	  }
> +	  if (mode_i ==0)

s/==0/== 0/

> +	    autodetected_vector_mode = loop_vinfo->vector_mode;
> +	  if (!fatal)
> +	    while (mode_i < vector_modes.length ()
> +		   && vect_chooses_same_modes_p (loop_vinfo,
> +						 vector_modes[mode_i]))
> +	      {
> +		if (dump_enabled_p ())
> +		  dump_printf_loc (MSG_NOTE, vect_location,
> +				   "***** The result for vector mode %s would"
> +				   " be the same\n",
> +				   GET_MODE_NAME (vector_modes[mode_i]));
> +		mode_i += 1;
> +	      }

I guess the autodetected_vector_mode part is redundant when fatal,
so perhaps we could avoid calling the callback for fatal failures
and remove the !fatal above.  Just a suggestion though, either way's
fine with me.
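
For illustration, the variant suggested above might be sketched as
follows.  This is a toy under stated assumptions, not the patch itself:
`Info`, `Outcome` and `analyze_1` are hypothetical stand-ins (the real
function is vect_analyze_loop_1 and takes different arguments), and the
outcome is passed in by the caller so the example stays deterministic.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for loop_vec_info.
struct Info
{
  int vector_mode;
};

// Hypothetical: the three ways an analysis attempt can end.
enum class Outcome { success, failure, fatal_failure };

// Toy driver in the shape of the helper under review.  Sets FATAL and
// returns true on success.  PROCESS is optional, as in the patch.
bool
analyze_1 (int vector_mode, Outcome outcome, bool &fatal,
	   const std::function<void (const Info &)> &process = nullptr)
{
  Info info { vector_mode };
  fatal = (outcome == Outcome::fatal_failure);

  // Suggested variant: do not invoke the callback for fatal failures,
  // so the callback body needs no !fatal check of its own.
  if (process && !fatal)
    process (info);

  return outcome == Outcome::success;
}
```

With this shape the callback still sees non-fatal failures, which is
where the "would be the same" mode skipping matters, but a fatal
failure goes straight to the caller's `if (fatal) break;`.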

Thanks,
Richard

  

Patch

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 961c1623f81..9a62475a69f 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -20,6 +20,7 @@  along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #define INCLUDE_ALGORITHM
+#define INCLUDE_FUNCTIONAL
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -2898,43 +2899,63 @@  vect_joust_loop_vinfos (loop_vec_info new_loop_vinfo,
   return true;
 }
 
-/* If LOOP_VINFO is already a main loop, return it unmodified.  Otherwise
-   try to reanalyze it as a main loop.  Return the loop_vinfo on success
-   and null on failure.  */
+/* Analyze LOOP with VECTOR_MODE and as epilogue if MAIN_LOOP_VINFO is
+   not NULL.  Process the analyzed loop with PROCESS even if analysis
+   failed.  Sets *N_STMTS and FATAL according to the analysis.
+   Return the loop_vinfo on success and wrapped null on failure.  */
 
-static loop_vec_info
-vect_reanalyze_as_main_loop (loop_vec_info loop_vinfo, unsigned int *n_stmts)
+static opt_loop_vec_info
+vect_analyze_loop_1 (class loop *loop, vec_info_shared *shared,
+		     machine_mode vector_mode, loop_vec_info main_loop_vinfo,
+		     unsigned int *n_stmts, bool &fatal,
+		     std::function<void(loop_vec_info)> process = nullptr)
 {
-  if (!LOOP_VINFO_EPILOGUE_P (loop_vinfo))
-    return loop_vinfo;
+  /* Check the CFG characteristics of the loop (nesting, entry/exit).  */
+  opt_loop_vec_info loop_vinfo = vect_analyze_loop_form (loop, shared);
+  if (!loop_vinfo)
+    {
+      if (dump_enabled_p ())
+	dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+			 "bad loop form.\n");
+      gcc_checking_assert (main_loop_vinfo == NULL);
+      return loop_vinfo;
+    }
+  loop_vinfo->vector_mode = vector_mode;
 
-  if (dump_enabled_p ())
-    dump_printf_loc (MSG_NOTE, vect_location,
-		     "***** Reanalyzing as a main loop with vector mode %s\n",
-		     GET_MODE_NAME (loop_vinfo->vector_mode));
+  if (main_loop_vinfo)
+    LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo) = main_loop_vinfo;
 
-  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-  vec_info_shared *shared = loop_vinfo->shared;
-  opt_loop_vec_info main_loop_vinfo = vect_analyze_loop_form (loop, shared);
-  gcc_assert (main_loop_vinfo);
+  /* Run the main analysis.  */
+  fatal = false;
+  opt_result res = vect_analyze_loop_2 (loop_vinfo, fatal, n_stmts);
+  loop->aux = NULL;
 
-  main_loop_vinfo->vector_mode = loop_vinfo->vector_mode;
+  /* Process info before we destroy loop_vinfo upon analysis failure.  */
+  if (process)
+    process (loop_vinfo);
 
-  bool fatal = false;
-  bool res = vect_analyze_loop_2 (main_loop_vinfo, fatal, n_stmts);
-  loop->aux = NULL;
-  if (!res)
+  if (dump_enabled_p ())
     {
-      if (dump_enabled_p ())
+      if (res)
 	dump_printf_loc (MSG_NOTE, vect_location,
-			 "***** Failed to analyze main loop with vector"
-			 " mode %s\n",
+			 "***** Analysis succeeded with vector mode %s\n",
 			 GET_MODE_NAME (loop_vinfo->vector_mode));
-      delete main_loop_vinfo;
-      return NULL;
+      else
+	dump_printf_loc (MSG_NOTE, vect_location,
+			 "***** Analysis failed with vector mode %s\n",
+			 GET_MODE_NAME (loop_vinfo->vector_mode));
+    }
+
+  if (!res)
+    {
+      delete loop_vinfo;
+      if (fatal)
+	gcc_checking_assert (main_loop_vinfo == NULL);
+      return opt_loop_vec_info::propagate_failure (res);
     }
-  LOOP_VINFO_VECTORIZABLE_P (main_loop_vinfo) = 1;
-  return main_loop_vinfo;
+
+  LOOP_VINFO_VECTORIZABLE_P (loop_vinfo) = 1;
+  return loop_vinfo;
 }
 
 /* Function vect_analyze_loop.
@@ -2981,20 +3002,6 @@  vect_analyze_loop (class loop *loop, vec_info_shared *shared)
   unsigned HOST_WIDE_INT simdlen = loop->simdlen;
   while (1)
     {
-      /* Check the CFG characteristics of the loop (nesting, entry/exit).  */
-      opt_loop_vec_info loop_vinfo = vect_analyze_loop_form (loop, shared);
-      if (!loop_vinfo)
-	{
-	  if (dump_enabled_p ())
-	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-			     "bad loop form.\n");
-	  gcc_checking_assert (first_loop_vinfo == NULL);
-	  return loop_vinfo;
-	}
-      loop_vinfo->vector_mode = next_vector_mode;
-
-      bool fatal = false;
-
       /* When pick_lowest_cost_p is true, we should in principle iterate
 	 over all the loop_vec_infos that LOOP_VINFO could replace and
 	 try to vectorize LOOP_VINFO under the same conditions.
@@ -3023,41 +3030,35 @@  vect_analyze_loop (class loop *loop, vec_info_shared *shared)
 	 LOOP_VINFO fails when treated as an epilogue loop, succeeds when
 	 treated as a standalone loop, and ends up being genuinely cheaper
 	 than FIRST_LOOP_VINFO.  */
-      if (vect_epilogues)
-	LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo) = first_loop_vinfo;
 
-      res = vect_analyze_loop_2 (loop_vinfo, fatal, &n_stmts);
-      if (mode_i == 0)
-	autodetected_vector_mode = loop_vinfo->vector_mode;
-      if (dump_enabled_p ())
+      bool fatal;
+      auto cb = [&] (loop_vec_info loop_vinfo)
 	{
-	  if (res)
-	    dump_printf_loc (MSG_NOTE, vect_location,
-			     "***** Analysis succeeded with vector mode %s\n",
-			     GET_MODE_NAME (loop_vinfo->vector_mode));
-	  else
-	    dump_printf_loc (MSG_NOTE, vect_location,
-			     "***** Analysis failed with vector mode %s\n",
-			     GET_MODE_NAME (loop_vinfo->vector_mode));
-	}
-
-      loop->aux = NULL;
-
-      if (!fatal)
-	while (mode_i < vector_modes.length ()
-	       && vect_chooses_same_modes_p (loop_vinfo, vector_modes[mode_i]))
-	  {
-	    if (dump_enabled_p ())
-	      dump_printf_loc (MSG_NOTE, vect_location,
-			       "***** The result for vector mode %s would"
-			       " be the same\n",
-			       GET_MODE_NAME (vector_modes[mode_i]));
-	    mode_i += 1;
-	  }
+	  if (mode_i ==0)
+	    autodetected_vector_mode = loop_vinfo->vector_mode;
+	  if (!fatal)
+	    while (mode_i < vector_modes.length ()
+		   && vect_chooses_same_modes_p (loop_vinfo,
+						 vector_modes[mode_i]))
+	      {
+		if (dump_enabled_p ())
+		  dump_printf_loc (MSG_NOTE, vect_location,
+				   "***** The result for vector mode %s would"
+				   " be the same\n",
+				   GET_MODE_NAME (vector_modes[mode_i]));
+		mode_i += 1;
+	      }
+	};
+      opt_loop_vec_info loop_vinfo
+	= vect_analyze_loop_1 (loop, shared, next_vector_mode,
+			       vect_epilogues
+			       ? (loop_vec_info)first_loop_vinfo : NULL,
+			       &n_stmts, fatal, cb);
+      if (fatal)
+	break;
 
-      if (res)
+      if (loop_vinfo)
 	{
-	  LOOP_VINFO_VECTORIZABLE_P (loop_vinfo) = 1;
 	  vectorized_loops++;
 
 	  /* Once we hit the desired simdlen for the first time,
@@ -3084,33 +3085,44 @@  vect_analyze_loop (class loop *loop, vec_info_shared *shared)
 	      if (vinfos.is_empty ()
 		  && vect_joust_loop_vinfos (loop_vinfo, first_loop_vinfo))
 		{
-		  loop_vec_info main_loop_vinfo
-		    = vect_reanalyze_as_main_loop (loop_vinfo, &n_stmts);
-		  if (main_loop_vinfo == loop_vinfo)
-		    {
-		      delete first_loop_vinfo;
-		      first_loop_vinfo = opt_loop_vec_info::success (NULL);
-		    }
-		  else if (main_loop_vinfo
-			   && vect_joust_loop_vinfos (main_loop_vinfo,
-						      first_loop_vinfo))
+		  if (!vect_epilogues)
 		    {
 		      delete first_loop_vinfo;
 		      first_loop_vinfo = opt_loop_vec_info::success (NULL);
-		      delete loop_vinfo;
-		      loop_vinfo
-			= opt_loop_vec_info::success (main_loop_vinfo);
 		    }
 		  else
 		    {
 		      if (dump_enabled_p ())
 			dump_printf_loc (MSG_NOTE, vect_location,
-					 "***** No longer preferring vector"
-					 " mode %s after reanalyzing the loop"
-					 " as a main loop\n",
+					 "***** Reanalyzing as a main loop "
+					 "with vector mode %s\n",
 					 GET_MODE_NAME
-					   (main_loop_vinfo->vector_mode));
-		      delete main_loop_vinfo;
+					   (loop_vinfo->vector_mode));
+		      opt_loop_vec_info main_loop_vinfo
+			= vect_analyze_loop_1 (loop, shared,
+					       loop_vinfo->vector_mode,
+					       NULL, &n_stmts, fatal);
+		      if (main_loop_vinfo
+			  && vect_joust_loop_vinfos (main_loop_vinfo,
+						     first_loop_vinfo))
+			{
+			  delete first_loop_vinfo;
+			  first_loop_vinfo = opt_loop_vec_info::success (NULL);
+			  delete loop_vinfo;
+			  loop_vinfo
+			    = opt_loop_vec_info::success (main_loop_vinfo);
+			}
+		      else
+			{
+			  if (dump_enabled_p ())
+			    dump_printf_loc (MSG_NOTE, vect_location,
+					     "***** No longer preferring vector"
+					     " mode %s after reanalyzing the "
+					     " loop as a main loop\n",
+					     GET_MODE_NAME
+					       (loop_vinfo->vector_mode));
+			  delete main_loop_vinfo;
+			}
 		    }
 		}
 	    }
@@ -3159,16 +3171,6 @@  vect_analyze_loop (class loop *loop, vec_info_shared *shared)
 	  if (!simdlen && !vect_epilogues && !pick_lowest_cost_p)
 	    break;
 	}
-      else
-	{
-	  delete loop_vinfo;
-	  loop_vinfo = opt_loop_vec_info::success (NULL);
-	  if (fatal)
-	    {
-	      gcc_checking_assert (first_loop_vinfo == NULL);
-	      break;
-	    }
-	}
 
       /* Handle the case that the original loop can use partial
 	 vectorization, but want to only adopt it for the epilogue.