[vect] : replace vect_scalar_cost_multiplier with param_vect_allow_possibly_not_worthwhile_vectorizations

Message ID patch-20445-tamar@arm.com
State Committed
Commit 4236a2df4fae0dd9b3396b9d425fdf53b81f5cae

Commit Message

Tamar Christina April 8, 2026, 7:11 a.m. UTC
The parameter vect_scalar_cost_multiplier was added to make it possible to
apply a scaling factor to the cost of a scalar loop, making vectorization
more or less profitable.

However, because of the way costing currently works, increasing the cost of a
scalar loop does not always result in vectorization: in some cases, when the
cost of vectorization is very low, we switch to looking only at the cost of
the prologue and epilogue.

One case where this fell apart was uncounted loops.  As a result, this patch
replaces the parameter with
param_vect_allow_possibly_not_worthwhile_vectorizations, which just ignores the
result of costing.
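
As a rough illustration of the difference described above, here is a toy
model in C.  It is only a sketch and not the vectorizer's code: toy_costing,
old_would_vectorize, new_would_vectorize and the other_check_fails flag are
hypothetical names, and other_check_fails merely stands in for any
profitability check that does not look at the scaled scalar cost.

```c
#include <stdbool.h>

/* Toy stand-in for the loop costing result: negative means "may not be
   worthwhile", like the `res < 0` check in the patch.  OTHER_CHECK_FAILS is
   a hypothetical flag modelling a profitability check that ignores the
   scaled scalar cost.  */
static int
toy_costing (int scalar_cost, int vector_cost, int multiplier_pct,
             bool other_check_fails)
{
  if (other_check_fails)
    return -1;
  int scaled_scalar_cost = scalar_cost * multiplier_pct / 100;
  return vector_cost <= scaled_scalar_cost ? 1 : -1;
}

/* Old scheme (assumed shape): bias the comparison by scaling the scalar
   cost and accept whatever the costing then reports.  */
static bool
old_would_vectorize (int scalar_cost, int vector_cost, int multiplier_pct,
                     bool other_check_fails)
{
  return toy_costing (scalar_cost, vector_cost, multiplier_pct,
                      other_check_fails) >= 0;
}

/* New scheme: cost normally, then let the flag override a negative
   result, mirroring `res < 0 && !param_...` in the patch.  */
static bool
new_would_vectorize (int scalar_cost, int vector_cost,
                     bool other_check_fails, bool allow_not_worthwhile)
{
  int res = toy_costing (scalar_cost, vector_cost, 100, other_check_fails);
  return !(res < 0 && !allow_not_worthwhile);
}
```

In this model, a loop whose hypothetical other check fails is rejected by
old_would_vectorize even with a multiplier of 10000, whereas
new_would_vectorize accepts it once allow_not_worthwhile is set.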

Bootstrapped and regtested on aarch64-none-linux-gnu,
arm-none-linux-gnueabihf, and x86_64-pc-linux-gnu
(-m32 and -m64) with no issues.

Is this one better, Richi?  Robin, are the RVV parts OK?

Thanks,
Tamar

gcc/ChangeLog:

	* doc/params.texi: Replace param_vect_scalar_cost_multiplier with
	param_vect_allow_possibly_not_worthwhile_vectorizations.
	* params.opt: Likewise.
	* config/aarch64/aarch64.cc (aarch64_override_options_internal):
	Likewise.
	* config/riscv/riscv.cc (riscv_override_options_internal): Likewise.
	* tree-vect-loop.cc (vect_estimate_min_profitable_iters): Likewise.
	* tree-vect-slp.cc (vect_bb_vectorization_profitable_p): Likewise.
	(vect_slp_region): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve/cost_model_16.c: Updated.
	* gcc.target/aarch64/sve/cost_model_19.c: New test.


Comments

Robin Dapp April 8, 2026, 8:47 a.m. UTC | #1
> The parameter vect_scalar_cost_multiplier was added in order to make it possible
> to apply a scaling factor to the cost of a scalar loop in order to make
> vectorization more or less profitable.
>
> However because of the way the costing currently works, increasing the cost of a
> scalar loop does not always result in vectorization because in some cases when
> the cost of vectorization is very low we switch to looking at only the cost of
> the epilog and prologs.
>
> One case where this fell apart was uncounted loops.  As a result this patch
> replaces the parameter with
> param_vect_allow_possibly_not_worthwhile_vectorizations which just ignores the
> result of costing.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu,
> arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> -m32, -m64 and no issues.
>
> Robin are the RVV parts OK?

Of course, just the param name is very unwieldy :)  I don't have a better 
suggestion, though.  "allow speculative vectorization"?  "tentative"?  But it's 
not really tentative, either.  "allow aggressive vectorization"?
  
Richard Biener April 8, 2026, 11:15 a.m. UTC | #2
On Wed, 8 Apr 2026, Robin Dapp wrote:

> > The parameter vect_scalar_cost_multiplier was added in order to make it possible
> > to apply a scaling factor to the cost of a scalar loop in order to make
> > vectorization more or less profitable.
> >
> > However because of the way the costing currently works, increasing the cost of a
> > scalar loop does not always result in vectorization because in some cases when
> > the cost of vectorization is very low we switch to looking at only the cost of
> > the epilog and prologs.
> >
> > One case where this fell apart was uncounted loops.  As a result this patch
> > replaces the parameter with
> > param_vect_allow_possibly_not_worthwhile_vectorizations which just ignores the
> > result of costing.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> > -m32, -m64 and no issues.
> >
> > Robin are the RVV parts OK?
> 
> Of course, just the param name is very unwieldy :)  I don't have a better 
> suggestion, though.  "allow speculative vectorization"?  "tentative"?  But it's 
> not really tentative, either.  "allow aggressive vectorization"?

Nobody will use this param, so the unwieldy name shouldn't be a
concern; it's only for debugging.  All vectorizations disabled
this way will never be profitable.

OK.

Richard.
  

Patch

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 1902a4ee9fa34a1f00e654760d0759af1d6cf432..be7933edb90f9f442ef090658ec4ca0260e4d8ea 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -19828,11 +19828,11 @@  aarch64_override_options_internal (struct gcc_options *opts)
       & AARCH64_EXTRA_TUNE_DISPATCH_SCHED)
     gcc_assert (aarch64_tune_params.dispatch_constraints != NULL);
 
-  /* Set scalar costing to a high value such that we always pick
-     vectorization.  Increase scalar costing by 10000%.  */
+  /* Enable possible unprofitable vectorization.  */
   if (opts->x_flag_aarch64_max_vectorization)
     SET_OPTION_IF_UNSET (opts, &global_options_set,
-			 param_vect_scalar_cost_multiplier, 10000);
+			 param_vect_allow_possibly_not_worthwhile_vectorizations,
+			 1);
 
   /* Synchronize the -mautovec-preference and aarch64_autovec_preference using
      whichever one is not default.  If both are set then prefer the param flag
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a610654775745567f6d3abfe7abcf131877e92b5..61d9500558047c3db90a1817f224d0fbdf92b098 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -12346,11 +12346,11 @@  riscv_override_options_internal (struct gcc_options *opts)
   /* Convert -march and -mrvv-vector-bits to a chunks count.  */
   riscv_vector_chunks = riscv_convert_vector_chunks (opts);
 
-  /* Set scalar costing to a high value such that we always pick
-     vectorization.  Increase scalar costing by 100x.  */
+  /* Enable possible unprofitable vectorization.  */
   if (opts->x_riscv_max_vectorization)
     SET_OPTION_IF_UNSET (&global_options, &global_options_set,
-			 param_vect_scalar_cost_multiplier, 10000);
+			 param_vect_allow_possibly_not_worthwhile_vectorizations,
+			 1);
 
   if (opts->x_flag_cf_protection != CF_NONE)
     {
diff --git a/gcc/doc/params.texi b/gcc/doc/params.texi
index 94329cfe6170cbc08248bbb539e5709357dd0cc3..31b04688cf77c20d777e397509f687dd15329f8e 100644
--- a/gcc/doc/params.texi
+++ b/gcc/doc/params.texi
@@ -1658,11 +1658,10 @@  this parameter.  The default value of this parameter is 50.
 @item vect-induction-float
 Enable loop vectorization of floating-point inductions.
 
-@paindex vect-scalar-cost-multiplier
-@item vect-scalar-cost-multiplier
-Apply the given multiplier percentage to scalar loop costing during
-vectorization.
-Increasing the cost multiplier makes vector loops more profitable.
+@paindex vect_allow_possibly_not_worthwhile_vectorizations
+@item vect_allow_possibly_not_worthwhile_vectorizations
+Enable vectorization of loops that may not be profitable according to the cost
+model but still perform costing between vector modes.
 
 @paindex vrp-block-limit
 @item vrp-block-limit
diff --git a/gcc/params.opt b/gcc/params.opt
index 72ac44dd773f4f1afb57179405ceaeedec63968f..b35ca688cdf15de81915e47be0ec6c19df5a974e 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -1295,9 +1295,9 @@  The maximum factor which the loop vectorizer applies to the cost of statements i
 Common Joined UInteger Var(param_vect_induction_float) Init(1) IntegerRange(0, 1) Param Optimization
 Enable loop vectorization of floating point inductions.
 
--param=vect-scalar-cost-multiplier=
-Common Joined UInteger Var(param_vect_scalar_cost_multiplier) Init(100) IntegerRange(0, 10000) Param Optimization
-The scaling multiplier as a percentage to apply to all scalar loop costing when performing vectorization profitability analysis.  The default value is 100.
+-param=vect_allow_possibly_not_worthwhile_vectorizations=
+Common Joined UInteger Var(param_vect_allow_possibly_not_worthwhile_vectorizations) Init(0) IntegerRange(0, 1) Param Optimization
+Enable vectorization of loops that may not be profitable according to the cost model but still perform costing between vector modes.
 
 -param=vrp-block-limit=
 Common Joined UInteger Var(param_vrp_block_limit) Init(150000) Optimization Param
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cost_model_16.c b/gcc/testsuite/gcc.target/aarch64/sve/cost_model_16.c
index bfe49ef15f3a535b00f543fbc5055a59b9dcbf34..f39ee6915b8d8a4e435c390bae2e4040c563304e 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/cost_model_16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/cost_model_16.c
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-Ofast -march=armv8-a+sve --param vect-scalar-cost-multiplier=1000 -fdump-tree-vect-details" } */
+/* { dg-options "-Ofast -march=armv8-a+sve --param vect_allow_possibly_not_worthwhile_vectorizations=1 -fdump-tree-vect-details" } */
 
 void
 foo (char *restrict a, int *restrict b, int *restrict c,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cost_model_19.c b/gcc/testsuite/gcc.target/aarch64/sve/cost_model_19.c
new file mode 100644
index 0000000000000000000000000000000000000000..9e4421d630f2b04833100f83dec378428b3d170f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/cost_model_19.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-Ofast -march=armv8-a+sve -mmax-vectorization -ffreestanding -fdump-tree-vect-details" } */
+
+unsigned long f(const char *s)
+{
+    unsigned long i = 0;
+    while (*s++)
+        i++;
+    return i;
+}
+
+/* { dg-final { scan-tree-dump "vectorized 1 loops in function" "vect" } } */
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 7121edb8d81904d227ba510f20a776fa7c9c8224..767c971485f49284fd946a05170be3600200fa18 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -2550,7 +2550,7 @@  start_over:
 
   /* Check the costings of the loop make vectorizing worthwhile.  */
   res = vect_analyze_loop_costing (loop_vinfo, suggested_unroll_factor);
-  if (res < 0)
+  if (res < 0 && !param_vect_allow_possibly_not_worthwhile_vectorizations)
     {
       ok = opt_result::failure_at (vect_location,
 				   "Loop costings may not be worthwhile.\n");
@@ -4089,8 +4089,7 @@  vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
      TODO: Consider assigning different costs to different scalar
      statements.  */
 
-  scalar_single_iter_cost = (loop_vinfo->scalar_costs->total_cost ()
-			     * param_vect_scalar_cost_multiplier) / 100;
+  scalar_single_iter_cost = loop_vinfo->scalar_costs->total_cost ();
 
   /* Add additional cost for the peeled instructions in prologue and epilogue
      loop.  (For fully-masked loops there will be no peeling.)
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 8fa6a740c96c16c0c99041db1dadc54fd50f63b3..603a991f4f602d6cd76c9fa4c4490c1b4e958570 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -9792,8 +9792,7 @@  vect_bb_vectorization_profitable_p (bb_vec_info bb_vinfo,
       while (si < li_scalar_costs.length ()
 	     && li_scalar_costs[si].first == sl);
       scalar_target_cost_data->finish_cost (nullptr);
-      scalar_cost = (scalar_target_cost_data->body_cost ()
-		     * param_vect_scalar_cost_multiplier) / 100;
+      scalar_cost = scalar_target_cost_data->body_cost ();
 
       /* Complete the target-specific vector cost calculation.  */
       class vector_costs *vect_target_cost_data = init_cost (bb_vinfo, false);
@@ -10352,6 +10351,7 @@  vect_slp_region (vec<basic_block> bbs, vec<data_reference_p> datarefs,
 	      dump_user_location_t saved_vect_location = vect_location;
 	      vect_location = instance->location ();
 	      if (!unlimited_cost_model (NULL)
+		  && !param_vect_allow_possibly_not_worthwhile_vectorizations
 		  && !vect_bb_vectorization_profitable_p
 			(bb_vinfo, instance->subgraph_entries, orig_loop))
 		{