[vect] : replace vect_scalar_cost_multiplier with param_vect_allow_possibly_not_worthwhile_vectorizations
Commit Message
The parameter vect_scalar_cost_multiplier was added in order to make it possible
to apply a scaling factor to the cost of a scalar loop in order to make
vectorization more or less profitable.
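As a rough illustration of the scaling described above, the expression removed by this patch multiplies the scalar cost by a percentage and divides by 100. The helper below is a minimal sketch of that arithmetic only; the function name `scale_scalar_cost` and its parameter names are hypothetical and do not appear in GCC.

```c
#include <assert.h>

/* Hypothetical helper mirroring the shape of the removed expression
   (total_cost * param_vect_scalar_cost_multiplier) / 100.
   The multiplier is a percentage, so 100 leaves the cost unchanged.  */
static unsigned int
scale_scalar_cost (unsigned int total_cost, unsigned int multiplier_percent)
{
  /* The removed option was declared with IntegerRange(0, 10000).  */
  assert (multiplier_percent <= 10000);
  /* Integer division truncates, as it does in the original expression.  */
  return (total_cost * multiplier_percent) / 100;
}
```

With the old default of 100 the cost is unchanged, while the value 10000 that the target hooks used to set scales the scalar cost by a factor of 100.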
However, because of the way costing currently works, increasing the cost of a
scalar loop does not always result in vectorization: in some cases, when the
cost of vectorization is very low, we switch to looking only at the cost of
the epilog and prolog.
One case where this fell apart was uncounted loops. As a result, this patch
replaces the parameter with
param_vect_allow_possibly_not_worthwhile_vectorizations, which just ignores the
result of costing.
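The difference between the two approaches can be sketched with a deliberately simplified toy model. Everything here is an assumption made for illustration: `toy_loop`, `toy_analyze_costing` and `toy_reject` are hypothetical names, and the flag `verdict_ignores_scalar_cost` merely stands in for any costing path whose verdict does not depend on the scaled scalar cost. Only the shape of the final check, `res < 0 && !param`, is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy description of a loop as seen by a cost model (hypothetical).  */
struct toy_loop
{
  unsigned int scalar_cost;          /* Per-iteration scalar cost.  */
  unsigned int vector_cost;          /* Per-iteration vector cost.  */
  bool verdict_ignores_scalar_cost;  /* Models a path that never looks
                                        at the scalar cost.  */
};

/* Returns a negative value for "may not be worthwhile", positive
   otherwise.  The multiplier only matters on the ordinary path.  */
static int
toy_analyze_costing (const struct toy_loop *loop,
                     unsigned int multiplier_percent)
{
  assert (multiplier_percent <= 10000);
  if (loop->verdict_ignores_scalar_cost)
    return -1;
  unsigned int scaled = loop->scalar_cost * multiplier_percent / 100;
  return scaled > loop->vector_cost ? 1 : -1;
}

/* Same shape as the new check in the patch: a negative result only
   rejects the loop when the override is not set.  */
static bool
toy_reject (int res, bool allow_possibly_not_worthwhile)
{
  return res < 0 && !allow_possibly_not_worthwhile;
}
```

In this model a large multiplier flips the verdict for an ordinary loop, but it has no effect on a loop whose verdict never consults the scalar cost. The boolean override accepts both.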
Bootstrapped and regtested on aarch64-none-linux-gnu,
arm-none-linux-gnueabihf, and x86_64-pc-linux-gnu
(-m32, -m64) with no issues.
Is this one better, Richi? Robin, are the RVV parts OK?
Thanks,
Tamar
gcc/ChangeLog:
* doc/params.texi: Replace param_vect_scalar_cost_multiplier with
param_vect_allow_possibly_not_worthwhile_vectorizations.
* params.opt: Likewise.
* config/aarch64/aarch64.cc (aarch64_override_options_internal):
Likewise.
* config/riscv/riscv.cc (riscv_override_options_internal): Likewise.
* tree-vect-loop.cc (vect_estimate_min_profitable_iters): Likewise.
* tree-vect-slp.cc (vect_bb_vectorization_profitable_p): Likewise.
(vect_slp_region): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/cost_model_16.c: Updated.
* gcc.target/aarch64/sve/cost_model_19.c: New test.
---
Comments
> Robin are the RVV parts OK?
Of course, just the param name is very unwieldy :) I don't have a better
suggestion, though. "allow speculative vectorization"? "tentative"? But it's
not really tentative, either. "allow aggressive vectorization"?
On Wed, 8 Apr 2026, Robin Dapp wrote:
> Of course, just the param name is very unwieldy :) I don't have a better
> suggestion, though. "allow speculative vectorization"? "tentative"? But it's
> not really tentative, either. "allow aggressive vectorization"?
Nobody will use this param, so the unwieldy name shouldn't be of
concern - it's only for debugging. All disabled vectorizations
this way will never be profitable.
OK.
Richard.
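For readers less familiar with the term used in the commit message, the sketch below contrasts a counted loop with an uncounted one. The second function has the same shape as the loop in the new cost_model_19.c test; the function names `sum_bytes` and `count_until_nul` are made up for this illustration.

```c
#include <stddef.h>

/* Counted: the number of iterations (n) is known when the loop is
   entered.  */
static unsigned long
sum_bytes (const unsigned char *p, size_t n)
{
  unsigned long total = 0;
  for (size_t i = 0; i < n; i++)
    total += p[i];
  return total;
}

/* Uncounted: the exit depends on data loaded inside the loop, so the
   iteration count cannot be computed before the loop starts.  */
static unsigned long
count_until_nul (const char *s)
{
  unsigned long i = 0;
  while (*s++)
    i++;
  return i;
}
```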
@@ -19828,11 +19828,11 @@ aarch64_override_options_internal (struct gcc_options *opts)
& AARCH64_EXTRA_TUNE_DISPATCH_SCHED)
gcc_assert (aarch64_tune_params.dispatch_constraints != NULL);
- /* Set scalar costing to a high value such that we always pick
- vectorization. Increase scalar costing by 10000%. */
+ /* Enable possible unprofitable vectorization. */
if (opts->x_flag_aarch64_max_vectorization)
SET_OPTION_IF_UNSET (opts, &global_options_set,
- param_vect_scalar_cost_multiplier, 10000);
+ param_vect_allow_possibly_not_worthwhile_vectorizations,
+ 1);
/* Synchronize the -mautovec-preference and aarch64_autovec_preference using
whichever one is not default. If both are set then prefer the param flag
@@ -12346,11 +12346,11 @@ riscv_override_options_internal (struct gcc_options *opts)
/* Convert -march and -mrvv-vector-bits to a chunks count. */
riscv_vector_chunks = riscv_convert_vector_chunks (opts);
- /* Set scalar costing to a high value such that we always pick
- vectorization. Increase scalar costing by 100x. */
+ /* Enable possible unprofitable vectorization. */
if (opts->x_riscv_max_vectorization)
SET_OPTION_IF_UNSET (&global_options, &global_options_set,
- param_vect_scalar_cost_multiplier, 10000);
+ param_vect_allow_possibly_not_worthwhile_vectorizations,
+ 1);
if (opts->x_flag_cf_protection != CF_NONE)
{
@@ -1658,11 +1658,10 @@ this parameter. The default value of this parameter is 50.
@item vect-induction-float
Enable loop vectorization of floating-point inductions.
-@paindex vect-scalar-cost-multiplier
-@item vect-scalar-cost-multiplier
-Apply the given multiplier percentage to scalar loop costing during
-vectorization.
-Increasing the cost multiplier makes vector loops more profitable.
+@paindex vect_allow_possibly_not_worthwhile_vectorizations
+@item vect_allow_possibly_not_worthwhile_vectorizations
+Enable vectorization of loops that may not be profitable according to the cost
+model but still perform costing between vector modes.
@paindex vrp-block-limit
@item vrp-block-limit
@@ -1295,9 +1295,9 @@ The maximum factor which the loop vectorizer applies to the cost of statements i
Common Joined UInteger Var(param_vect_induction_float) Init(1) IntegerRange(0, 1) Param Optimization
Enable loop vectorization of floating point inductions.
--param=vect-scalar-cost-multiplier=
-Common Joined UInteger Var(param_vect_scalar_cost_multiplier) Init(100) IntegerRange(0, 10000) Param Optimization
-The scaling multiplier as a percentage to apply to all scalar loop costing when performing vectorization profitability analysis. The default value is 100.
+-param=vect_allow_possibly_not_worthwhile_vectorizations=
+Common Joined UInteger Var(param_vect_allow_possibly_not_worthwhile_vectorizations) Init(0) IntegerRange(0, 1) Param Optimization
+Enable vectorization of loops that may not be profitable according to the cost model but still perform costing between vector modes.
-param=vrp-block-limit=
Common Joined UInteger Var(param_vrp_block_limit) Init(150000) Optimization Param
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-Ofast -march=armv8-a+sve --param vect-scalar-cost-multiplier=1000 -fdump-tree-vect-details" } */
+/* { dg-options "-Ofast -march=armv8-a+sve --param vect_allow_possibly_not_worthwhile_vectorizations=1 -fdump-tree-vect-details" } */
void
foo (char *restrict a, int *restrict b, int *restrict c,
new file mode 100644
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -march=armv8-a+sve -mmax-vectorization -ffreestanding -fdump-tree-vect-details" } */
+
+unsigned long f(const char *s)
+{
+ unsigned long i = 0;
+ while (*s++)
+ i++;
+ return i;
+}
+
+/* { dg-final { scan-tree-dump "vectorized 1 loops in function" "vect" } } */
@@ -2550,7 +2550,7 @@ start_over:
/* Check the costings of the loop make vectorizing worthwhile. */
res = vect_analyze_loop_costing (loop_vinfo, suggested_unroll_factor);
- if (res < 0)
+ if (res < 0 && !param_vect_allow_possibly_not_worthwhile_vectorizations)
{
ok = opt_result::failure_at (vect_location,
"Loop costings may not be worthwhile.\n");
@@ -4089,8 +4089,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
TODO: Consider assigning different costs to different scalar
statements. */
- scalar_single_iter_cost = (loop_vinfo->scalar_costs->total_cost ()
- * param_vect_scalar_cost_multiplier) / 100;
+ scalar_single_iter_cost = loop_vinfo->scalar_costs->total_cost ();
/* Add additional cost for the peeled instructions in prologue and epilogue
loop. (For fully-masked loops there will be no peeling.)
@@ -9792,8 +9792,7 @@ vect_bb_vectorization_profitable_p (bb_vec_info bb_vinfo,
while (si < li_scalar_costs.length ()
&& li_scalar_costs[si].first == sl);
scalar_target_cost_data->finish_cost (nullptr);
- scalar_cost = (scalar_target_cost_data->body_cost ()
- * param_vect_scalar_cost_multiplier) / 100;
+ scalar_cost = scalar_target_cost_data->body_cost ();
/* Complete the target-specific vector cost calculation. */
class vector_costs *vect_target_cost_data = init_cost (bb_vinfo, false);
@@ -10352,6 +10351,7 @@ vect_slp_region (vec<basic_block> bbs, vec<data_reference_p> datarefs,
dump_user_location_t saved_vect_location = vect_location;
vect_location = instance->location ();
if (!unlimited_cost_model (NULL)
+ && !param_vect_allow_possibly_not_worthwhile_vectorizations
&& !vect_bb_vectorization_profitable_p
(bb_vinfo, instance->subgraph_entries, orig_loop))
{