rs6000: Parameterize some const values for density test

Message ID b0a97dfc-870f-069a-992d-49f2942b36dc@linux.ibm.com
State New
Headers
Series rs6000: Parameterize some const values for density test |

Commit Message

Kewen.Lin Sept. 15, 2021, 8:52 a.m. UTC
  Hi,

This patch follows the discussion here[1], where Segher suggested
parameterizing those exact magic constants for density heuristics,
to make it easier to tweak if need.

Since these heuristics are quite internal, I make these parameters
as undocumented and be mainly used by developers.

The change here should be "No Functional Change".  But I verified
it with SPEC2017 at option sets O2-vect and Ofast-unroll on Power8,
the result is neutral as expected.

Bootstrapped and regress-tested on powerpc64le-linux-gnu Power9.

Is it ok for trunk?

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579121.html

BR,
Kewen
-----
gcc/ChangeLog:

	* config/rs6000/rs6000.opt (rs6000-density-pct-threshold,
	rs6000-density-size-threshold, rs6000-density-penalty,
	rs6000-density-load-pct-threshold,
	rs6000-density-load-num-threshold): New parameter.
	* config/rs6000/rs6000.c (rs6000_density_test): Adjust with
	corresponding parameters.

---
 gcc/config/rs6000/rs6000.c   | 22 +++++++---------------
 gcc/config/rs6000/rs6000.opt | 21 +++++++++++++++++++++
 2 files changed, 28 insertions(+), 15 deletions(-)

--
2.25.1
  

Comments

Li, Pan2 via Gcc-patches Sept. 17, 2021, 4:27 p.m. UTC | #1
Hi Kewen,

On 9/15/21 3:52 AM, Kewen.Lin wrote:
> Hi,
>
> This patch follows the discussion here[1], where Segher suggested
> parameterizing those exact magic constants for density heuristics,
> to make it easier to tweak if need.
>
> Since these heuristics are quite internal, I make these parameters
> as undocumented and be mainly used by developers.
>
> The change here should be "No Functional Change".  But I verified
> it with SPEC2017 at option sets O2-vect and Ofast-unroll on Power8,
> the result is neutral as expected.
>
> Bootstrapped and regress-tested on powerpc64le-linux-gnu Power9.
>
> Is it ok for trunk?
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579121.html
>
> BR,
> Kewen
> -----
> gcc/ChangeLog:
>
> 	* config/rs6000/rs6000.opt (rs6000-density-pct-threshold,
> 	rs6000-density-size-threshold, rs6000-density-penalty,
> 	rs6000-density-load-pct-threshold,
> 	rs6000-density-load-num-threshold): New parameter.
> 	* config/rs6000/rs6000.c (rs6000_density_test): Adjust with
> 	corresponding parameters.
>
> ---
>   gcc/config/rs6000/rs6000.c   | 22 +++++++---------------
>   gcc/config/rs6000/rs6000.opt | 21 +++++++++++++++++++++
>   2 files changed, 28 insertions(+), 15 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 9bc826e3a50..4ab23b0ab33 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -5284,9 +5284,6 @@ struct rs6000_cost_data
>   static void
>   rs6000_density_test (rs6000_cost_data *data)
>   {
> -  const int DENSITY_PCT_THRESHOLD = 85;
> -  const int DENSITY_SIZE_THRESHOLD = 70;
> -  const int DENSITY_PENALTY = 10;
>     struct loop *loop = data->loop_info;
>     basic_block *bbs = get_loop_body (loop);
>     int nbbs = loop->num_nodes;
> @@ -5322,26 +5319,21 @@ rs6000_density_test (rs6000_cost_data *data)
>     free (bbs);
>     density_pct = (vec_cost * 100) / (vec_cost + not_vec_cost);
>
> -  if (density_pct > DENSITY_PCT_THRESHOLD
> -      && vec_cost + not_vec_cost > DENSITY_SIZE_THRESHOLD)
> +  if (density_pct > rs6000_density_pct_threshold
> +      && vec_cost + not_vec_cost > rs6000_density_size_threshold)
>       {
> -      data->cost[vect_body] = vec_cost * (100 + DENSITY_PENALTY) / 100;
> +      data->cost[vect_body] = vec_cost * (100 + rs6000_density_penalty) / 100;
>         if (dump_enabled_p ())
>   	dump_printf_loc (MSG_NOTE, vect_location,
>   			 "density %d%%, cost %d exceeds threshold, penalizing "
> -			 "loop body cost by %d%%\n", density_pct,
> -			 vec_cost + not_vec_cost, DENSITY_PENALTY);
> +			 "loop body cost by %u%%\n", density_pct,
> +			 vec_cost + not_vec_cost, rs6000_density_penalty);
>       }
>
>     /* Check whether we need to penalize the body cost to account
>        for excess strided or elementwise loads.  */
>     if (data->extra_ctor_cost > 0)
>       {
> -      /* Threshold for load stmts percentage in all vectorized stmts.  */
> -      const int DENSITY_LOAD_PCT_THRESHOLD = 45;
> -      /* Threshold for total number of load stmts.  */
> -      const int DENSITY_LOAD_NUM_THRESHOLD = 20;
> -
>         gcc_assert (data->nloads <= data->nstmts);
>         unsigned int load_pct = (data->nloads * 100) / data->nstmts;
>
> @@ -5355,8 +5347,8 @@ rs6000_density_test (rs6000_cost_data *data)
>   	      the loads.
>   	 One typical case is the innermost loop of the hotspot of SPEC2017
>   	 503.bwaves_r without loop interchange.  */
> -      if (data->nloads > DENSITY_LOAD_NUM_THRESHOLD
> -	  && load_pct > DENSITY_LOAD_PCT_THRESHOLD)
> +      if (data->nloads > (unsigned int) rs6000_density_load_num_threshold
> +	  && load_pct > (unsigned int) rs6000_density_load_pct_threshold)
>   	{
>   	  data->cost[vect_body] += data->extra_ctor_cost;
>   	  if (dump_enabled_p ())
> diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
> index 0538db387dc..563983f3269 100644
> --- a/gcc/config/rs6000/rs6000.opt
> +++ b/gcc/config/rs6000/rs6000.opt
> @@ -639,3 +639,24 @@ Enable instructions that guard against return-oriented programming attacks.
>   mprivileged
>   Target Var(rs6000_privileged) Init(0)
>   Generate code that will run in privileged state.
> +
> +-param=rs6000-density-pct-threshold=
> +Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) Init(85) IntegerRange(0, 99) Param
> +When costing for loop vectorization, we probably need to penalize the loop body cost if the existing cost model may not adequately reflect delays from unavailable vector resources.  We collect the cost for vectorized statements and non-vectorized statements separately, check the proportion of vec_cost to total cost of vec_cost and non vec_cost, and penalize only if the proportion exceeds the threshold specified by this parameter.  The default value is 85.
> +
> +-param=rs6000-density-size-threshold=
> +Target Undocumented Joined UInteger Var(rs6000_density_size_threshold) Init(70) IntegerRange(0, 99) Param

I think 99 is not a sufficient upper bound.  This is a counting value 
that could in theory get much higher.  Can you set it to something 
ridiculous like IntegerRange(0, 1000)?

> +Like parameter rs6000-density-pct-threshold, we also check the total sum of vec_cost and non vec_cost, and penalize only if the sum exceeds the threshold specified by this parameter.  The default value is 70.
> +
> +-param=rs6000-density-penalty=
> +Target Undocumented Joined UInteger Var(rs6000_density_penalty) Init(10) IntegerRange(0, 1000) Param
> +When both heuristics with rs6000-density-pct-threshold and rs6000-density-size-threshold are satisfied, we decide to penalize the loop body cost by the value which is specified by this parameter.  The default value is 10.
> +
> +-param=rs6000-density-load-pct-threshold=
> +Target Undocumented Joined UInteger Var(rs6000_density_load_pct_threshold) Init(45) IntegerRange(0, 99) Param
> +When costing for loop vectorization, we probably need to penalize the loop body cost by accounting for excess strided or elementwise loads.  We collect the numbers for general statements and load statements according to the information for statement to be vectorized, check the proportion of load statements, and penalize only if the proportion exceeds the threshold specified by this parameter.  The default value is 45.

"according to the information for statement*s* to be vectorized"

Otherwise, these descriptions are quite excellent, by the way.

This looks good to me with those minor things addressed.  Recommend that 
maintainers approve.

Thanks for the patch!
Bill
> +
> +-param=rs6000-density-load-num-threshold=
> +Target Undocumented Joined UInteger Var(rs6000_density_load_num_threshold) Init(20) IntegerRange(0, 1000) Param
> +Like parameter rs6000-density-load-pct-threshold, we also check if the total number of load statements exceeds the threshold specified by this parameter, and penalize only if it's satisfied.  The default value is 20.
> +
> --
> 2.25.1
>
  
Segher Boessenkool Sept. 17, 2021, 10:15 p.m. UTC | #2
Hi!

On Fri, Sep 17, 2021 at 11:27:09AM -0500, Bill Schmidt wrote:
> On 9/15/21 3:52 AM, Kewen.Lin wrote:
> >+-param=rs6000-density-pct-threshold=
> >+Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) 
> >Init(85) IntegerRange(0, 99) Param
> >+When costing for loop vectorization, we probably need to penalize the 
> >loop body cost if the existing cost model may not adequately reflect 
> >delays from unavailable vector resources.  We collect the cost for 
> >vectorized statements and non-vectorized statements separately, check the 
> >proportion of vec_cost to total cost of vec_cost and non vec_cost, and 
> >penalize only if the proportion exceeds the threshold specified by this 
> >parameter.  The default value is 85.
> >+
> >+-param=rs6000-density-size-threshold=
> >+Target Undocumented Joined UInteger Var(rs6000_density_size_threshold) 
> >Init(70) IntegerRange(0, 99) Param
> 
> I think 99 is not a sufficient upper bound.  This is a counting value 
> that could in theory get much higher.  Can you set it to something 
> ridiculous like IntegerRange(0, 1000)?

It is a percentage.  (0,100) is the maximum that makes any sense :-)

It may be useful to make it a bit more sensitive than hundreds, but it
is a heuristic anyway, this will work fine.

But allowing 100 will be good.


Segher
  
Segher Boessenkool Sept. 17, 2021, 10:26 p.m. UTC | #3
Hi!

On Wed, Sep 15, 2021 at 04:52:49PM +0800, Kewen.Lin wrote:
> This patch follows the discussion here[1], where Segher suggested
> parameterizing those exact magic constants for density heuristics,
> to make it easier to tweak if need.
> 
> Since these heuristics are quite internal, I make these parameters
> as undocumented and be mainly used by developers.

Okido.

> +      if (data->nloads > (unsigned int) rs6000_density_load_num_threshold
> +	  && load_pct > (unsigned int) rs6000_density_load_pct_threshold)

Those variables are unsigned int already.  Don't cast please.

> +-param=rs6000-density-pct-threshold=
> +Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) Init(85) IntegerRange(0, 99) Param

So make this and all other percentages (0, 100) please.

> +When costing for loop vectorization, we probably need to penalize the loop body cost if the existing cost model may not adequately reflect delays from unavailable vector resources.  We collect the cost for vectorized statements and non-vectorized statements separately, check the proportion of vec_cost to total cost of vec_cost and non vec_cost, and penalize only if the proportion exceeds the threshold specified by this parameter.  The default value is 85.

It would be good if we can use line breaks in the source code for things
like this, but I don't think we can.  This message is mainly used for
"--help=param", and it is good there to have as short messages as you
can.  But given the nature of params you need quite a few words often,
and you do not want to say so little that things are no clear, either.

So, dunno :-)

Oksy for trunk with these fixes and what Bill mentioned in the other
thread.  Thanks!


Segher
  
Kewen.Lin Sept. 21, 2021, 3:27 a.m. UTC | #4
Hi Bill,

Thanks for the review!

on 2021/9/18 上午12:27, Bill Schmidt wrote:
> Hi Kewen,
> 
> On 9/15/21 3:52 AM, Kewen.Lin wrote:
>> Hi,
>>
>> This patch follows the discussion here[1], where Segher suggested
>> parameterizing those exact magic constants for density heuristics,
>> to make it easier to tweak if need.
>>
>> Since these heuristics are quite internal, I make these parameters
>> as undocumented and be mainly used by developers.
>>
>> The change here should be "No Functional Change".  But I verified
>> it with SPEC2017 at option sets O2-vect and Ofast-unroll on Power8,
>> the result is neutral as expected.
>>
>> Bootstrapped and regress-tested on powerpc64le-linux-gnu Power9.
>>
>> Is it ok for trunk?
>>
>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579121.html
>>
>> BR,
>> Kewen
>> -----
>> gcc/ChangeLog:
>>
>>     * config/rs6000/rs6000.opt (rs6000-density-pct-threshold,
>>     rs6000-density-size-threshold, rs6000-density-penalty,
>>     rs6000-density-load-pct-threshold,
>>     rs6000-density-load-num-threshold): New parameter.
>>     * config/rs6000/rs6000.c (rs6000_density_test): Adjust with
>>     corresponding parameters.
>>
>> ---
>>   gcc/config/rs6000/rs6000.c   | 22 +++++++---------------
>>   gcc/config/rs6000/rs6000.opt | 21 +++++++++++++++++++++
>>   2 files changed, 28 insertions(+), 15 deletions(-)
>>
>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>> index 9bc826e3a50..4ab23b0ab33 100644
>> --- a/gcc/config/rs6000/rs6000.c
>> +++ b/gcc/config/rs6000/rs6000.c
>> @@ -5284,9 +5284,6 @@ struct rs6000_cost_data
>>   static void
>>   rs6000_density_test (rs6000_cost_data *data)
>>   {
>> -  const int DENSITY_PCT_THRESHOLD = 85;
>> -  const int DENSITY_SIZE_THRESHOLD = 70;
>> -  const int DENSITY_PENALTY = 10;
>>     struct loop *loop = data->loop_info;
>>     basic_block *bbs = get_loop_body (loop);
>>     int nbbs = loop->num_nodes;
>> @@ -5322,26 +5319,21 @@ rs6000_density_test (rs6000_cost_data *data)
>>     free (bbs);
>>     density_pct = (vec_cost * 100) / (vec_cost + not_vec_cost);
>>
>> -  if (density_pct > DENSITY_PCT_THRESHOLD
>> -      && vec_cost + not_vec_cost > DENSITY_SIZE_THRESHOLD)
>> +  if (density_pct > rs6000_density_pct_threshold
>> +      && vec_cost + not_vec_cost > rs6000_density_size_threshold)
>>       {
>> -      data->cost[vect_body] = vec_cost * (100 + DENSITY_PENALTY) / 100;
>> +      data->cost[vect_body] = vec_cost * (100 + rs6000_density_penalty) / 100;
>>         if (dump_enabled_p ())
>>       dump_printf_loc (MSG_NOTE, vect_location,
>>                "density %d%%, cost %d exceeds threshold, penalizing "
>> -             "loop body cost by %d%%\n", density_pct,
>> -             vec_cost + not_vec_cost, DENSITY_PENALTY);
>> +             "loop body cost by %u%%\n", density_pct,
>> +             vec_cost + not_vec_cost, rs6000_density_penalty);
>>       }
>>
>>     /* Check whether we need to penalize the body cost to account
>>        for excess strided or elementwise loads.  */
>>     if (data->extra_ctor_cost > 0)
>>       {
>> -      /* Threshold for load stmts percentage in all vectorized stmts.  */
>> -      const int DENSITY_LOAD_PCT_THRESHOLD = 45;
>> -      /* Threshold for total number of load stmts.  */
>> -      const int DENSITY_LOAD_NUM_THRESHOLD = 20;
>> -
>>         gcc_assert (data->nloads <= data->nstmts);
>>         unsigned int load_pct = (data->nloads * 100) / data->nstmts;
>>
>> @@ -5355,8 +5347,8 @@ rs6000_density_test (rs6000_cost_data *data)
>>             the loads.
>>        One typical case is the innermost loop of the hotspot of SPEC2017
>>        503.bwaves_r without loop interchange.  */
>> -      if (data->nloads > DENSITY_LOAD_NUM_THRESHOLD
>> -      && load_pct > DENSITY_LOAD_PCT_THRESHOLD)
>> +      if (data->nloads > (unsigned int) rs6000_density_load_num_threshold
>> +      && load_pct > (unsigned int) rs6000_density_load_pct_threshold)
>>       {
>>         data->cost[vect_body] += data->extra_ctor_cost;
>>         if (dump_enabled_p ())
>> diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
>> index 0538db387dc..563983f3269 100644
>> --- a/gcc/config/rs6000/rs6000.opt
>> +++ b/gcc/config/rs6000/rs6000.opt
>> @@ -639,3 +639,24 @@ Enable instructions that guard against return-oriented programming attacks.
>>   mprivileged
>>   Target Var(rs6000_privileged) Init(0)
>>   Generate code that will run in privileged state.
>> +
>> +-param=rs6000-density-pct-threshold=
>> +Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) Init(85) IntegerRange(0, 99) Param
>> +When costing for loop vectorization, we probably need to penalize the loop body cost if the existing cost model may not adequately reflect delays from unavailable vector resources.  We collect the cost for vectorized statements and non-vectorized statements separately, check the proportion of vec_cost to total cost of vec_cost and non vec_cost, and penalize only if the proportion exceeds the threshold specified by this parameter.  The default value is 85.
>> +
>> +-param=rs6000-density-size-threshold=
>> +Target Undocumented Joined UInteger Var(rs6000_density_size_threshold) Init(70) IntegerRange(0, 99) Param
> 
> I think 99 is not a sufficient upper bound.  This is a counting value that could in theory get much higher.  Can you set it to something ridiculous like IntegerRange(0, 1000)?
> 

Thanks for catching!  It's a copy & paste typo.  :(  Will fix!
>> +Like parameter rs6000-density-pct-threshold, we also check the total sum of vec_cost and non vec_cost, and penalize only if the sum exceeds the threshold specified by this parameter.  The default value is 70.
>> +
>> +-param=rs6000-density-penalty=
>> +Target Undocumented Joined UInteger Var(rs6000_density_penalty) Init(10) IntegerRange(0, 1000) Param
>> +When both heuristics with rs6000-density-pct-threshold and rs6000-density-size-threshold are satisfied, we decide to penalize the loop body cost by the value which is specified by this parameter.  The default value is 10.
>> +
>> +-param=rs6000-density-load-pct-threshold=
>> +Target Undocumented Joined UInteger Var(rs6000_density_load_pct_threshold) Init(45) IntegerRange(0, 99) Param
>> +When costing for loop vectorization, we probably need to penalize the loop body cost by accounting for excess strided or elementwise loads.  We collect the numbers for general statements and load statements according to the information for statement to be vectorized, check the proportion of load statements, and penalize only if the proportion exceeds the threshold specified by this parameter.  The default value is 45.
> 
> "according to the information for statement*s* to be vectorized"
> 

Good catch, will fix.

> Otherwise, these descriptions are quite excellent, by the way.
> 
> This looks good to me with those minor things addressed.  Recommend that maintainers approve.
> 

Thanks again!

BR,
Kewen

> Thanks for the patch!
> Bill
>> +
>> +-param=rs6000-density-load-num-threshold=
>> +Target Undocumented Joined UInteger Var(rs6000_density_load_num_threshold) Init(20) IntegerRange(0, 1000) Param
>> +Like parameter rs6000-density-load-pct-threshold, we also check if the total number of load statements exceeds the threshold specified by this parameter, and penalize only if it's satisfied.  The default value is 20.
>> +
>> -- 
>> 2.25.1
>>
>
  
Kewen.Lin Sept. 21, 2021, 5:47 a.m. UTC | #5
on 2021/9/18 上午6:26, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Sep 15, 2021 at 04:52:49PM +0800, Kewen.Lin wrote:
>> This patch follows the discussion here[1], where Segher suggested
>> parameterizing those exact magic constants for density heuristics,
>> to make it easier to tweak if need.
>>
>> Since these heuristics are quite internal, I make these parameters
>> as undocumented and be mainly used by developers.
> 
> Okido.
> 
>> +      if (data->nloads > (unsigned int) rs6000_density_load_num_threshold
>> +	  && load_pct > (unsigned int) rs6000_density_load_pct_threshold)
> 
> Those variables are unsigned int already.  Don't cast please.
> 

Unfortunately this is required by bootstrapping.  The UInteger for the
param definition is really confusing, in the underlying implementation
it's still "signed".  If you grep "(unsigned) param", you can see a few
examples.  I guess the "UInteger" is mainly for the param value range
checking.

>> +-param=rs6000-density-pct-threshold=
>> +Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) Init(85) IntegerRange(0, 99) Param
> 
> So make this and all other percentages (0, 100) please.
> 

I thought 99 is enough for the RHS in ">". just realized it's more clear
with 100.  Will fix!

>> +When costing for loop vectorization, we probably need to penalize the loop body cost if the existing cost model may not adequately reflect delays from unavailable vector resources.  We collect the cost for vectorized statements and non-vectorized statements separately, check the proportion of vec_cost to total cost of vec_cost and non vec_cost, and penalize only if the proportion exceeds the threshold specified by this parameter.  The default value is 85.
> 
> It would be good if we can use line breaks in the source code for things
> like this, but I don't think we can.  This message is mainly used for
> "--help=param", and it is good there to have as short messages as you
> can.  But given the nature of params you need quite a few words often,
> and you do not want to say so little that things are no clear, either.
> 
> So, dunno :-)

I did some testings, the line breaks writing can still survive in the
"--help=param" show, the lines are concatenated with " ".  Although
there seems no this kind of writing practices, I am guessing you want
me to do line breaks for their descriptions?  If so, I will make them
short as the above "Target Undocumented..." line.  Or do you want it
to align source code ColumnLimit 80 (for these cases, it would look
shorter)?

> 
> Oksy for trunk with these fixes and what Bill mentioned in the other
> thread.  Thanks!
> 

OK, thanks again!

BR,
Kewen
  
Segher Boessenkool Sept. 21, 2021, 12:03 p.m. UTC | #6
Hi!

On Tue, Sep 21, 2021 at 01:47:19PM +0800, Kewen.Lin wrote:
> on 2021/9/18 上午6:26, Segher Boessenkool wrote:
> >> +      if (data->nloads > (unsigned int) rs6000_density_load_num_threshold
> >> +	  && load_pct > (unsigned int) rs6000_density_load_pct_threshold)
> > 
> > Those variables are unsigned int already.  Don't cast please.
> 
> Unfortunately this is required by bootstrapping.  The UInteger for the
> param definition is really confusing, in the underlying implementation
> it's still "signed".  If you grep "(unsigned) param", you can see a few
> examples.  I guess the "UInteger" is mainly for the param value range
> checking.

Huh, I see.  Is that a bug?  It certainly is surprising!  Please open a
PR if you think it could/should be improved, put me on Cc:?

> >> +-param=rs6000-density-pct-threshold=
> >> +Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) Init(85) IntegerRange(0, 99) Param
> > 
> > So make this and all other percentages (0, 100) please.
> 
> I thought 99 is enough for the RHS in ">". just realized it's more clear
> with 100.  Will fix!

99 will work fine, but it's not the best choice for the user, who will
expect that a percentage can be anything from 0% to 100%.

> >> +When costing for loop vectorization, we probably need to penalize the loop body cost if the existing cost model may not adequately reflect delays from unavailable vector resources.  We collect the cost for vectorized statements and non-vectorized statements separately, check the proportion of vec_cost to total cost of vec_cost and non vec_cost, and penalize only if the proportion exceeds the threshold specified by this parameter.  The default value is 85.
> > 
> > It would be good if we can use line breaks in the source code for things
> > like this, but I don't think we can.  This message is mainly used for
> > "--help=param", and it is good there to have as short messages as you
> > can.  But given the nature of params you need quite a few words often,
> > and you do not want to say so little that things are no clear, either.
> > 
> > So, dunno :-)
> 
> I did some testings, the line breaks writing can still survive in the
> "--help=param" show, the lines are concatenated with " ".  Although
> there seems no this kind of writing practices, I am guessing you want
> me to do line breaks for their descriptions?  If so, I will make them
> short as the above "Target Undocumented..." line.  Or do you want it
> to align source code ColumnLimit 80 (for these cases, it would look
> shorter)?

It would help if was more readable in the surce code, one line of close
to 500 columns is not very manageable :-)

But the thing that matters is what it will look like in the --help=
output (and/or the manual).


Segher
  
Kewen.Lin Sept. 22, 2021, 5:17 a.m. UTC | #7
on 2021/9/21 下午8:03, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Sep 21, 2021 at 01:47:19PM +0800, Kewen.Lin wrote:
>> on 2021/9/18 上午6:26, Segher Boessenkool wrote:
>>>> +      if (data->nloads > (unsigned int) rs6000_density_load_num_threshold
>>>> +	  && load_pct > (unsigned int) rs6000_density_load_pct_threshold)
>>>
>>> Those variables are unsigned int already.  Don't cast please.
>>
>> Unfortunately this is required by bootstrapping.  The UInteger for the
>> param definition is really confusing, in the underlying implementation
>> it's still "signed".  If you grep "(unsigned) param", you can see a few
>> examples.  I guess the "UInteger" is mainly for the param value range
>> checking.
> 
> Huh, I see.  Is that a bug?  It certainly is surprising!  Please open a
> PR if you think it could/should be improved, put me on Cc:?
> 

I guessed it's not a bug, "UInteger" is more for the opt/param value range
checking, but could be improved.  PR102440 filed as you suggested.  :)

>>>> +-param=rs6000-density-pct-threshold=
>>>> +Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) Init(85) IntegerRange(0, 99) Param
>>>
>>> So make this and all other percentages (0, 100) please.
>>
>> I thought 99 is enough for the RHS in ">". just realized it's more clear
>> with 100.  Will fix!
> 
> 99 will work fine, but it's not the best choice for the user, who will
> expect that a percentage can be anything from 0% to 100%.
> 
>>>> +When costing for loop vectorization, we probably need to penalize the loop body cost if the existing cost model may not adequately reflect delays from unavailable vector resources.  We collect the cost for vectorized statements and non-vectorized statements separately, check the proportion of vec_cost to total cost of vec_cost and non vec_cost, and penalize only if the proportion exceeds the threshold specified by this parameter.  The default value is 85.
>>>
>>> It would be good if we can use line breaks in the source code for things
>>> like this, but I don't think we can.  This message is mainly used for
>>> "--help=param", and it is good there to have as short messages as you
>>> can.  But given the nature of params you need quite a few words often,
>>> and you do not want to say so little that things are no clear, either.
>>>
>>> So, dunno :-)
>>
>> I did some testings, the line breaks writing can still survive in the
>> "--help=param" show, the lines are concatenated with " ".  Although
>> there seems no this kind of writing practices, I am guessing you want
>> me to do line breaks for their descriptions?  If so, I will make them
>> short as the above "Target Undocumented..." line.  Or do you want it
>> to align source code ColumnLimit 80 (for these cases, it would look
>> shorter)?
> 
> It would help if was more readable in the surce code, one line of close
> to 500 columns is not very manageable :-)
> 
> But the thing that matters is what it will look like in the --help=
> output (and/or the manual).
> 

OK, I've used ColumnLimit 80 for that.  The outputs in --help= before/after
the line breaks look the same (smoother than what I can expect).  :)

Committed in r12-3767, thanks!

BR,
Kewen
  

Patch

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 9bc826e3a50..4ab23b0ab33 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -5284,9 +5284,6 @@  struct rs6000_cost_data
 static void
 rs6000_density_test (rs6000_cost_data *data)
 {
-  const int DENSITY_PCT_THRESHOLD = 85;
-  const int DENSITY_SIZE_THRESHOLD = 70;
-  const int DENSITY_PENALTY = 10;
   struct loop *loop = data->loop_info;
   basic_block *bbs = get_loop_body (loop);
   int nbbs = loop->num_nodes;
@@ -5322,26 +5319,21 @@  rs6000_density_test (rs6000_cost_data *data)
   free (bbs);
   density_pct = (vec_cost * 100) / (vec_cost + not_vec_cost);

-  if (density_pct > DENSITY_PCT_THRESHOLD
-      && vec_cost + not_vec_cost > DENSITY_SIZE_THRESHOLD)
+  if (density_pct > rs6000_density_pct_threshold
+      && vec_cost + not_vec_cost > rs6000_density_size_threshold)
     {
-      data->cost[vect_body] = vec_cost * (100 + DENSITY_PENALTY) / 100;
+      data->cost[vect_body] = vec_cost * (100 + rs6000_density_penalty) / 100;
       if (dump_enabled_p ())
 	dump_printf_loc (MSG_NOTE, vect_location,
 			 "density %d%%, cost %d exceeds threshold, penalizing "
-			 "loop body cost by %d%%\n", density_pct,
-			 vec_cost + not_vec_cost, DENSITY_PENALTY);
+			 "loop body cost by %u%%\n", density_pct,
+			 vec_cost + not_vec_cost, rs6000_density_penalty);
     }

   /* Check whether we need to penalize the body cost to account
      for excess strided or elementwise loads.  */
   if (data->extra_ctor_cost > 0)
     {
-      /* Threshold for load stmts percentage in all vectorized stmts.  */
-      const int DENSITY_LOAD_PCT_THRESHOLD = 45;
-      /* Threshold for total number of load stmts.  */
-      const int DENSITY_LOAD_NUM_THRESHOLD = 20;
-
       gcc_assert (data->nloads <= data->nstmts);
       unsigned int load_pct = (data->nloads * 100) / data->nstmts;

@@ -5355,8 +5347,8 @@  rs6000_density_test (rs6000_cost_data *data)
 	      the loads.
 	 One typical case is the innermost loop of the hotspot of SPEC2017
 	 503.bwaves_r without loop interchange.  */
-      if (data->nloads > DENSITY_LOAD_NUM_THRESHOLD
-	  && load_pct > DENSITY_LOAD_PCT_THRESHOLD)
+      if (data->nloads > (unsigned int) rs6000_density_load_num_threshold
+	  && load_pct > (unsigned int) rs6000_density_load_pct_threshold)
 	{
 	  data->cost[vect_body] += data->extra_ctor_cost;
 	  if (dump_enabled_p ())
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 0538db387dc..563983f3269 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -639,3 +639,24 @@  Enable instructions that guard against return-oriented programming attacks.
 mprivileged
 Target Var(rs6000_privileged) Init(0)
 Generate code that will run in privileged state.
+
+-param=rs6000-density-pct-threshold=
+Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) Init(85) IntegerRange(0, 99) Param
+When costing for loop vectorization, we probably need to penalize the loop body cost if the existing cost model may not adequately reflect delays from unavailable vector resources.  We collect the cost for vectorized statements and non-vectorized statements separately, check the proportion of vec_cost to total cost of vec_cost and non vec_cost, and penalize only if the proportion exceeds the threshold specified by this parameter.  The default value is 85.
+
+-param=rs6000-density-size-threshold=
+Target Undocumented Joined UInteger Var(rs6000_density_size_threshold) Init(70) IntegerRange(0, 99) Param
+Like parameter rs6000-density-pct-threshold, we also check the total sum of vec_cost and non vec_cost, and penalize only if the sum exceeds the threshold specified by this parameter.  The default value is 70.
+
+-param=rs6000-density-penalty=
+Target Undocumented Joined UInteger Var(rs6000_density_penalty) Init(10) IntegerRange(0, 1000) Param
+When both heuristics with rs6000-density-pct-threshold and rs6000-density-size-threshold are satisfied, we decide to penalize the loop body cost by the value which is specified by this parameter.  The default value is 10.
+
+-param=rs6000-density-load-pct-threshold=
+Target Undocumented Joined UInteger Var(rs6000_density_load_pct_threshold) Init(45) IntegerRange(0, 99) Param
+When costing for loop vectorization, we probably need to penalize the loop body cost by accounting for excess strided or elementwise loads.  We collect the numbers for general statements and load statements according to the information for statement to be vectorized, check the proportion of load statements, and penalize only if the proportion exceeds the threshold specified by this parameter.  The default value is 45.
+
+-param=rs6000-density-load-num-threshold=
+Target Undocumented Joined UInteger Var(rs6000_density_load_num_threshold) Init(20) IntegerRange(0, 1000) Param
+Like parameter rs6000-density-load-pct-threshold, we also check if the total number of load statements exceeds the threshold specified by this parameter, and penalize only if it's satisfied.  The default value is 20.
+