LoongArch: Fix lo_sum rtx cost
Checks
Context | Check | Description
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 | success | Testing passed
linaro-tcwg-bot/tcwg_gcc_build--master-arm | success | Testing passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 | success | Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm | success | Testing passed
Commit Message
The cost of the lo_sum rtx for an addi.d instruction may be a very big number
if computed by the common function. This may cause some symbols to be saved to
and reloaded from the stack if there are not enough registers during loop
optimization.
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_rtx_costs): Add lo_sum cost.
---
gcc/config/loongarch/loongarch.cc | 4 ++++
1 file changed, 4 insertions(+)
Comments
Hi,
On 9/16/23 17:16, mengqinggang wrote:
> The cost of the lo_sum rtx for an addi.d instruction may be a very big number
> if computed by the common function. This may cause some symbols to be saved to
> and reloaded from the stack if there are not enough registers during loop
> optimization.
Thanks for the patch! It seems, though, that this change is meant to
improve some previously poor codegen, am I right? If so, it would be
appreciated if a minimal test case were attached, to ensure that the
codegen never regresses. (You can have your teammates help you if
you're not familiar with that.)
>
> gcc/ChangeLog:
>
> * config/loongarch/loongarch.cc (loongarch_rtx_costs): Add lo_sum cost.
> ---
> gcc/config/loongarch/loongarch.cc | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
> index 845fad5a8e8..0e57f09379c 100644
> --- a/gcc/config/loongarch/loongarch.cc
> +++ b/gcc/config/loongarch/loongarch.cc
> @@ -3648,6 +3648,10 @@ loongarch_rtx_costs (rtx x, machine_mode mode, int outer_code,
> *total = COSTS_N_INSNS (4);
> return false;
>
> + case LO_SUM:
> + *total = set_src_cost (XEXP (x, 0), mode, speed);
> + return true;
> +
In order for the code to be more maintainable, it may be better to
duplicate some of the change reasons here, just in case someone in the
future questions this piece of code that's without any explanation, and
regresses things (because there's no test case).
> case LT:
> case LTU:
> case LE:
On 2023/9/16 at 10:52 PM, WANG Xuerui wrote:
> Hi,
>
> On 9/16/23 17:16, mengqinggang wrote:
>> The cost of the lo_sum rtx for an addi.d instruction may be a very big
>> number if computed by the common function. This may cause some symbols
>> to be saved to and reloaded from the stack if there are not enough
>> registers during loop optimization.
>
> Thanks for the patch! It seems, though, that this change is meant to
> improve some previously poor codegen, am I right? If so, it would be
> appreciated if a minimal test case were attached, to ensure that the
> codegen never regresses. (You can have your teammates help you if
> you're not familiar with that.)
This is a performance problem that Meng Qinggang discovered while
debugging the SPEC benchmarks. A specific test case is not easy to
extract.
We will try to extract a simple test case that reproduces this optimization.
If we cannot, we will document the reasoning in the description instead.
Thanks!
>
>>
>> gcc/ChangeLog:
>>
>> * config/loongarch/loongarch.cc (loongarch_rtx_costs): Add lo_sum
>> cost.
>> ---
>> gcc/config/loongarch/loongarch.cc | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/gcc/config/loongarch/loongarch.cc
>> b/gcc/config/loongarch/loongarch.cc
>> index 845fad5a8e8..0e57f09379c 100644
>> --- a/gcc/config/loongarch/loongarch.cc
>> +++ b/gcc/config/loongarch/loongarch.cc
>> @@ -3648,6 +3648,10 @@ loongarch_rtx_costs (rtx x, machine_mode mode,
>> int outer_code,
>> *total = COSTS_N_INSNS (4);
>> return false;
>> + case LO_SUM:
>> + *total = set_src_cost (XEXP (x, 0), mode, speed);
>> + return true;
>> +
> In order for the code to be more maintainable, it may be better to
> duplicate some of the change reasons here, just in case someone in the
> future questions this piece of code that's without any explanation,
> and regresses things (because there's no test case).
>> case LT:
>> case LTU:
>> case LE:
@@ -3648,6 +3648,10 @@ loongarch_rtx_costs (rtx x, machine_mode mode, int outer_code,
       *total = COSTS_N_INSNS (4);
       return false;
 
+    case LO_SUM:
+      *total = set_src_cost (XEXP (x, 0), mode, speed);
+      return true;
+
     case LT:
     case LTU:
     case LE:
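
The effect described in the commit message can be illustrated with a toy cost model. This is only a sketch and not GCC code: `Rtx`, `Code`, `target_costs`, `make_lo_sum` and the symbol cost `kSymbolCost` are hypothetical stand-ins, and the generic walk is a simplified assumption of how a cost hook that returns false lets the common code add up operand costs. Only the `* 4` scale of `COSTS_N_INSNS` is taken from GCC.

```cpp
#include <memory>
#include <vector>

// Toy stand-ins for RTL codes; this is not GCC's real rtx_code enum.
enum class Code { Reg, SymbolRef, LoSum };

struct Rtx {
  Code code;
  std::vector<std::shared_ptr<Rtx>> ops;  // operands, if any
};

// Same scale as GCC's COSTS_N_INSNS macro: N instructions cost N * 4.
constexpr int costs_n_insns(int n) { return n * 4; }

// ASSUMPTION: an arbitrary "expensive" cost for a bare symbol operand.
constexpr int kSymbolCost = costs_n_insns(3);

int rtx_cost(const Rtx& x, bool lo_sum_fix);

// Model of a target cost hook: returning true means *total is final,
// returning false asks the generic code to add the operand costs.
bool target_costs(const Rtx& x, bool lo_sum_fix, int* total) {
  switch (x.code) {
    case Code::Reg:
      *total = 0;
      return true;
    case Code::SymbolRef:
      *total = kSymbolCost;
      return true;
    case Code::LoSum:
      if (!lo_sum_fix) return false;  // before the patch: no special case
      // After the patch: cost only operand 0 and stop the walk.
      *total = rtx_cost(*x.ops[0], lo_sum_fix);
      return true;
  }
  return false;
}

// Simplified generic walk: one insn by default, plus operand costs
// whenever the target hook declines to give a final answer.
int rtx_cost(const Rtx& x, bool lo_sum_fix) {
  int total = costs_n_insns(1);
  if (target_costs(x, lo_sum_fix, &total)) return total;
  for (const std::shared_ptr<Rtx>& op : x.ops) total += rtx_cost(*op, lo_sum_fix);
  return total;
}

// Builds (lo_sum (reg) (symbol_ref)) in the toy representation.
Rtx make_lo_sum() {
  auto reg = std::make_shared<Rtx>(Rtx{Code::Reg, {}});
  auto sym = std::make_shared<Rtx>(Rtx{Code::SymbolRef, {}});
  return Rtx{Code::LoSum, {reg, sym}};
}
```

In this model, `rtx_cost(make_lo_sum(), false)` includes the symbol operand's cost, while `rtx_cost(make_lo_sum(), true)` reduces to the cost of the register operand alone. The real numbers in GCC depend on the target's symbol costs, which this sketch does not attempt to reproduce.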