LoongArch: Fix lo_sum rtx cost

Message ID 20230916091643.3160525-1-mengqinggang@loongson.cn
State New
Series LoongArch: Fix lo_sum rtx cost

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gcc_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm success Testing passed

Commit Message

mengqinggang Sept. 16, 2023, 9:16 a.m. UTC
  The cost of a lo_sum rtx for the addi.d instruction may be very large when
computed by the common function. This can cause some symbols to be spilled to
the stack and reloaded if there are not enough registers during loop optimization.

gcc/ChangeLog:

	* config/loongarch/loongarch.cc (loongarch_rtx_costs): Add lo_sum cost.
---
 gcc/config/loongarch/loongarch.cc | 4 ++++
 1 file changed, 4 insertions(+)
  

Comments

WANG Xuerui Sept. 16, 2023, 2:52 p.m. UTC | #1
Hi,

On 9/16/23 17:16, mengqinggang wrote:
> The cost of a lo_sum rtx for the addi.d instruction may be very large when
> computed by the common function. This can cause some symbols to be spilled to
> the stack and reloaded if there are not enough registers during loop optimization.

Thanks for the patch! It seems, though, that this change is meant to
improve some previously pathetic codegen, am I right? If so, a minimal
test case would be appreciated, to ensure that the codegen never
regresses. (You can have your teammates help you if you're not familiar
with that.)

>
> gcc/ChangeLog:
>
> 	* config/loongarch/loongarch.cc (loongarch_rtx_costs): Add lo_sum cost.
> ---
>   gcc/config/loongarch/loongarch.cc | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
> index 845fad5a8e8..0e57f09379c 100644
> --- a/gcc/config/loongarch/loongarch.cc
> +++ b/gcc/config/loongarch/loongarch.cc
> @@ -3648,6 +3648,10 @@ loongarch_rtx_costs (rtx x, machine_mode mode, int outer_code,
>   	*total = COSTS_N_INSNS (4);
>         return false;
>   
> +    case LO_SUM:
> +      *total = set_src_cost (XEXP (x, 0), mode, speed);
> +      return true;
> +
To keep the code maintainable, it may be better to repeat some of the
rationale for the change in a comment here, in case someone later
questions this otherwise unexplained piece of code and regresses
things (because there's no test case).
>       case LT:
>       case LTU:
>       case LE:
  
Lulu Cheng Sept. 17, 2023, 1:42 a.m. UTC | #2
On 2023/9/16 at 10:52 PM, WANG Xuerui wrote:
> Hi,
>
> On 9/16/23 17:16, mengqinggang wrote:
>> The cost of a lo_sum rtx for the addi.d instruction may be very large when
>> computed by the common function. This can cause some symbols to be spilled to
>> the stack and reloaded if there are not enough registers during loop optimization.
>
> Thanks for the patch! It seems, though, that this change is meant to
> improve some previously pathetic codegen, am I right? If so, a minimal
> test case would be appreciated, to ensure that the codegen never
> regresses. (You can have your teammates help you if you're not familiar
> with that.)

This is a performance problem that Meng Qinggang discovered while
debugging SPEC. The specific test cases are not easy to extract.

We will try to extract a simple test case that reproduces the issue.
If we cannot, we will add a description explaining the change instead.

Thanks!

>
>>
>> gcc/ChangeLog:
>>
>>     * config/loongarch/loongarch.cc (loongarch_rtx_costs): Add lo_sum cost.
>> ---
>>   gcc/config/loongarch/loongarch.cc | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
>> index 845fad5a8e8..0e57f09379c 100644
>> --- a/gcc/config/loongarch/loongarch.cc
>> +++ b/gcc/config/loongarch/loongarch.cc
>> @@ -3648,6 +3648,10 @@ loongarch_rtx_costs (rtx x, machine_mode mode, int outer_code,
>>       *total = COSTS_N_INSNS (4);
>>         return false;
>>
>> +    case LO_SUM:
>> +      *total = set_src_cost (XEXP (x, 0), mode, speed);
>> +      return true;
>> +
> To keep the code maintainable, it may be better to repeat some of the
> rationale for the change in a comment here, in case someone later
> questions this otherwise unexplained piece of code and regresses
> things (because there's no test case).
>>       case LT:
>>       case LTU:
>>       case LE:
  

Patch

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 845fad5a8e8..0e57f09379c 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -3648,6 +3648,10 @@  loongarch_rtx_costs (rtx x, machine_mode mode, int outer_code,
 	*total = COSTS_N_INSNS (4);
       return false;
 
+    case LO_SUM:
+      *total = set_src_cost (XEXP (x, 0), mode, speed);
+      return true;
+
     case LT:
     case LTU:
     case LE:
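
For readers unfamiliar with the cost hook, here is a toy model of the effect this patch aims for. It is only a sketch and not GCC code: `Expr`, `target_costs`, `expr_cost`, `lo_sum_cost`, and the symbol cost `kSymbolInsns` are all hypothetical names and figures chosen for illustration. The one convention borrowed from GCC is that a target cost hook returns true when `*total` is final and false when the generic code should go on to add the operand costs. Registers are treated as free in this model.

```cpp
#include <cassert>

// Toy stand-ins for RTL expressions; these are NOT GCC's real types.
enum class Code { Reg, SymbolRef, LoSum };

struct Expr {
  Code code;
  const Expr *op0;  // first operand, or nullptr
  const Expr *op1;  // second operand, or nullptr
};

// Same scaling as GCC's COSTS_N_INSNS macro: N instructions cost N * 4.
constexpr int costs_n_insns(int n) { return n * 4; }

// Hypothetical figure: assume a bare symbol address costs 3 instructions.
constexpr int kSymbolInsns = 3;

int expr_cost(const Expr &x, bool patched);

// Model of a target cost hook. Returning true means *total is final;
// returning false asks the generic walker to add the operand costs.
bool target_costs(const Expr &x, bool patched, int *total) {
  switch (x.code) {
    case Code::SymbolRef:
      *total = costs_n_insns(kSymbolInsns);
      return true;
    case Code::LoSum:
      if (patched) {
        // Patched behaviour: charge only for operand 0, ignore the symbol.
        assert(x.op0 != nullptr);
        *total = expr_cost(*x.op0, patched);
        return true;
      }
      return false;  // unpatched: fall back to the generic summation
    default:
      return false;
  }
}

// Model of the generic cost walker.
int expr_cost(const Expr &x, bool patched) {
  if (x.code == Code::Reg) return 0;  // registers are free in this model
  int total = costs_n_insns(1);       // default: one instruction
  if (target_costs(x, patched, &total)) return total;
  if (x.op0) total += expr_cost(*x.op0, patched);
  if (x.op1) total += expr_cost(*x.op1, patched);
  return total;
}

// Cost of (lo_sum (reg) (symbol_ref)) with and without the LO_SUM case.
int lo_sum_cost(bool patched) {
  const Expr reg{Code::Reg, nullptr, nullptr};
  const Expr sym{Code::SymbolRef, nullptr, nullptr};
  const Expr lo_sum{Code::LoSum, &reg, &sym};
  return expr_cost(lo_sum, patched);
}
```

Under these assumed numbers, `lo_sum_cost(false)` is 16 (one instruction plus the symbol) and `lo_sum_cost(true)` is 0, which mirrors why a dedicated LO_SUM case can make the expression look much cheaper to later passes. The real costs depend on the target's handling of symbols and are not reproduced here.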