[rs6000] punish reload of lfiwzx when loading an int variable [PR102169, PR102146]

Message ID 5d0b68da-7198-5cda-df42-3cb3a168ce89@linux.ibm.com
State New
Headers
Series [rs6000] punish reload of lfiwzx when loading an int variable [PR102169, PR102146] |

Commit Message

HAO CHEN GUI Sept. 29, 2021, 8:32 a.m. UTC
  Hi,

   The patch punishes reload of alternative pair of "d, Z" for movsi_internal1. The reload occurs if 'Z' doesn't match and generates an additional insn. So the memory reload should be punished.

   Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot.


ChangeLog

2021-09-29 Haochen Gui <guihaoc@linux.ibm.com>

gcc/
         * gcc/config/rs6000/rs6000.md (movsi_internal1): disparages
         slightly the alternative 'Z' of "lfiwzx" when reload is needed.

patch.diff
  

Comments

HAO CHEN GUI Oct. 11, 2021, 5:46 a.m. UTC | #1
Hi,

     Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2021-September/580479.html

Thanks


On 29/9/2021 下午 4:32, HAO CHEN GUI wrote:
> Hi,
>
>   The patch punishes reload of alternative pair of "d, Z" for movsi_internal1. The reload occurs if 'Z' doesn't match and generates an additional insn. So the memory reload should be punished.
>
>   Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot.
>
>
> ChangeLog
>
> 2021-09-29 Haochen Gui <guihaoc@linux.ibm.com>
>
> gcc/
>         * gcc/config/rs6000/rs6000.md (movsi_internal1): disparages
>         slightly the alternative 'Z' of "lfiwzx" when reload is needed.
>
> patch.diff
>
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 6bec2bddbde..c961f2df4a7 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -7341,7 +7341,7 @@ (define_insn "*movsi_internal1"
>            r,          *h,         *h")
>         (match_operand:SI 1 "input_operand"
>           "r,          U,
> -          m,          Z,          Z,
> +          m,          ^Z,         Z,
>            r,          d,          v,
>            I,          L,          eI,         n,
>            wa,         O,          wM,         wB,
>
  
Segher Boessenkool Oct. 14, 2021, 12:12 a.m. UTC | #2
On Wed, Sep 29, 2021 at 04:32:19PM +0800, HAO CHEN GUI wrote:
>   The patch punishes reload of alternative pair of "d, Z" for 
> movsi_internal1. The reload occurs if 'Z' doesn't match and generates an 
> additional insn. So the memory reload should be punished.

As David says, why only for loads?  But also, why not for lxsiwzx (and
stxsiwx) as well?

But, what for all other uses of lfiwzx?  And lfiwax?

We need to find out why the register allocator considers it a good idea
to use FP regs here, and fix *that*?

The extra insn you talk about is because this insn only allows indexed
addressing ([reg+reg] or [reg] addressing).  That is true for very many
insns.  Reload (well, LRA in the modern world) should know about such
extra costs.  Does it not?

> gcc/
>         * gcc/config/rs6000/rs6000.md (movsi_internal1): disparages
>         slightly the alternative 'Z' of "lfiwzx" when reload is 
> needed.

"Disparage", no "s".  Changelog entries are written in the imperative.


Segher
  
HAO CHEN GUI Oct. 14, 2021, 8:37 a.m. UTC | #3
On 14/10/2021 上午 8:12, Segher Boessenkool wrote:
> On Wed, Sep 29, 2021 at 04:32:19PM +0800, HAO CHEN GUI wrote:
>>    The patch punishes reload of alternative pair of "d, Z" for
>> movsi_internal1. The reload occurs if 'Z' doesn't match and generates an
>> additional insn. So the memory reload should be punished.
> As David says, why only for loads?  But also, why not for lxsiwzx (and
> stxsiwx) as well?
>
> But, what for all other uses of lfiwzx?  And lfiwax?
>
> We need to find out why the register allocator considers it a good idea
> to use FP regs here, and fix *that*?

Let me explain why it uses FP regs.

In ira pass,  the cost of general, float and altivec registers are all zero. So it prefers 'GEN_OR_VSX_REGS' class and is finally assigned register vs32. Not sure if the altivec registers are preferable for 'GEN_OR_VSX_REGS'.

     r120: preferred GEN_OR_VSX_REGS, alternative NO_REGS, allocno GEN_OR_VSX_REGS

   a0(r120,l0) costs: BASE_REGS:0,0 GENERAL_REGS:0,0 FLOAT_REGS:0,0 ALTIVEC_REGS:0,0 VSX_REGS:4000,4000 GEN_OR_FLOAT_REGS:6000,6000 GEN_OR_VSX_REGS:6000,6000 LINK_REGS:12000,12000 CTR_REGS:12000,12000 LINK_OR_CTR_REGS:12000,12000 SPEC_OR_GEN_REGS:12000,12000 ALL_REGS:36000,36000 MEM:8000,8000

       Allocno a0r120 of GEN_OR_VSX_REGS(93) has 93 avail. regs  0 3-12 14-95, node:  0 3-12 14-95 (confl regs =  1-2 13 96-110)
       Forming thread from colorable bucket:
       Pushing a0(r120,l0)(cost 0)
       Popping a0(r120,l0)  --         assign reg 32
Disposition:
     0:r120 l0    32

In reload pass, the third alternative pair is "r,m" and the forth pair is "d,Z".  As the 'r' doesn't match the class of reg vs32, it needs a output reload and got a "reject++" as well as a "losers". For pair "d,Z", the second operands 'Z' doesn't match. It needs a input reload and got a "losers". But the memory reload is not punished if there is no addition modifies with the alternative. So it only got a "addr_losers". Overall cost of "r,m" is great than "d,Z". So it picked up FP register and 'lfiwzx' instruction.

             0 Non input pseudo reload: reject++
           alt=2,overall=7,losers=1,rld_nregs=1
           alt=3,overall=6,losers=1,rld_nregs=0

Gui Haochen

>
> The extra insn you talk about is because this insn only allows indexed
> addressing ([reg+reg] or [reg] addressing).  That is true for very many
> insns.  Reload (well, LRA in the modern world) should know about such
> extra costs.  Does it not?
>
>> gcc/
>>          * gcc/config/rs6000/rs6000.md (movsi_internal1): disparages
>>          slightly the alternative 'Z' of "lfiwzx" when reload is
>> needed.
> "Disparage", no "s".  Changelog entries are written in the imperative.
>
>
> Segher
  

Patch

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 6bec2bddbde..c961f2df4a7 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7341,7 +7341,7 @@  (define_insn "*movsi_internal1"
            r,          *h,         *h")
         (match_operand:SI 1 "input_operand"
           "r,          U,
-          m,          Z,          Z,
+          m,          ^Z,         Z,
            r,          d,          v,
            I,          L,          eI,         n,
            wa,         O,          wM,         wB,