rtl-optimization: ppc backend generates unnecessary signed extension.

Message ID 4c0b6b4f-bbc1-8dd0-a91c-ffed028b4873@linux.ibm.com
State New
Headers
Series rtl-optimization: ppc backend generates unnecessary signed extension. |

Commit Message

Ajit Agarwal March 23, 2023, 10:38 a.m. UTC
  Hello All:

This patch removed unnecessary signed extension elimination in ree pass.
Bootstrapped and regtested on powerpc64-linux-gnu.


Thanks & Regards
Ajit

	rtl-optimization: ppc backend generates unnecessary signed extension.

	Eliminate unnecessary redundant signed extension.

	2023-03-23  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>

gcc/ChangeLog:

	* ree.cc: Modification for  AND opcode support to eliminate
	unnecessary signed extension.
	* testsuite/g++.target/powerpc/sext-elim.C: New tests.
---
 gcc/ree.cc                                   | 24 +++++++++++++++++---
 gcc/testsuite/g++.target/powerpc/sext-elim.C | 19 ++++++++++++++++
 2 files changed, 40 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/powerpc/sext-elim.C
  

Comments

Peter Bergner March 23, 2023, 12:38 p.m. UTC | #1
On 3/23/23 5:38 AM, Ajit Agarwal wrote:
> This patch removed unnecessary signed extension elimination in ree pass.
> Bootstrapped and regtested on powerpc64-linux-gnu.
> 
> 
> Thanks & Regards
> Ajit
> 
> 	rtl-optimization: ppc backend generates unnecessary signed extension.
> 
> 	Eliminate unnecessary redundant signed extension.
> 
> 	2023-03-23  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>
> 
> gcc/ChangeLog:
> 
> 	* ree.cc: Modification for  AND opcode support to eliminate
> 	unnecessary signed extension.
> 	* testsuite/g++.target/powerpc/sext-elim.C: New tests.

Not a review of the patch, but we talked offline about other bugzillas
regarding unnecessary sign and zero extensions.  Doing a quick scan, I
see the following bugs.  Please have a look at 1) whether these are
still a problem with unpatched trunk, and if they are, 2) whether your
patch fixes them or could fix them.  Thanks.

    https://gcc.gnu.org/PR41742
    https://gcc.gnu.org/PR65010
    https://gcc.gnu.org/PR82940
    https://gcc.gnu.org/PR107949

Peter
  
Jeff Law March 23, 2023, 1:47 p.m. UTC | #2
On 3/23/23 04:38, Ajit Agarwal wrote:
> 
> Hello All:
> 
> This patch removed unnecessary signed extension elimination in ree pass.
> Bootstrapped and regtested on powerpc64-linux-gnu.
> 
> 
> Thanks & Regards
> Ajit
> 
> 	rtl-optimization: ppc backend generates unnecessary signed extension.
> 
> 	Eliminate unnecessary redundant signed extension.
> 
> 	2023-03-23  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>
> 
> gcc/ChangeLog:
> 
> 	* ree.cc: Modification for  AND opcode support to eliminate
> 	unnecessary signed extension.
> 	* testsuite/g++.target/powerpc/sext-elim.C: New tests.
Just a note.  I'll look at this once the trunk is open for gcc-14 
development.  It's really not appropriate for gcc-13.

jeff
  
Ajit Agarwal March 23, 2023, 3:32 p.m. UTC | #3
Hello Peter:

On 23/03/23 6:08 pm, Peter Bergner wrote:
> On 3/23/23 5:38 AM, Ajit Agarwal wrote:
>> This patch removed unnecessary signed extension elimination in ree pass.
>> Bootstrapped and regtested on powerpc64-linux-gnu.
>>
>>
>> Thanks & Regards
>> Ajit
>>
>> 	rtl-optimization: ppc backend generates unnecessary signed extension.
>>
>> 	Eliminate unnecessary redundant signed extension.
>>
>> 	2023-03-23  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>
>>
>> gcc/ChangeLog:
>>
>> 	* ree.cc: Modification for  AND opcode support to eliminate
>> 	unnecessary signed extension.
>> 	* testsuite/g++.target/powerpc/sext-elim.C: New tests.
> 
> Not a review of the patch, but we talked offline about other bugzillas
> regarding unnecessary sign and zero extensions.  Doing a quick scan, I
> see the following bugs.  Please have a look at 1) whether these are
> still a problem with unpatched trunk, and if they are, 2) whether your
> patch fixes them or could fix them.  Thanks.
> 
>     https://gcc.gnu.org/PR41742

These are not addressed in the trunk patch, because int c is not initialized with registers and for this reason we cannot eliminate them. If we initialize int c then zero extension goes away.

>     https://gcc.gnu.org/PR65010
>     https://gcc.gnu.org/PR82940
>     https://gcc.gnu.org/PR107949
>

My patch fixes these PR's which were not fixed in trunk patch.

Thanks & Regards
Ajit
 
> Peter
>
  
Ajit Agarwal March 23, 2023, 3:32 p.m. UTC | #4
Hello Peter:

On 23/03/23 6:08 pm, Peter Bergner wrote:
> On 3/23/23 5:38 AM, Ajit Agarwal wrote:
>> This patch removed unnecessary signed extension elimination in ree pass.
>> Bootstrapped and regtested on powerpc64-linux-gnu.
>>
>>
>> Thanks & Regards
>> Ajit
>>
>> 	rtl-optimization: ppc backend generates unnecessary signed extension.
>>
>> 	Eliminate unnecessary redundant signed extension.
>>
>> 	2023-03-23  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>
>>
>> gcc/ChangeLog:
>>
>> 	* ree.cc: Modification for  AND opcode support to eliminate
>> 	unnecessary signed extension.
>> 	* testsuite/g++.target/powerpc/sext-elim.C: New tests.
> 
> Not a review of the patch, but we talked offline about other bugzillas
> regarding unnecessary sign and zero extensions.  Doing a quick scan, I
> see the following bugs.  Please have a look at 1) whether these are
> still a problem with unpatched trunk, and if they are, 2) whether your
> patch fixes them or could fix them.  Thanks.
> 
>     https://gcc.gnu.org/PR41742

These are not addressed in the trunk patch, because int c is not initialized with registers and for this reason we cannot eliminate them. If we initialize int c then zero extension goes away.

>     https://gcc.gnu.org/PR65010
>     https://gcc.gnu.org/PR82940
>     https://gcc.gnu.org/PR107949
>

My patch fixes these PR's which were not fixed in trunk patch.

Thanks & Regards
Ajit
 
> Peter
>
  
Ajit Agarwal March 23, 2023, 3:36 p.m. UTC | #5
On 23/03/23 7:17 pm, Jeff Law wrote:
> 
> 
> On 3/23/23 04:38, Ajit Agarwal wrote:
>>
>> Hello All:
>>
>> This patch removed unnecessary signed extension elimination in ree pass.
>> Bootstrapped and regtested on powerpc64-linux-gnu.
>>
>>
>> Thanks & Regards
>> Ajit
>>
>>     rtl-optimization: ppc backend generates unnecessary signed extension.
>>
>>     Eliminate unnecessary redundant signed extension.
>>
>>     2023-03-23  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>
>>
>> gcc/ChangeLog:
>>
>>     * ree.cc: Modification for  AND opcode support to eliminate
>>     unnecessary signed extension.
>>     * testsuite/g++.target/powerpc/sext-elim.C: New tests.
> Just a note.  I'll look at this once the trunk is open for gcc-14 development.  It's really not appropriate for gcc-13.

Thanks Jeff.
> 
> jeff
  
Peter Bergner March 23, 2023, 4:29 p.m. UTC | #6
On 3/23/23 8:47 AM, Jeff Law wrote:
> On 3/23/23 04:38, Ajit Agarwal wrote:
>>     * ree.cc: Modification for  AND opcode support to eliminate
>>     unnecessary signed extension.
>>     * testsuite/g++.target/powerpc/sext-elim.C: New tests.
> Just a note.  I'll look at this once the trunk is open for gcc-14 development.
> It's really not appropriate for gcc-13.

Hi Jeff, yes, we agree 100% that this is stage1 material!  I'm sorry if
this wasn't clear. 



>>     https://gcc.gnu.org/PR41742
> 
> These are not addressed in the trunk patch, because int c is not initialized
> with registers and for this reason we cannot eliminate them. If we initialize
> int c then zero extension goes away.

I'm sorry that I don't know how REE works.  Why can't it optimize this?
I see in the REE dump:

(insn 20 18 22 3 (set (reg:DI 4 4)
                      (zero_extend:DI (reg:QI 4 4 [orig:120 cD.3556+3 ] [120]))) "pr41742.c":6:41 8 {zero_extendqidi2} (nil))
(call_insn 22 20 41 3 (parallel [
            (set (reg:DI 3 3)
                 (call (mem:SI (symbol_ref:DI ("memset") [flags 0x41]  <function_decl 0x7fff925f8400 __builtin_memset>) [0 memsetD.1196 S4 A8])
                    (const_int 64 [0x40])))
            (use (const_int 0 [0]))
            (clobber (reg:DI 96 lr)) ...

Is there a reason why REE cannot see that our (reg:QI 4) is a param register
and thus due to our ABI, already correctly sign/zero extended?



>>     https://gcc.gnu.org/PR65010
>>     https://gcc.gnu.org/PR82940
>>     https://gcc.gnu.org/PR107949
>>
> 
> My patch fixes these PR's which were not fixed in trunk patch.

Great!  Once this goes is, please include these PR #s in your commit log
and mark the PRs as RESOLVED/FIXED.

That said, I see we don't enable -free at -O2 and above like other
architectures do, so we'll get no benefit without explicitly adding -free:

./gcc/common/config/riscv/riscv-common.cc:    { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
./gcc/common/config/aarch64/aarch64-common.cc:    { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
./gcc/common/config/h8300/h8300-common.cc:    { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
./gcc/common/config/i386/i386-common.cc:    { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
./gcc/common/config/sparc/sparc-common.cc:    { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
./gcc/common/config/alpha/alpha-common.cc:    { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },

...maybe we should enable it too (in a separate patch) once yours goes in now
that it will actually do something for us?  Thoughts?

I'll note the docs/man page only mention x86, Aarch64 and Alpha enabling REE at
-O2 and above, but clearly others have been added since, so if we enable REE at
-O2 and above, we should fix that too.

Peter
  
Jeff Law March 23, 2023, 4:32 p.m. UTC | #7
On 3/23/23 10:29, Peter Bergner wrote:

>>>      https://gcc.gnu.org/PR41742
>>
>> These are not addressed in the trunk patch, because int c is not initialized
>> with registers and for this reason we cannot eliminate them. If we initialize
>> int c then zero extension goes away.
> 
> I'm sorry that I don't know how REE works.  Why can't it optimize this?
> I see in the REE dump:
> 
> (insn 20 18 22 3 (set (reg:DI 4 4)
>                        (zero_extend:DI (reg:QI 4 4 [orig:120 cD.3556+3 ] [120]))) "pr41742.c":6:41 8 {zero_extendqidi2} (nil))
> (call_insn 22 20 41 3 (parallel [
>              (set (reg:DI 3 3)
>                   (call (mem:SI (symbol_ref:DI ("memset") [flags 0x41]  <function_decl 0x7fff925f8400 __builtin_memset>) [0 memsetD.1196 S4 A8])
>                      (const_int 64 [0x40])))
>              (use (const_int 0 [0]))
>              (clobber (reg:DI 96 lr)) ...
> 
> Is there a reason why REE cannot see that our (reg:QI 4) is a param register
> and thus due to our ABI, already correctly sign/zero extended?
I don't think REE has ever considered exploiting ABI constraints. 
Handling that might be a notable improvement on various targets.  It'd 
be a great place to do some experimentation.

jeff
  
Peter Bergner March 23, 2023, 4:53 p.m. UTC | #8
On 3/23/23 11:32 AM, Jeff Law via Gcc-patches wrote:
> On 3/23/23 10:29, Peter Bergner wrote:
>> I'm sorry that I don't know how REE works.  Why can't it optimize this?
>> I see in the REE dump:
>>
>> (insn 20 18 22 3 (set (reg:DI 4 4)
>>                        (zero_extend:DI (reg:QI 4 4 [orig:120 cD.3556+3 ] [120]))) "pr41742.c":6:41 8 {zero_extendqidi2} (nil))
>> (call_insn 22 20 41 3 (parallel [
>>              (set (reg:DI 3 3)
>>                   (call (mem:SI (symbol_ref:DI ("memset") [flags 0x41]  <function_decl 0x7fff925f8400 __builtin_memset>) [0 memsetD.1196 S4 A8])
>>                      (const_int 64 [0x40])))
>>              (use (const_int 0 [0]))
>>              (clobber (reg:DI 96 lr)) ...
>>
>> Is there a reason why REE cannot see that our (reg:QI 4) is a param register
>> and thus due to our ABI, already correctly sign/zero extended?
>
> I don't think REE has ever considered exploiting ABI constraints. Handling
> that might be a notable improvement on various targets.  It'd be a great
> place to do some experimentation.

Ok, so sounds like a good follow-on project after this patch is reviewed
and committed (stage1).  Thanks for your input!

Peter
  
Jeff Law March 23, 2023, 11:12 p.m. UTC | #9
On 3/23/23 10:53, Peter Bergner wrote:
> On 3/23/23 11:32 AM, Jeff Law via Gcc-patches wrote:
>> On 3/23/23 10:29, Peter Bergner wrote:
>>> I'm sorry that I don't know how REE works.  Why can't it optimize this?
>>> I see in the REE dump:
>>>
>>> (insn 20 18 22 3 (set (reg:DI 4 4)
>>>                         (zero_extend:DI (reg:QI 4 4 [orig:120 cD.3556+3 ] [120]))) "pr41742.c":6:41 8 {zero_extendqidi2} (nil))
>>> (call_insn 22 20 41 3 (parallel [
>>>               (set (reg:DI 3 3)
>>>                    (call (mem:SI (symbol_ref:DI ("memset") [flags 0x41]  <function_decl 0x7fff925f8400 __builtin_memset>) [0 memsetD.1196 S4 A8])
>>>                       (const_int 64 [0x40])))
>>>               (use (const_int 0 [0]))
>>>               (clobber (reg:DI 96 lr)) ...
>>>
>>> Is there a reason why REE cannot see that our (reg:QI 4) is a param register
>>> and thus due to our ABI, already correctly sign/zero extended?
>>
>> I don't think REE has ever considered exploiting ABI constraints. Handling
>> that might be a notable improvement on various targets.  It'd be a great
>> place to do some experimentation.
> 
> Ok, so sounds like a good follow-on project after this patch is reviewed
> and committed (stage1).  Thanks for your input!Agreed.  I suspect that risc-v will benefit from such work as well. 
With that in mind, if y'all start poking at this, please loop in Raphael 
(on cc) who's expressed an interest in this space.

Jeff
  
Peter Bergner March 24, 2023, 9:34 p.m. UTC | #10
On 3/23/23 6:12 PM, Jeff Law via Gcc-patches wrote:
>>>> Is there a reason why REE cannot see that our (reg:QI 4) is a param register
>>>> and thus due to our ABI, already correctly sign/zero extended?
>>>
>>> I don't think REE has ever considered exploiting ABI constraints. Handling
>>> that might be a notable improvement on various targets.  It'd be a great
>>> place to do some experimentation.
>>
>> Ok, so sounds like a good follow-on project after this patch is reviewed
>> and committed (stage1).  Thanks for your input!
>
> Agreed.  I suspect that risc-v will benefit from such work as well. 
> With that in mind, if y'all start poking at this, please loop in Raphael
> (on cc) who's expressed an interest in this space.

Will do.  I suspect that it'll be best to come up with some generic interface
using target hooks like "param regs are sign/zero extended" or "call return
values are sign/zero extended", etc. that targets can conditionally opt into
depending on their ABI that is in effect.

Peter
  
Jeff Law March 25, 2023, 6:09 p.m. UTC | #11
On 3/24/23 15:34, Peter Bergner wrote:
> On 3/23/23 6:12 PM, Jeff Law via Gcc-patches wrote:
>>>>> Is there a reason why REE cannot see that our (reg:QI 4) is a param register
>>>>> and thus due to our ABI, already correctly sign/zero extended?
>>>>
>>>> I don't think REE has ever considered exploiting ABI constraints. Handling
>>>> that might be a notable improvement on various targets.  It'd be a great
>>>> place to do some experimentation.
>>>
>>> Ok, so sounds like a good follow-on project after this patch is reviewed
>>> and committed (stage1).  Thanks for your input!
>>
>> Agreed.  I suspect that risc-v will benefit from such work as well.
>> With that in mind, if y'all start poking at this, please loop in Raphael
>> (on cc) who's expressed an interest in this space.
> 
> Will do.  I suspect that it'll be best to come up with some generic interface
> using target hooks like "param regs are sign/zero extended" or "call return
> values are sign/zero extended", etc. that targets can conditionally opt into
> depending on their ABI that is in effect.
I wonder if we already have some of this in place via the ABI 
interfaces.  Or if the ABI interfaces could be slightly revamped to 
utilize the same information as REE.  That way there's a single source 
of truth for this aspect of the ABI.

But we can cross that bridge when we start poking around.

jeff
  
Hans-Peter Nilsson March 31, 2023, 12:01 a.m. UTC | #12
On Fri, 24 Mar 2023, Peter Bergner via Gcc-patches wrote:

> On 3/23/23 6:12 PM, Jeff Law via Gcc-patches wrote:
> >>>> Is there a reason why REE cannot see that our (reg:QI 4) is a param register
> >>>> and thus due to our ABI, already correctly sign/zero extended?
> >>>
> >>> I don't think REE has ever considered exploiting ABI constraints. Handling
> >>> that might be a notable improvement on various targets.  It'd be a great
> >>> place to do some experimentation.
> >>
> >> Ok, so sounds like a good follow-on project after this patch is reviewed
> >> and committed (stage1).  Thanks for your input!
> >
> > Agreed.  I suspect that risc-v will benefit from such work as well. 
> > With that in mind, if y'all start poking at this, please loop in Raphael
> > (on cc) who's expressed an interest in this space.
> 
> Will do.  I suspect that it'll be best to come up with some generic interface
> using target hooks like "param regs are sign/zero extended" or "call return
> values are sign/zero extended", etc. that targets can conditionally opt into
> depending on their ABI that is in effect.

Pardon the arm-chair development mode but it sounds like 
re-inventing the TARGET_PROMOTE_* hooks...

Maybe just hook up TARGET_PROMOTE_FUNCTION_MODE to ree.c (as 
"you" already already define it for "rs6000")?

brgds, H-P
  
Jeff Law March 31, 2023, 2:01 p.m. UTC | #13
On 3/30/23 18:01, Hans-Peter Nilsson wrote:
> On Fri, 24 Mar 2023, Peter Bergner via Gcc-patches wrote:
> 
>> On 3/23/23 6:12 PM, Jeff Law via Gcc-patches wrote:
>>>>>> Is there a reason why REE cannot see that our (reg:QI 4) is a param register
>>>>>> and thus due to our ABI, already correctly sign/zero extended?
>>>>>
>>>>> I don't think REE has ever considered exploiting ABI constraints. Handling
>>>>> that might be a notable improvement on various targets.  It'd be a great
>>>>> place to do some experimentation.
>>>>
>>>> Ok, so sounds like a good follow-on project after this patch is reviewed
>>>> and committed (stage1).  Thanks for your input!
>>>
>>> Agreed.  I suspect that risc-v will benefit from such work as well.
>>> With that in mind, if y'all start poking at this, please loop in Raphael
>>> (on cc) who's expressed an interest in this space.
>>
>> Will do.  I suspect that it'll be best to come up with some generic interface
>> using target hooks like "param regs are sign/zero extended" or "call return
>> values are sign/zero extended", etc. that targets can conditionally opt into
>> depending on their ABI that is in effect.
> 
> Pardon the arm-chair development mode but it sounds like
> re-inventing the TARGET_PROMOTE_* hooks...
> 
> Maybe just hook up TARGET_PROMOTE_FUNCTION_MODE to ree.c (as
> "you" already already define it for "rs6000")?
> 
That's what we touched on up-thread.  Ideally we want to wire up REE to 
utilize the existing mechanisms to specify ABI constraints so that we 
don't have to write all those macros again.

jeff
  

Patch

diff --git a/gcc/ree.cc b/gcc/ree.cc
index d09f55149b1..63d8cf9f237 100644
--- a/gcc/ree.cc
+++ b/gcc/ree.cc
@@ -364,6 +364,7 @@  combine_set_extension (ext_cand *cand, rtx_insn *curr_insn, rtx *orig_set)
       rtx simplified_temp_extension = simplify_rtx (temp_extension);
       if (simplified_temp_extension)
         temp_extension = simplified_temp_extension;
+
       new_set = gen_rtx_SET (new_reg, temp_extension);
     }
   else if (GET_CODE (orig_src) == IF_THEN_ELSE)
@@ -375,11 +376,21 @@  combine_set_extension (ext_cand *cand, rtx_insn *curr_insn, rtx *orig_set)
   else
     {
       /* This is the normal case.  */
-      rtx temp_extension
-	= gen_rtx_fmt_e (cand->code, cand->mode, orig_src);
+      rtx temp_extension = NULL_RTX;
+
+      if (GET_CODE (SET_SRC (cand_pat)) == AND)
+	temp_extension
+	= gen_rtx_fmt_ee (cand->code, cand->mode,orig_src,
+			  XEXP (SET_SRC (cand_pat), 1));
+      else
+	temp_extension
+	= gen_rtx_fmt_e (cand->code, cand->mode,orig_src);
+
       rtx simplified_temp_extension = simplify_rtx (temp_extension);
+
       if (simplified_temp_extension)
         temp_extension = simplified_temp_extension;
+
       new_set = gen_rtx_SET (new_reg, temp_extension);
     }
 
@@ -1047,7 +1058,14 @@  combine_reaching_defs (ext_cand *cand, const_rtx set_pat, ext_state *state)
 	 cannot be merged, we entirely give up.  In the future, we should allow
 	 extensions to be partially eliminated along those paths where the
 	 definitions could be merged.  */
-      if (apply_change_group ())
+       int num_clobbers = 0;
+       int icode = recog (cand->insn, cand->insn,
+			  (GET_CODE (cand->expr) == SET
+			   && ! reload_completed
+			   && ! reload_in_progress)
+			   ? &num_clobbers : 0);
+
+      if (apply_change_group () || (icode < 0))
         {
           if (dump_file)
             fprintf (dump_file, "All merges were successful.\n");
diff --git a/gcc/testsuite/g++.target/powerpc/sext-elim.C b/gcc/testsuite/g++.target/powerpc/sext-elim.C
new file mode 100644
index 00000000000..1180b9ce268
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/sext-elim.C
@@ -0,0 +1,19 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -free" } */
+
+unsigned long c2l(unsigned char* p)
+{
+  unsigned long res = *p + *(p+1);
+  return res;
+}
+
+long c2sl(signed char* p)
+{
+  long res = *p + *(p+1);
+  return res;
+}
+
+/* { dg-final { scan-assembler-not "rldicl" } } */
+/* { dg-final { scan-assembler-not "extsw" } } */