rtlanal: Fix set_noop_p for volatile loads or stores [PR114768]

Message ID ZiIOA962zzOBoe87@tucnak
State New
Headers
Series rtlanal: Fix set_noop_p for volatile loads or stores [PR114768] |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm fail Patch failed to apply
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 fail Patch failed to apply

Commit Message

Jakub Jelinek April 19, 2024, 6:24 a.m. UTC
  Hi!

On the following testcase, combine propagates the mem/v load into mem store
with the same address and then removes it, because noop_move_p says it is a
no-op move.  If it was the other way around, i.e. mem/v store and mem load,
or both would be mem/v, it would be kept.
The problem is that rtx_equal_p never checks any kind of flags on the rtxes
(and I think it would be quite dangerous to change it at this point), and
set_noop_p checks side_effects_p on just one of the operands, not both.
In the MEM <- MEM set, it only checks it on the destination, in
store to ZERO_EXTRACT only checks it on the source.

The following patch adds the missing side_effects_p checks.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-04-19  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/114768
	* rtlanal.cc (set_noop_p): Don't return true for MEM <- MEM
	sets if src has side-effects or for stores into ZERO_EXTRACT
	if ZERO_EXTRACT operand has side-effects.

	* gcc.dg/pr114768.c: New test.


	Jakub
  

Comments

Richard Biener April 19, 2024, 6:33 a.m. UTC | #1
On Fri, 19 Apr 2024, Jakub Jelinek wrote:

> Hi!
> 
> On the following testcase, combine propagates the mem/v load into mem store
> with the same address and then removes it, because noop_move_p says it is a
> no-op move.  If it was the other way around, i.e. mem/v store and mem load,
> or both would be mem/v, it would be kept.
> The problem is that rtx_equal_p never checks any kind of flags on the rtxes
> (and I think it would be quite dangerous to change it at this point), and
> set_noop_p checks side_effects_p on just one of the operands, not both.
> In the MEM <- MEM set, it only checks it on the destination, in
> store to ZERO_EXTRACT only checks it on the source.
> 
> The following patch adds the missing side_effects_p checks.
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2024-04-19  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR rtl-optimization/114768
> 	* rtlanal.cc (set_noop_p): Don't return true for MEM <- MEM
> 	sets if src has side-effects or for stores into ZERO_EXTRACT
> 	if ZERO_EXTRACT operand has side-effects.
> 
> 	* gcc.dg/pr114768.c: New test.
> 
> --- gcc/rtlanal.cc.jj	2024-02-24 12:45:28.674249100 +0100
> +++ gcc/rtlanal.cc	2024-04-18 15:09:55.199499083 +0200
> @@ -1637,12 +1637,15 @@ set_noop_p (const_rtx set)
>      return true;
>  
>    if (MEM_P (dst) && MEM_P (src))
> -    return rtx_equal_p (dst, src) && !side_effects_p (dst);
> +    return (rtx_equal_p (dst, src)
> +	    && !side_effects_p (dst)
> +	    && !side_effects_p (src));
>  
>    if (GET_CODE (dst) == ZERO_EXTRACT)
> -    return rtx_equal_p (XEXP (dst, 0), src)
> -	   && !BITS_BIG_ENDIAN && XEXP (dst, 2) == const0_rtx
> -	   && !side_effects_p (src);
> +    return (rtx_equal_p (XEXP (dst, 0), src)
> +	    && !BITS_BIG_ENDIAN && XEXP (dst, 2) == const0_rtx
> +	    && !side_effects_p (src)
> +	    && !side_effects_p (XEXP (dst, 0)));
>  
>    if (GET_CODE (dst) == STRICT_LOW_PART)
>      dst = XEXP (dst, 0);
> --- gcc/testsuite/gcc.dg/pr114768.c.jj	2024-04-18 15:37:49.139433678 +0200
> +++ gcc/testsuite/gcc.dg/pr114768.c	2024-04-18 15:43:30.389730365 +0200
> @@ -0,0 +1,10 @@
> +/* PR rtl-optimization/114768 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-rtl-final" } */
> +/* { dg-final { scan-rtl-dump "\\\(mem/v:" "final" { target { ! { nvptx*-*-* } } } } } */
> +
> +void
> +foo (int *p)
> +{
> +  *p = *(volatile int *) p;
> +}
> 
> 	Jakub
> 
>
  
Thomas Schwinge April 19, 2024, 10:23 a.m. UTC | #2
Hi Jakub!

On 2024-04-19T08:24:03+0200, Jakub Jelinek <jakub@redhat.com> wrote:
> --- gcc/testsuite/gcc.dg/pr114768.c.jj	2024-04-18 15:37:49.139433678 +0200
> +++ gcc/testsuite/gcc.dg/pr114768.c	2024-04-18 15:43:30.389730365 +0200
> @@ -0,0 +1,10 @@
> +/* PR rtl-optimization/114768 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-rtl-final" } */
> +/* { dg-final { scan-rtl-dump "\\\(mem/v:" "final" { target { ! { nvptx*-*-* } } } } } */
> +
> +void
> +foo (int *p)
> +{
> +  *p = *(volatile int *) p;
> +}

Why exclude nvptx target here?  As far as I can see, it does behave in
the exactly same way as expected; see 'diff' of before vs. after the
'gcc/rtlanal.cc' code changes:

    PASS: gcc.dg/pr114768.c (test for excess errors)
    [-FAIL:-]{+PASS:+} gcc.dg/pr114768.c scan-rtl-dump final "\\(mem/v:"

    --- 0/pr114768.c.347r.final	2024-04-19 11:34:34.577037596 +0200
    +++ ./pr114768.c.347r.final	2024-04-19 12:08:00.118312524 +0200
    @@ -13,15 +13,27 @@
     ;;  entry block defs 	 1 [%stack] 2 [%frame] 3 [%args]
     ;;  exit block uses 	 1 [%stack] 2 [%frame]
     ;;  regs ever live 	
    -;;  ref usage 	r1={1d,2u} r2={1d,2u} r3={1d,1u} 
    -;;    total ref usage 8{3d,5u,0e} in 1{1 regular + 0 call} insns.
    +;;  ref usage 	r1={1d,3u} r2={1d,3u} r3={1d,2u} r22={1d,1u} r23={1d,2u} 
    +;;    total ref usage 16{5d,11u,0e} in 4{4 regular + 0 call} insns.
     (note 1 0 4 NOTE_INSN_DELETED)
     (note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
    -(note 2 4 3 2 NOTE_INSN_DELETED)
    +(insn 2 4 3 2 (set (reg/v/f:DI 23 [ p ])
    +        (unspec:DI [
    +                (const_int 0 [0])
    +            ] UNSPEC_ARG_REG)) "source-gcc/gcc/testsuite/gcc.dg/pr114768.c":8:1 14 {load_arg_regdi}
    +     (nil))
     (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
    -(note 6 3 10 2 NOTE_INSN_DELETED)
    -(note 10 6 11 2 NOTE_INSN_EPILOGUE_BEG)
    -(jump_insn 11 10 12 2 (return) "source-gcc/gcc/testsuite/gcc.dg/pr114768.c":10:1 289 {return}
    +(insn 6 3 7 2 (set (reg:SI 22 [ _1 ])
    +        (mem/v:SI (reg/v/f:DI 23 [ p ]) [1 MEM[(volatile int *)p_3(D)]+0 S4 A32])) "source-gcc/gcc/testsuite/gcc.dg/pr114768.c":9:8 6 {*movsi_insn}
    +     (nil))
    +(insn 7 6 10 2 (set (mem:SI (reg/v/f:DI 23 [ p ]) [1 *p_3(D)+0 S4 A32])
    +        (reg:SI 22 [ _1 ])) "source-gcc/gcc/testsuite/gcc.dg/pr114768.c":9:6 6 {*movsi_insn}
    +     (expr_list:REG_DEAD (reg/v/f:DI 23 [ p ])
    +        (expr_list:REG_DEAD (reg:SI 22 [ _1 ])
    +            (nil))))
    +(note 10 7 13 2 NOTE_INSN_EPILOGUE_BEG)
    +(note 13 10 11 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
    +(jump_insn 11 13 12 3 (return) "source-gcc/gcc/testsuite/gcc.dg/pr114768.c":10:1 289 {return}
          (nil)
      -> return)
     (barrier 12 11 0)

    --- 0/pr114768.s	2024-04-19 11:34:34.577037596 +0200
    +++ ./pr114768.s	2024-04-19 12:08:00.118312524 +0200
    @@ -13,5 +13,10 @@
     {
     	.reg.u64 %ar0;
     	ld.param.u64 %ar0, [%in_ar0];
    +	.reg.u32 %r22;
    +	.reg.u64 %r23;
    +		mov.u64	%r23, %ar0;
    +		ld.u32	%r22, [%r23];
    +		st.u32	[%r23], %r22;
     	ret;
     }


Grüße
 Thomas
  
Jakub Jelinek April 19, 2024, 10:30 a.m. UTC | #3
On Fri, Apr 19, 2024 at 12:23:03PM +0200, Thomas Schwinge wrote:
> On 2024-04-19T08:24:03+0200, Jakub Jelinek <jakub@redhat.com> wrote:
> > --- gcc/testsuite/gcc.dg/pr114768.c.jj	2024-04-18 15:37:49.139433678 +0200
> > +++ gcc/testsuite/gcc.dg/pr114768.c	2024-04-18 15:43:30.389730365 +0200
> > @@ -0,0 +1,10 @@
> > +/* PR rtl-optimization/114768 */
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -fdump-rtl-final" } */
> > +/* { dg-final { scan-rtl-dump "\\\(mem/v:" "final" { target { ! { nvptx*-*-* } } } } } */
> > +
> > +void
> > +foo (int *p)
> > +{
> > +  *p = *(volatile int *) p;
> > +}
> 
> Why exclude nvptx target here?  As far as I can see, it does behave in
> the exactly same way as expected; see 'diff' of before vs. after the
> 'gcc/rtlanal.cc' code changes:

I wasn't sure if the non-RA targets (for which we don't have an effective
target) even have final dump.
If they do as you show, then guess the target guard can go.

	Jakub
  

Patch

--- gcc/rtlanal.cc.jj	2024-02-24 12:45:28.674249100 +0100
+++ gcc/rtlanal.cc	2024-04-18 15:09:55.199499083 +0200
@@ -1637,12 +1637,15 @@  set_noop_p (const_rtx set)
     return true;
 
   if (MEM_P (dst) && MEM_P (src))
-    return rtx_equal_p (dst, src) && !side_effects_p (dst);
+    return (rtx_equal_p (dst, src)
+	    && !side_effects_p (dst)
+	    && !side_effects_p (src));
 
   if (GET_CODE (dst) == ZERO_EXTRACT)
-    return rtx_equal_p (XEXP (dst, 0), src)
-	   && !BITS_BIG_ENDIAN && XEXP (dst, 2) == const0_rtx
-	   && !side_effects_p (src);
+    return (rtx_equal_p (XEXP (dst, 0), src)
+	    && !BITS_BIG_ENDIAN && XEXP (dst, 2) == const0_rtx
+	    && !side_effects_p (src)
+	    && !side_effects_p (XEXP (dst, 0)));
 
   if (GET_CODE (dst) == STRICT_LOW_PART)
     dst = XEXP (dst, 0);
--- gcc/testsuite/gcc.dg/pr114768.c.jj	2024-04-18 15:37:49.139433678 +0200
+++ gcc/testsuite/gcc.dg/pr114768.c	2024-04-18 15:43:30.389730365 +0200
@@ -0,0 +1,10 @@ 
+/* PR rtl-optimization/114768 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-final" } */
+/* { dg-final { scan-rtl-dump "\\\(mem/v:" "final" { target { ! { nvptx*-*-* } } } } } */
+
+void
+foo (int *p)
+{
+  *p = *(volatile int *) p;
+}