[v2] aarch64: Preserve mem info on change of base for ldp/stp [PR114674]

Message ID ZhjwwAw4+PTOvPii@arm.com
State New
Headers
Series [v2] aarch64: Preserve mem info on change of base for ldp/stp [PR114674] |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm success Testing passed

Commit Message

Alex Coplan April 12, 2024, 8:28 a.m. UTC
  This is a v2 because I accidentally sent a WIP version of the patch last
time round which used replace_equiv_address instead of
replace_equiv_address_nv; that caused some ICEs (pointed out by the
Linaro CI) since pair addressing modes aren't a subset of the addresses
that are accepted by memory_operand for a given mode.

This patch should otherwise be identical to v1.  Bootstrapped/regtested
on aarch64-linux-gnu (indeed this is the patch I actually tested last
time), is this version also OK for GCC 15?

Thanks,
Alex

--- >8 ---

The ldp/stp fusion pass can change the base of an access so that the two
accesses end up using a common base register.  So far we have been using
adjust_address_nv to do this, but this means that we don't preserve
other properties of the mem we're replacing.  It seems better to use
replace_equiv_address_nv, as this will preserve e.g. the MEM_ALIGN of the
mem whose address we're changing.

The PR shows that by adjusting the other mem we lose alignment
information about the original access and therefore end up rejecting an
otherwise viable pair when --param=aarch64-stp-policy=aligned is passed.
This patch fixes that by using replace_equiv_address_nv instead.

Notably this is the same approach as taken by
aarch64_check_consecutive_mems when a change of base is required, so
this at least makes things more consistent between the ldp fusion pass
and the peepholes.

gcc/ChangeLog:

	PR target/114674
	* config/aarch64/aarch64-ldp-fusion.cc (ldp_bb_info::fuse_pair):
	Use replace_equiv_address_nv on a change of base instead of
	adjust_address_nv on the other access.

gcc/testsuite/ChangeLog:

	PR target/114674
	* gcc.target/aarch64/pr114674.c: New test.
  

Comments

Richard Sandiford April 12, 2024, 11:13 a.m. UTC | #1
Alex Coplan <alex.coplan@arm.com> writes:
> This is a v2 because I accidentally sent a WIP version of the patch last
> time round which used replace_equiv_address instead of
> replace_equiv_address_nv; that caused some ICEs (pointed out by the
> Linaro CI) since pair addressing modes aren't a subset of the addresses
> that are accepted by memory_operand for a given mode.
>
> This patch should otherwise be identical to v1.  Bootstrapped/regtested
> on aarch64-linux-gnu (indeed this is the patch I actually tested last
> time), is this version also OK for GCC 15?

OK, thanks.  Sorry for missing this in the first review.

Richard

> Thanks,
> Alex
>
> --- >8 ---
>
> The ldp/stp fusion pass can change the base of an access so that the two
> accesses end up using a common base register.  So far we have been using
> adjust_address_nv to do this, but this means that we don't preserve
> other properties of the mem we're replacing.  It seems better to use
> replace_equiv_address_nv, as this will preserve e.g. the MEM_ALIGN of the
> mem whose address we're changing.
>
> The PR shows that by adjusting the other mem we lose alignment
> information about the original access and therefore end up rejecting an
> otherwise viable pair when --param=aarch64-stp-policy=aligned is passed.
> This patch fixes that by using replace_equiv_address_nv instead.
>
> Notably this is the same approach as taken by
> aarch64_check_consecutive_mems when a change of base is required, so
> this at least makes things more consistent between the ldp fusion pass
> and the peepholes.
>
> gcc/ChangeLog:
>
> 	PR target/114674
> 	* config/aarch64/aarch64-ldp-fusion.cc (ldp_bb_info::fuse_pair):
> 	Use replace_equiv_address_nv on a change of base instead of
> 	adjust_address_nv on the other access.
>
> gcc/testsuite/ChangeLog:
>
> 	PR target/114674
> 	* gcc.target/aarch64/pr114674.c: New test.
>
> diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc b/gcc/config/aarch64/aarch64-ldp-fusion.cc
> index 365dcf48b22..d07d79df06c 100644
> --- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
> +++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
> @@ -1730,11 +1730,11 @@ ldp_bb_info::fuse_pair (bool load_p,
>  	adjust_amt *= -1;
>  
>        rtx change_reg = XEXP (change_pat, !load_p);
> -      machine_mode mode_for_mem = GET_MODE (change_mem);
>        rtx effective_base = drop_writeback (base_mem);
> -      rtx new_mem = adjust_address_nv (effective_base,
> -				       mode_for_mem,
> -				       adjust_amt);
> +      rtx adjusted_addr = plus_constant (Pmode,
> +					 XEXP (effective_base, 0),
> +					 adjust_amt);
> +      rtx new_mem = replace_equiv_address_nv (change_mem, adjusted_addr);
>        rtx new_set = load_p
>  	? gen_rtx_SET (change_reg, new_mem)
>  	: gen_rtx_SET (new_mem, change_reg);
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr114674.c b/gcc/testsuite/gcc.target/aarch64/pr114674.c
> new file mode 100644
> index 00000000000..944784fd008
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/pr114674.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 --param=aarch64-stp-policy=aligned" } */
> +typedef struct {
> +	unsigned int f1;
> +	unsigned int f2;
> +} test_struct;
> +
> +static test_struct ts = {
> +	123, 456
> +};
> +
> +void foo(void)
> +{
> +	ts.f2 = 36969 * (ts.f2 & 65535) + (ts.f1 >> 16);
> +	ts.f1 = 18000 * (ts.f2 & 65535) + (ts.f2 >> 16);
> +}
> +/* { dg-final { scan-assembler-times "stp" 1 } } */
  
Alex Coplan May 7, 2024, 1:45 p.m. UTC | #2
On 12/04/2024 12:13, Richard Sandiford wrote:
> Alex Coplan <alex.coplan@arm.com> writes:
> > This is a v2 because I accidentally sent a WIP version of the patch last
> > time round which used replace_equiv_address instead of
> > replace_equiv_address_nv; that caused some ICEs (pointed out by the
> > Linaro CI) since pair addressing modes aren't a subset of the addresses
> > that are accepted by memory_operand for a given mode.
> >
> > This patch should otherwise be identical to v1.  Bootstrapped/regtested
> > on aarch64-linux-gnu (indeed this is the patch I actually tested last
> > time), is this version also OK for GCC 15?
> 
> OK, thanks.  Sorry for missing this in the first review.

Now pushed to trunk, thanks.

Alex

> 
> Richard
> 
> > Thanks,
> > Alex
> >
> > --- >8 ---
> >
> > The ldp/stp fusion pass can change the base of an access so that the two
> > accesses end up using a common base register.  So far we have been using
> > adjust_address_nv to do this, but this means that we don't preserve
> > other properties of the mem we're replacing.  It seems better to use
> > replace_equiv_address_nv, as this will preserve e.g. the MEM_ALIGN of the
> > mem whose address we're changing.
> >
> > The PR shows that by adjusting the other mem we lose alignment
> > information about the original access and therefore end up rejecting an
> > otherwise viable pair when --param=aarch64-stp-policy=aligned is passed.
> > This patch fixes that by using replace_equiv_address_nv instead.
> >
> > Notably this is the same approach as taken by
> > aarch64_check_consecutive_mems when a change of base is required, so
> > this at least makes things more consistent between the ldp fusion pass
> > and the peepholes.
> >
> > gcc/ChangeLog:
> >
> > 	PR target/114674
> > 	* config/aarch64/aarch64-ldp-fusion.cc (ldp_bb_info::fuse_pair):
> > 	Use replace_equiv_address_nv on a change of base instead of
> > 	adjust_address_nv on the other access.
> >
> > gcc/testsuite/ChangeLog:
> >
> > 	PR target/114674
> > 	* gcc.target/aarch64/pr114674.c: New test.
> >
> > diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc b/gcc/config/aarch64/aarch64-ldp-fusion.cc
> > index 365dcf48b22..d07d79df06c 100644
> > --- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
> > +++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
> > @@ -1730,11 +1730,11 @@ ldp_bb_info::fuse_pair (bool load_p,
> >  	adjust_amt *= -1;
> >  
> >        rtx change_reg = XEXP (change_pat, !load_p);
> > -      machine_mode mode_for_mem = GET_MODE (change_mem);
> >        rtx effective_base = drop_writeback (base_mem);
> > -      rtx new_mem = adjust_address_nv (effective_base,
> > -				       mode_for_mem,
> > -				       adjust_amt);
> > +      rtx adjusted_addr = plus_constant (Pmode,
> > +					 XEXP (effective_base, 0),
> > +					 adjust_amt);
> > +      rtx new_mem = replace_equiv_address_nv (change_mem, adjusted_addr);
> >        rtx new_set = load_p
> >  	? gen_rtx_SET (change_reg, new_mem)
> >  	: gen_rtx_SET (new_mem, change_reg);
> > diff --git a/gcc/testsuite/gcc.target/aarch64/pr114674.c b/gcc/testsuite/gcc.target/aarch64/pr114674.c
> > new file mode 100644
> > index 00000000000..944784fd008
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/pr114674.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O3 --param=aarch64-stp-policy=aligned" } */
> > +typedef struct {
> > +	unsigned int f1;
> > +	unsigned int f2;
> > +} test_struct;
> > +
> > +static test_struct ts = {
> > +	123, 456
> > +};
> > +
> > +void foo(void)
> > +{
> > +	ts.f2 = 36969 * (ts.f2 & 65535) + (ts.f1 >> 16);
> > +	ts.f1 = 18000 * (ts.f2 & 65535) + (ts.f2 >> 16);
> > +}
> > +/* { dg-final { scan-assembler-times "stp" 1 } } */
  

Patch

diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc b/gcc/config/aarch64/aarch64-ldp-fusion.cc
index 365dcf48b22..d07d79df06c 100644
--- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
+++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
@@ -1730,11 +1730,11 @@  ldp_bb_info::fuse_pair (bool load_p,
 	adjust_amt *= -1;
 
       rtx change_reg = XEXP (change_pat, !load_p);
-      machine_mode mode_for_mem = GET_MODE (change_mem);
       rtx effective_base = drop_writeback (base_mem);
-      rtx new_mem = adjust_address_nv (effective_base,
-				       mode_for_mem,
-				       adjust_amt);
+      rtx adjusted_addr = plus_constant (Pmode,
+					 XEXP (effective_base, 0),
+					 adjust_amt);
+      rtx new_mem = replace_equiv_address_nv (change_mem, adjusted_addr);
       rtx new_set = load_p
 	? gen_rtx_SET (change_reg, new_mem)
 	: gen_rtx_SET (new_mem, change_reg);
diff --git a/gcc/testsuite/gcc.target/aarch64/pr114674.c b/gcc/testsuite/gcc.target/aarch64/pr114674.c
new file mode 100644
index 00000000000..944784fd008
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr114674.c
@@ -0,0 +1,17 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O3 --param=aarch64-stp-policy=aligned" } */
+typedef struct {
+	unsigned int f1;
+	unsigned int f2;
+} test_struct;
+
+static test_struct ts = {
+	123, 456
+};
+
+void foo(void)
+{
+	ts.f2 = 36969 * (ts.f2 & 65535) + (ts.f1 >> 16);
+	ts.f1 = 18000 * (ts.f2 & 65535) + (ts.f2 >> 16);
+}
+/* { dg-final { scan-assembler-times "stp" 1 } } */