RISC-V: Avoid more unsplit insns in const expander [PR118832].

Message ID D7QIV0X5LU5Y.2BPL47LE339IS@gmail.com
State Committed
Commit 28b2ad5341f875ee7e034b0c6f9e4eb725e19a8f
Delegated to: Juzhe Zhong
Headers
Series RISC-V: Avoid more unsplit insns in const expander [PR118832]. |

Checks

Context Check Description
rivoscibot/toolchain-ci-rivos-apply-patch success Patch applied
rivoscibot/toolchain-ci-rivos-lint success Lint passed
rivoscibot/toolchain-ci-rivos-build--newlib-rv64gcv-lp64d-multilib success Build passed
rivoscibot/toolchain-ci-rivos-build--linux-rv64gcv-lp64d-multilib success Build passed
rivoscibot/toolchain-ci-rivos-build--linux-rv64gc_zba_zbb_zbc_zbs-lp64d-multilib success Build passed
rivoscibot/toolchain-ci-rivos-test success Testing passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 success Test passed
linaro-tcwg-bot/tcwg_simplebootstrap_build--master-arm-bootstrap success Build passed
linaro-tcwg-bot/tcwg_gcc_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_simplebootstrap_build--master-aarch64-bootstrap success Build passed

Commit Message

Robin Dapp Feb. 12, 2025, 2:03 p.m. UTC
  Hi,

in PR118832 we have another instance of the problem already noticed in
PR117878.  We sometimes use e.g. expand_simple_binop for vector
operations like shift or and.  While this is usually OK, it causes
problems when doing it late, e.g. during LRA.

In particular, we might rematerialize a const_vector during LRA, which
then leaves an insn laying around that cannot be split any more if it
requires a pseudo.  Therefore we should only use the split variants
in expand_const_vector.

This patch fixed the issue in the PR and also pre-emptively rewrites two
other spots that might be prone to the same issue.

Regtested on rv64gcv_zvl512b.  As the two other cases don't have a test
(so might not even trigger) I unconditionally enabled them for my testsuite
run.

Regards
 Robin

	PR target/118832

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (expand_const_vector):  Expand as
	vlmax insn during lra.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr118832.c: New test.
---
 gcc/config/riscv/riscv-v.cc                   | 46 +++++++++++++++----
 .../gcc.target/riscv/rvv/autovec/pr118832.c   | 13 ++++++
 2 files changed, 51 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118832.c
  

Comments

钟居哲 Feb. 12, 2025, 11:39 p.m. UTC | #1
LGTM



juzhe.zhong@rivai.ai
 
From: Robin Dapp
Date: 2025-02-12 22:03
To: gcc-patches
CC: palmer@dabbelt.com; kito.cheng@gmail.com; juzhe.zhong@rivai.ai; jeffreyalaw@gmail.com; pan2.li@intel.com; rdapp.gcc@gmail.com
Subject: [PATCH] RISC-V: Avoid more unsplit insns in const expander [PR118832].
Hi,
 
in PR118832 we have another instance of the problem already noticed in
PR117878.  We sometimes use e.g. expand_simple_binop for vector
operations like shift or and.  While this is usually OK, it causes
problems when doing it late, e.g. during LRA.
 
In particular, we might rematerialize a const_vector during LRA, which
then leaves an insn laying around that cannot be split any more if it
requires a pseudo.  Therefore we should only use the split variants
in expand_const_vector.
 
This patch fixed the issue in the PR and also pre-emptively rewrites two
other spots that might be prone to the same issue.
 
Regtested on rv64gcv_zvl512b.  As the two other cases don't have a test
(so might not even trigger) I unconditionally enabled them for my testsuite
run.
 
Regards
Robin
 
PR target/118832
 
gcc/ChangeLog:
 
* config/riscv/riscv-v.cc (expand_const_vector):  Expand as
vlmax insn during lra.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/autovec/pr118832.c: New test.
---
gcc/config/riscv/riscv-v.cc                   | 46 +++++++++++++++----
.../gcc.target/riscv/rvv/autovec/pr118832.c   | 13 ++++++
2 files changed, 51 insertions(+), 8 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118832.c
 
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 9847439ca77..3e86b12bb40 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1265,7 +1265,16 @@ expand_const_vector (rtx target, rtx src)
       element. Use element width = 64 and broadcast a vector with
       all element equal to 0x0706050403020100.  */
  rtx ele = builder.get_merged_repeating_sequence ();
-   rtx dup = expand_vector_broadcast (builder.new_mode (), ele);
+   rtx dup;
+   if (lra_in_progress)
+     {
+       dup = gen_reg_rtx (builder.new_mode ());
+       rtx ops[] = {dup, ele};
+       emit_vlmax_insn (code_for_pred_broadcast
+        (builder.new_mode ()), UNARY_OP, ops);
+     }
+   else
+     dup = expand_vector_broadcast (builder.new_mode (), ele);
  emit_move_insn (result, gen_lowpart (mode, dup));
}
       else
@@ -1523,10 +1532,20 @@ expand_const_vector (rtx target, rtx src)
  base2 = gen_int_mode (rtx_to_poly_int64 (base2), new_smode);
  expand_vec_series (tmp2, base2,
     gen_int_mode (step2, new_smode));
-   rtx shifted_tmp2 = expand_simple_binop (
-     new_mode, ASHIFT, tmp2,
-     gen_int_mode (builder.inner_bits_size (), Pmode), NULL_RTX,
-     false, OPTAB_DIRECT);
+   rtx shifted_tmp2;
+   rtx shift = gen_int_mode (builder.inner_bits_size (), Xmode);
+   if (lra_in_progress)
+     {
+       shifted_tmp2 = gen_reg_rtx (new_mode);
+       rtx shift_ops[] = {shifted_tmp2, tmp2, shift};
+       emit_vlmax_insn (code_for_pred_scalar
+        (ASHIFT, new_mode), BINARY_OP,
+        shift_ops);
+     }
+   else
+     shifted_tmp2 = expand_simple_binop (new_mode, ASHIFT, tmp2,
+ shift, NULL_RTX, false,
+ OPTAB_DIRECT);
  rtx tmp3 = gen_reg_rtx (new_mode);
  rtx ior_ops[] = {tmp3, tmp1, shifted_tmp2};
  emit_vlmax_insn (code_for_pred (IOR, new_mode), BINARY_OP,
@@ -1539,9 +1558,20 @@ expand_const_vector (rtx target, rtx src)
      rtx vid = gen_reg_rtx (mode);
      expand_vec_series (vid, const0_rtx, const1_rtx);
      /* Transform into { 0, 0, 1, 1, 2, 2, ... }.  */
-       rtx shifted_vid
- = expand_simple_binop (mode, LSHIFTRT, vid, const1_rtx,
-        NULL_RTX, false, OPTAB_DIRECT);
+       rtx shifted_vid;
+       if (lra_in_progress)
+ {
+   shifted_vid = gen_reg_rtx (mode);
+   rtx shift = gen_int_mode (1, Xmode);
+   rtx shift_ops[] = {shifted_vid, vid, shift};
+   emit_vlmax_insn (code_for_pred_scalar
+    (ASHIFT, mode), BINARY_OP,
+    shift_ops);
+ }
+       else
+ shifted_vid = expand_simple_binop (mode, LSHIFTRT, vid,
+    const1_rtx, NULL_RTX,
+    false, OPTAB_DIRECT);
      rtx tmp1 = gen_reg_rtx (mode);
      rtx tmp2 = gen_reg_rtx (mode);
      expand_vec_series (tmp1, base1,
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118832.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118832.c
new file mode 100644
index 00000000000..db0b12bee5a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118832.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zbb -mabi=lp64d -O3" } */
+
+int *a;
+void b(int *);
+void c(int *g, short h) {
+    int d[8], e[8];
+    for (int f = 0; f < h; f++)
+      d[f] = g[f] << 24 | (g[f] & 4278190080) >> 24;
+    b(d);
+    for (int f = 0; f < h; f++)
+      a[f] = e[f] << 24 | (e[f] & 4278190080) >> 24;
+}
-- 
2.48.1
  
Jeffrey Law Feb. 13, 2025, 11:34 p.m. UTC | #2
On 2/12/25 7:03 AM, Robin Dapp wrote:
> Hi,
> 
> in PR118832 we have another instance of the problem already noticed in
> PR117878.  We sometimes use e.g. expand_simple_binop for vector
> operations like shift or and.  While this is usually OK, it causes
> problems when doing it late, e.g. during LRA.
> 
> In particular, we might rematerialize a const_vector during LRA, which
> then leaves an insn laying around that cannot be split any more if it
> requires a pseudo.  Therefore we should only use the split variants
> in expand_const_vector.
> 
> This patch fixed the issue in the PR and also pre-emptively rewrites two
> other spots that might be prone to the same issue.
> 
> Regtested on rv64gcv_zvl512b.  As the two other cases don't have a test
> (so might not even trigger) I unconditionally enabled them for my testsuite
> run.
> 
> Regards
>   Robin
> 
> 	PR target/118832
> 
> gcc/ChangeLog:
> 
> 	* config/riscv/riscv-v.cc (expand_const_vector):  Expand as
> 	vlmax insn during lra.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/rvv/autovec/pr118832.c: New test.
Pushed to the trunk and I'll update the BZ entry momentarily.

jeff
  

Patch

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 9847439ca77..3e86b12bb40 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1265,7 +1265,16 @@  expand_const_vector (rtx target, rtx src)
 	       element. Use element width = 64 and broadcast a vector with
 	       all element equal to 0x0706050403020100.  */
 	  rtx ele = builder.get_merged_repeating_sequence ();
-	  rtx dup = expand_vector_broadcast (builder.new_mode (), ele);
+	  rtx dup;
+	  if (lra_in_progress)
+	    {
+	      dup = gen_reg_rtx (builder.new_mode ());
+	      rtx ops[] = {dup, ele};
+	      emit_vlmax_insn (code_for_pred_broadcast
+			       (builder.new_mode ()), UNARY_OP, ops);
+	    }
+	  else
+	    dup = expand_vector_broadcast (builder.new_mode (), ele);
 	  emit_move_insn (result, gen_lowpart (mode, dup));
 	}
       else
@@ -1523,10 +1532,20 @@  expand_const_vector (rtx target, rtx src)
 		  base2 = gen_int_mode (rtx_to_poly_int64 (base2), new_smode);
 		  expand_vec_series (tmp2, base2,
 				     gen_int_mode (step2, new_smode));
-		  rtx shifted_tmp2 = expand_simple_binop (
-		    new_mode, ASHIFT, tmp2,
-		    gen_int_mode (builder.inner_bits_size (), Pmode), NULL_RTX,
-		    false, OPTAB_DIRECT);
+		  rtx shifted_tmp2;
+		  rtx shift = gen_int_mode (builder.inner_bits_size (), Xmode);
+		  if (lra_in_progress)
+		    {
+		      shifted_tmp2 = gen_reg_rtx (new_mode);
+		      rtx shift_ops[] = {shifted_tmp2, tmp2, shift};
+		      emit_vlmax_insn (code_for_pred_scalar
+				       (ASHIFT, new_mode), BINARY_OP,
+				       shift_ops);
+		    }
+		  else
+		    shifted_tmp2 = expand_simple_binop (new_mode, ASHIFT, tmp2,
+							shift, NULL_RTX, false,
+							OPTAB_DIRECT);
 		  rtx tmp3 = gen_reg_rtx (new_mode);
 		  rtx ior_ops[] = {tmp3, tmp1, shifted_tmp2};
 		  emit_vlmax_insn (code_for_pred (IOR, new_mode), BINARY_OP,
@@ -1539,9 +1558,20 @@  expand_const_vector (rtx target, rtx src)
 	      rtx vid = gen_reg_rtx (mode);
 	      expand_vec_series (vid, const0_rtx, const1_rtx);
 	      /* Transform into { 0, 0, 1, 1, 2, 2, ... }.  */
-	      rtx shifted_vid
-		= expand_simple_binop (mode, LSHIFTRT, vid, const1_rtx,
-				       NULL_RTX, false, OPTAB_DIRECT);
+	      rtx shifted_vid;
+	      if (lra_in_progress)
+		{
+		  shifted_vid = gen_reg_rtx (mode);
+		  rtx shift = gen_int_mode (1, Xmode);
+		  rtx shift_ops[] = {shifted_vid, vid, shift};
+		  emit_vlmax_insn (code_for_pred_scalar
+				   (ASHIFT, mode), BINARY_OP,
+				   shift_ops);
+		}
+	      else
+		shifted_vid = expand_simple_binop (mode, LSHIFTRT, vid,
+						   const1_rtx, NULL_RTX,
+						   false, OPTAB_DIRECT);
 	      rtx tmp1 = gen_reg_rtx (mode);
 	      rtx tmp2 = gen_reg_rtx (mode);
 	      expand_vec_series (tmp1, base1,
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118832.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118832.c
new file mode 100644
index 00000000000..db0b12bee5a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118832.c
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zbb -mabi=lp64d -O3" } */
+
+int *a;
+void b(int *);
+void c(int *g, short h) {
+    int d[8], e[8];
+    for (int f = 0; f < h; f++)
+      d[f] = g[f] << 24 | (g[f] & 4278190080) >> 24;
+    b(d);
+    for (int f = 0; f < h; f++)
+      a[f] = e[f] << 24 | (e[f] & 4278190080) >> 24;
+}