[2/3] RISC-V: xthead(f)memidx: Eliminate optimization patterns

Message ID: 20240807062714.2989672-2-christoph.muellner@vrull.eu
State: Committed
Commit: 31c3c5d1cad7f08c2eb7d264f4a208c2c91d20d1
Delegated to: Jeff Law
Series: [1/3] RISC-V: testsuite: xtheadfmemidx: Rename test and add similar Zfa test

Checks

Context | Check | Description
rivoscibot/toolchain-ci-rivos-lint | success | Lint passed
rivoscibot/toolchain-ci-rivos-apply-patch | success | Patch applied
linaro-tcwg-bot/tcwg_gcc_build--master-arm | success | Build passed
rivoscibot/toolchain-ci-rivos-build--newlib-rv64gc-lp64d-non-multilib | success | Build passed
rivoscibot/toolchain-ci-rivos-build--linux-rv64gc_zba_zbb_zbc_zbs-lp64d-multilib | success | Build passed
rivoscibot/toolchain-ci-rivos-build--linux-rv64gcv-lp64d-multilib | success | Build passed
rivoscibot/toolchain-ci-rivos-build--newlib-rv64gcv-lp64d-multilib | success | Build passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 | success | Build passed
rivoscibot/toolchain-ci-rivos-build--linux-rv64gc-lp64d-non-multilib | success | Build passed
rivoscibot/toolchain-ci-rivos-test | fail | Testing failed

Commit Message

Christoph Müllner Aug. 7, 2024, 6:27 a.m. UTC
We have a large number of optimization patterns (insn_and_split) for
XTheadMemIdx and XTheadFMemIdx that attempt to do something that generic
GCC passes can do more efficiently, provided we have proper support code.

A key function in eliminating the optimization patterns is
th_memidx_classify_address_index(), which needs to identify each possible
memory expression that can be lowered into an XTheadMemIdx/XTheadFMemIdx
instruction.  This patch adds all memory expressions that were
previously only recognized by the optimization patterns.
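
To illustrate the kind of classification meant here, the following is a
minimal, self-contained sketch on a toy expression tree.  It is not GCC's
RTL API: struct expr, classify_index() and the INDEX_* names are
hypothetical, and the model only covers the RV64 case (64-bit registers,
32-bit registers behind an extension).  The choice of index type for
sign- and zero-extension follows the patch (sign-extend maps to the plain
register form, zero-extend to the unsigned one).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression codes; these are NOT GCC's rtx codes.  */
enum code { REG32, REG64, SIGN_EXT, ZERO_EXT, MULT, ASHIFT };

/* Toy node: extensions use op0; MULT/ASHIFT use op0 and the constant imm.  */
struct expr
{
  enum code code;
  const struct expr *op0;
  long imm;
};

/* Hypothetical stand-ins for the two index kinds used by the patch.  */
enum index_type { INDEX_REG, INDEX_UREG };

struct index_info
{
  enum index_type type;
  int shift;
};

/* Return log2 (scale) for a scale of 2, 4 or 8, or -1 otherwise.  */
static int
scale_to_shift (long scale)
{
  switch (scale)
    {
    case 2: return 1;
    case 4: return 2;
    case 8: return 3;
    default: return -1;
    }
}

/* Accept a 64-bit register or a sign/zero-extended 32-bit register.  */
static bool
classify_inner (const struct expr *x, enum index_type *type)
{
  if (x == NULL)
    return false;
  if (x->code == REG64)
    {
      *type = INDEX_REG;
      return true;
    }
  if ((x->code == SIGN_EXT || x->code == ZERO_EXT)
      && x->op0 != NULL && x->op0->code == REG32)
    {
      *type = (x->code == SIGN_EXT) ? INDEX_REG : INDEX_UREG;
      return true;
    }
  return false;
}

/* Classify an index expression: optional MULT/ASHIFT around the inner part.  */
bool
classify_index (const struct expr *x, struct index_info *info)
{
  assert (x != NULL && info != NULL);

  int shift = 0;
  const struct expr *inner = x;

  if (x->code == MULT)
    {
      shift = scale_to_shift (x->imm);
      if (shift < 0)
	return false;
      inner = x->op0;
    }
  else if (x->code == ASHIFT)
    {
      if (x->imm < 0 || x->imm > 3)
	return false;
      shift = (int) x->imm;
      inner = x->op0;
    }

  if (!classify_inner (inner, &info->type))
    return false;
  info->shift = shift;
  return true;
}
```

In this sketch a zero-extended 32-bit register multiplied by 4 is
classified as an unsigned index with shift 2, while a multiplication by 3
is rejected because it is not a supported scale.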

Now that the address classification is complete, we can finally remove
all optimization patterns, with the side effect of getting rid of the
non-canonical memory expression they produced: (plus (reg) (ashift (reg) (imm))).
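
The arithmetic behind the mult and ashift forms can be sketched in plain
C.  This is only an illustration under stated assumptions: the function
names are hypothetical, and the mask test is modelled on the condition
used in this patch (mask >> shift == 0xffffffff), here written with
unsigned types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return the shift amount for a scale of 2, 4 or 8, or -1 otherwise.
   A multiplication by such a scale equals a left shift by 1..3.  */
int
scale_to_shift (uint64_t scale)
{
  switch (scale)
    {
    case 2: return 1;
    case 4: return 2;
    case 8: return 3;
    default: return -1;
    }
}

/* Return true if (reg * scale) & mask keeps exactly the low 32 bits of
   reg, shifted left by log2 (scale), i.e. a scaled zero-extension.  */
bool
mask_is_scaled_zext32 (uint64_t scale, uint64_t mask)
{
  int shift = scale_to_shift (scale);
  return shift > 0 && (mask >> shift) == UINT64_C (0xffffffff);
}

/* The value an index operand contributes: a zero-extended 32-bit
   register shifted left by SHIFT (0..3).  */
uint64_t
scaled_zext_index (uint64_t reg, int shift)
{
  assert (shift >= 0 && shift <= 3);
  return (uint64_t) (uint32_t) reg << shift;
}
```

For a scale of 4 and a mask of 0x3fffffffc, the masked multiplication and
the shifted zero-extension give the same value for any 64-bit input,
which is why both shapes can describe the same indexed access.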

A positive side effect of this change is that we address an RV32 ICE
that was caused by the th_memidx_I_c pattern, which did not properly
handle SUBREGs (more details are in PR116131).
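
For orientation, here are a few C source shapes that lead to the address
forms discussed above.  This is a hedged sketch: the function names are
made up, and whether GCC actually selects an indexed instruction for any
of them depends on the target options and optimization level.  The last
function mirrors the shape of the new test case; the assumption is that
on RV32 only the low part of the 64-bit index is used for the address,
which is where a SUBREG can show up.

```c
#include <stdint.h>

/* Index is a full-width register: base + (idx << 2).  */
int32_t
load_reg_index (const int32_t *base, long idx)
{
  return base[idx];
}

/* Index is an unsigned 32-bit value: zero-extended to 64 bits on RV64.  */
int32_t
load_zext_index (const int32_t *base, uint32_t idx)
{
  return base[idx];
}

/* Index is a signed 32-bit value: sign-extended to 64 bits on RV64.  */
int32_t
load_sext_index (const int32_t *base, int32_t idx)
{
  return base[idx];
}

/* Index is wider than a pointer on RV32 (assumption: only its low part
   ends up in the address computation there).  */
void
store_wide_index (int32_t *base, long long idx, int32_t value)
{
  base[idx] = value;
}
```

The functions behave like ordinary array accesses on any target; only
the instruction selection differs.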

A temporary negative side effect of this change is that we cause a
regression of the xtheadfmemidx + xtheadfmv/zfa tests (initially
introduced as part of b79cd204c780 to address an ICE).
As this issue cannot be addressed in the code parts that are
adjusted in this patch, we just accept the regression for now.

	PR target/116131

gcc/ChangeLog:

	* config/riscv/thead.cc (th_memidx_classify_address_index):
	Recognize all possible XTheadMemIdx memory operand structures.
	(th_fmemidx_output_index): Do strict classification.
	* config/riscv/thead.md (*th_memidx_operand): Remove.
	(TARGET_XTHEADMEMIDX): Likewise.
	(TARGET_HARD_FLOAT && TARGET_XTHEADFMEMIDX): Likewise.
	(!TARGET_64BIT && TARGET_XTHEADMEMIDX): Likewise.
	(*th_memidx_I_a): Likewise.
	(*th_memidx_I_b): Likewise.
	(*th_memidx_I_c): Likewise.
	(*th_memidx_US_a): Likewise.
	(*th_memidx_US_b): Likewise.
	(*th_memidx_US_c): Likewise.
	(*th_memidx_UZ_a): Likewise.
	(*th_memidx_UZ_b): Likewise.
	(*th_memidx_UZ_c): Likewise.
	(*th_fmemidx_movsf_hardfloat): Likewise.
	(*th_fmemidx_movdf_hardfloat_rv64): Likewise.
	(*th_fmemidx_I_a): Likewise.
	(*th_fmemidx_I_c): Likewise.
	(*th_fmemidx_US_a): Likewise.
	(*th_fmemidx_US_c): Likewise.
	(*th_fmemidx_UZ_a): Likewise.
	(*th_fmemidx_UZ_c): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/pr116131.c: New test.

Reported-by: Patrick O'Neill <patrick@rivosinc.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
 gcc/config/riscv/thead.cc                 |  88 ++++-
 gcc/config/riscv/thead.md                 | 417 ----------------------
 gcc/testsuite/gcc.target/riscv/pr116131.c |  15 +
 3 files changed, 90 insertions(+), 430 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr116131.c
  

Comments

Jeff Law Aug. 7, 2024, 3:09 p.m. UTC | #1
Nice cleanup.  I did wander the old PA code a bit; most of what I found was
actually addressing limitations due to its weird implicit segment
selection based on the base register rather than the full effective
address.  Bad memories.

But that was enough to get a few synapses to fire. When I last looked at 
the canonicalization issues (circa 2015 IIRC) in this space I adjusted 
combine to handle things better.  So probably not nearly as much to do 
in the target files anymore.

I'll note the thead code is riddled with explicit mode testing, which
looks less than ideal, but that's a pre-existing issue and it may in
fact be reasonable; I don't know the thead extensions well enough to be
100% sure either way.  So I won't object, but there's a bit of "ewww"
when I look at the thead address classification code.


It looks like the pre-commit tester didn't test patches #2 or #3.  So 
while I'll ack, please be on the lookout for any failures.

Jeff
  

Patch

diff --git a/gcc/config/riscv/thead.cc b/gcc/config/riscv/thead.cc
index 13ca8f674b0c..2f1d83fbbc7f 100644
--- a/gcc/config/riscv/thead.cc
+++ b/gcc/config/riscv/thead.cc
@@ -609,31 +609,70 @@  th_memidx_classify_address_index (struct riscv_address_info *info, rtx x,
   if (GET_CODE (x) != PLUS)
     return false;
 
-  rtx reg = XEXP (x, 0);
+  rtx op0 = XEXP (x, 0);
+  rtx op1 = XEXP (x, 1);
   enum riscv_address_type type;
-  rtx offset = XEXP (x, 1);
   int shift;
+  rtx reg = op0;
+  rtx offset = op1;
 
   if (!riscv_valid_base_register_p (reg, mode, strict_p))
-    return false;
+    {
+      reg = op1;
+      offset = op0;
+      if (!riscv_valid_base_register_p (reg, mode, strict_p))
+	return false;
+    }
 
   /* (reg:X) */
-  if (REG_P (offset)
+  if ((REG_P (offset) || SUBREG_P (offset))
       && GET_MODE (offset) == Xmode)
     {
       type = ADDRESS_REG_REG;
       shift = 0;
       offset = offset;
     }
-  /* (zero_extend:DI (reg:SI)) */
-  else if (GET_CODE (offset) == ZERO_EXTEND
+  /* (any_extend:DI (reg:SI)) */
+  else if (TARGET_64BIT
+	   && (GET_CODE (offset) == SIGN_EXTEND
+	       || GET_CODE (offset) == ZERO_EXTEND)
 	   && GET_MODE (offset) == DImode
 	   && GET_MODE (XEXP (offset, 0)) == SImode)
     {
-      type = ADDRESS_REG_UREG;
+      type = (GET_CODE (offset) == SIGN_EXTEND)
+	     ? ADDRESS_REG_REG : ADDRESS_REG_UREG;
       shift = 0;
       offset = XEXP (offset, 0);
     }
+  /* (mult:X (reg:X) (const_int scale)) */
+  else if (GET_CODE (offset) == MULT
+	   && GET_MODE (offset) == Xmode
+	   && REG_P (XEXP (offset, 0))
+	   && GET_MODE (XEXP (offset, 0)) == Xmode
+	   && CONST_INT_P (XEXP (offset, 1))
+	   && pow2p_hwi (INTVAL (XEXP (offset, 1)))
+	   && IN_RANGE (exact_log2 (INTVAL (XEXP (offset, 1))), 1, 3))
+    {
+      type = ADDRESS_REG_REG;
+      shift = exact_log2 (INTVAL (XEXP (offset, 1)));
+      offset = XEXP (offset, 0);
+    }
+  /* (mult:DI (any_extend:DI (reg:SI)) (const_int scale)) */
+  else if (TARGET_64BIT
+	   && GET_CODE (offset) == MULT
+	   && GET_MODE (offset) == DImode
+	   && (GET_CODE (XEXP (offset, 0)) == SIGN_EXTEND
+	       || GET_CODE (XEXP (offset, 0)) == ZERO_EXTEND)
+	   && GET_MODE (XEXP (offset, 0)) == DImode
+	   && REG_P (XEXP (XEXP (offset, 0), 0))
+	   && GET_MODE (XEXP (XEXP (offset, 0), 0)) == SImode
+	   && CONST_INT_P (XEXP (offset, 1)))
+    {
+      type = (GET_CODE (XEXP (offset, 0)) == SIGN_EXTEND)
+	     ? ADDRESS_REG_REG : ADDRESS_REG_UREG;
+      shift = exact_log2 (INTVAL (XEXP (x, 1)));
+      offset = XEXP (XEXP (x, 0), 0);
+    }
   /* (ashift:X (reg:X) (const_int shift)) */
   else if (GET_CODE (offset) == ASHIFT
 	   && GET_MODE (offset) == Xmode
@@ -646,23 +685,46 @@  th_memidx_classify_address_index (struct riscv_address_info *info, rtx x,
       shift = INTVAL (XEXP (offset, 1));
       offset = XEXP (offset, 0);
     }
-  /* (ashift:DI (zero_extend:DI (reg:SI)) (const_int shift)) */
-  else if (GET_CODE (offset) == ASHIFT
+  /* (ashift:DI (any_extend:DI (reg:SI)) (const_int shift)) */
+  else if (TARGET_64BIT
+	   && GET_CODE (offset) == ASHIFT
 	   && GET_MODE (offset) == DImode
-	   && GET_CODE (XEXP (offset, 0)) == ZERO_EXTEND
+	   && (GET_CODE (XEXP (offset, 0)) == SIGN_EXTEND
+	       || GET_CODE (XEXP (offset, 0)) == ZERO_EXTEND)
 	   && GET_MODE (XEXP (offset, 0)) == DImode
 	   && GET_MODE (XEXP (XEXP (offset, 0), 0)) == SImode
 	   && CONST_INT_P (XEXP (offset, 1))
 	   && IN_RANGE(INTVAL (XEXP (offset, 1)), 0, 3))
     {
-      type = ADDRESS_REG_UREG;
+      type = (GET_CODE (XEXP (offset, 0)) == SIGN_EXTEND)
+	     ? ADDRESS_REG_REG : ADDRESS_REG_UREG;
       shift = INTVAL (XEXP (offset, 1));
       offset = XEXP (XEXP (offset, 0), 0);
     }
+  /* (and:X (mult:X (reg:X) (const_int scale)) (const_int mask)) */
+  else if (TARGET_64BIT
+	   && GET_CODE (offset) == AND
+	   && GET_MODE (offset) == DImode
+	   && GET_CODE (XEXP (offset, 0)) == MULT
+	   && GET_MODE (XEXP (offset, 0)) == DImode
+	   && REG_P (XEXP (XEXP (offset, 0), 0))
+	   && GET_MODE (XEXP (XEXP (offset, 0), 0)) == DImode
+	   && CONST_INT_P (XEXP (XEXP (offset, 0), 1))
+	   && pow2p_hwi (INTVAL (XEXP (XEXP (offset, 0), 1)))
+	   && IN_RANGE (exact_log2 (INTVAL (XEXP (XEXP (offset, 0), 1))), 1, 3)
+	   && CONST_INT_P (XEXP (offset, 1))
+	   && INTVAL (XEXP (offset, 1))
+	      >> exact_log2 (INTVAL (XEXP (XEXP (offset, 0), 1))) == 0xffffffff)
+    {
+      type = ADDRESS_REG_UREG;
+      shift = exact_log2 (INTVAL (XEXP (XEXP (offset, 0), 1)));
+      offset = XEXP (XEXP (offset, 0), 0);
+    }
   else
     return false;
 
-  if (!strict_p && GET_CODE (offset) == SUBREG)
+  if (!strict_p && SUBREG_P (offset)
+      && GET_MODE (SUBREG_REG (offset)) == SImode)
     offset = SUBREG_REG (offset);
 
   if (!REG_P (offset)
@@ -774,7 +836,7 @@  th_fmemidx_output_index (rtx dest, rtx src, machine_mode mode, bool load)
   rtx x = th_get_move_mem_addr (dest, src, load);
 
   /* Validate x.  */
-  if (!th_memidx_classify_address_index (&info, x, mode, false))
+  if (!th_memidx_classify_address_index (&info, x, mode, reload_completed))
     return NULL;
 
   int index = exact_log2 (GET_MODE_SIZE (mode).to_constant ()) - 2;
diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
index 93becb2843bd..2a3af76b55c2 100644
--- a/gcc/config/riscv/thead.md
+++ b/gcc/config/riscv/thead.md
@@ -456,26 +456,6 @@  (define_insn "*th_mempair_load_zero_extendsidi2"
 
 ;; XTheadMemIdx
 
-;; Help reload to add a displacement for the base register.
-;; In the case `zext(*(uN*)(base+(zext(rN)<<1)))` LRA splits
-;; off two new instructions: a) `new_base = base + disp`, and
-;; b) `index = zext(rN)<<1`.  The index calculation has no
-;; corresponding instruction pattern and needs this insn_and_split
-;; to recover.
-
-(define_insn_and_split "*th_memidx_operand"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-     (ashift:DI
-       (zero_extend:DI (match_operand:SI 1 "register_operand" "r"))
-       (match_operand 2 "const_int_operand" "n")))]
-  "TARGET_64BIT && TARGET_XTHEADMEMIDX && lra_in_progress"
-  "#"
-  ""
-  [(set (match_dup 0) (zero_extend:DI (match_dup 1)))
-   (set (match_dup 0) (ashift:DI (match_dup 0) (match_dup 2)))]
-  ""
-  [(set_attr "type" "bitmanip")])
-
 (define_insn "*th_memidx_zero_extendqi<SUPERQI:mode>2"
   [(set (match_operand:SUPERQI 0 "register_operand" "=r,r,r,r,r,r")
 	(zero_extend:SUPERQI
@@ -654,401 +634,4 @@  (define_insn "*th_memidx_bb_extendqi<SUPERQI:mode>2"
   [(set_attr "move_type" "shift_shift,load,load,load,load,load")
    (set_attr "mode" "<SUPERQI:MODE>")])
 
-;; All modes that are supported by XTheadMemIdx
-(define_mode_iterator TH_M_ANYI [(QI "TARGET_XTHEADMEMIDX")
-                                 (HI "TARGET_XTHEADMEMIDX")
-                                 (SI "TARGET_XTHEADMEMIDX")
-                                 (DI "TARGET_64BIT && TARGET_XTHEADMEMIDX")])
-
-;; All modes that are supported by XTheadFMemIdx
-(define_mode_iterator TH_M_ANYF [(SF "TARGET_HARD_FLOAT && TARGET_XTHEADFMEMIDX")
-                                 (DF "TARGET_DOUBLE_FLOAT && TARGET_XTHEADFMEMIDX")])
-
-;; All non-extension modes that are supported by XTheadMemIdx
-(define_mode_iterator TH_M_NOEXTI [(SI "!TARGET_64BIT && TARGET_XTHEADMEMIDX")
-                                   (DI "TARGET_64BIT && TARGET_XTHEADMEMIDX")])
-
-;; All non-extension modes that are supported by XTheadFMemIdx
-(define_mode_iterator TH_M_NOEXTF [(SF "TARGET_HARD_FLOAT && TARGET_XTHEADFMEMIDX")
-                                   (DF "TARGET_DOUBLE_FLOAT && TARGET_XTHEADFMEMIDX")])
-
-;; XTheadMemIdx optimizations
-;; All optimizations attempt to improve the operand utilization of
-;; XTheadMemIdx instructions, where one sign or zero extended
-;; register-index-operand can be shifted left by a 2-bit immediate.
-;;
-;; The basic idea is the following optimization:
-;; (set (reg 0) (op (reg 1) (imm 2)))
-;; (set (reg 3) (mem (plus (reg 0) (reg 4)))
-;; ==>
-;; (set (reg 3) (mem (plus (reg 4) (op2 (reg 1) (imm 2))))
-;; This optimization only valid if (reg 0) has no further uses.
-;;
-;; The three-instruction case is as follows:
-;; (set (reg 0) (op1 (reg 1) (imm 2)))
-;; (set (reg 3) (op2 (reg 0) (imm 4)))
-;; (set (reg 5) (mem (plus (reg 3) (reg 6)))
-;; ==>
-;; (set (reg 5) (mem (plus (reg 6) (op2 (reg 1) (imm 2/4)))))
-;; This optimization is only valid if (reg 0) and (reg 3) have no further uses.
-;;
-;; The optimization cases are:
-;; I) fold 2-bit ashift of register offset into mem-plus RTX
-;; US) fold 32-bit zero-extended (shift) offset into mem-plus
-;; UZ) fold 32-bit zero-extended (zext) offset into mem-plus
-;;
-;; The first optimization case is targeting the th.lr<MODE> instructions.
-;; The other optimization cases are targeting the th.lur<MODE> instructions
-;; and have to consider two forms of zero-extensions:
-;; - ashift-32 + lshiftrt-{29..32} if there are no zero-extension instructions.
-;;   Left-shift amounts of 29..31 indicate a left-shifted zero-extended value.
-;; - zero-extend32 if there are zero-extension instructions (XTheadBb or Zbb).
-;;
-;; We always have three peephole passes per optimization case:
-;; a) no-extended (X) word-load
-;; b) any-extend (SUBX) word-load
-;; c) store
-;;
-;; Note, that SHIFTs will be converted to MULTs during combine.
-
-(define_insn_and_split "*th_memidx_I_a"
-  [(set (match_operand:TH_M_NOEXTI 0 "register_operand" "=r")
-        (mem:TH_M_NOEXTI (plus:X
-          (mult:X (match_operand:X 1 "register_operand" "r")
-                  (match_operand:QI 2 "immediate_operand" "i"))
-          (match_operand:X 3 "register_operand" "r"))))]
-  "TARGET_XTHEADMEMIDX
-   && CONST_INT_P (operands[2])
-   && pow2p_hwi (INTVAL (operands[2]))
-   && IN_RANGE (exact_log2 (INTVAL (operands[2])), 1, 3)"
-  "#"
-  "&& 1"
-  [(set (match_dup 0)
-        (mem:TH_M_NOEXTI (plus:X
-          (match_dup 3)
-          (ashift:X (match_dup 1) (match_dup 2)))))]
-  { operands[2] = GEN_INT (exact_log2 (INTVAL (operands [2])));
-  }
-)
-
-(define_insn_and_split "*th_memidx_I_b"
-  [(set (match_operand:X 0 "register_operand" "=r")
-        (any_extend:X (mem:SUBX (plus:X
-          (mult:X (match_operand:X 1 "register_operand" "r")
-                  (match_operand:QI 2 "immediate_operand" "i"))
-          (match_operand:X 3 "register_operand" "r")))))]
-  "TARGET_XTHEADMEMIDX
-   && CONST_INT_P (operands[2])
-   && pow2p_hwi (INTVAL (operands[2]))
-   && IN_RANGE (exact_log2 (INTVAL (operands[2])), 1, 3)"
-  "#"
-  "&& 1"
-  [(set (match_dup 0)
-        (any_extend:X (mem:SUBX (plus:X
-          (match_dup 3)
-          (ashift:X (match_dup 1) (match_dup 2))))))]
-  { operands[2] = GEN_INT (exact_log2 (INTVAL (operands [2])));
-  }
-)
-
-(define_insn_and_split "*th_memidx_I_c"
-  [(set (mem:TH_M_ANYI (plus:X
-          (mult:X (match_operand:X 1 "register_operand" "r")
-                  (match_operand:QI 2 "immediate_operand" "i"))
-          (match_operand:X 3 "register_operand" "r")))
-        (match_operand:TH_M_ANYI 0 "register_operand" "r"))]
-  "TARGET_XTHEADMEMIDX
-   && CONST_INT_P (operands[2])
-   && pow2p_hwi (INTVAL (operands[2]))
-   && IN_RANGE (exact_log2 (INTVAL (operands[2])), 1, 3)"
-  "#"
-  "&& 1"
-  [(set (mem:TH_M_ANYI (plus:X
-          (match_dup 3)
-          (ashift:X (match_dup 1) (match_dup 2))))
-        (match_dup 0))]
-  { operands[2] = GEN_INT (exact_log2 (INTVAL (operands [2])));
-  }
-)
-
-(define_insn_and_split "*th_memidx_US_a"
-  [(set (match_operand:TH_M_NOEXTI 0 "register_operand" "=r")
-        (mem:TH_M_NOEXTI (plus:DI
-          (and:DI
-            (mult:DI (match_operand:DI 1 "register_operand" "r")
-                     (match_operand:QI 2 "immediate_operand" "i"))
-            (match_operand:DI 3 "immediate_operand" "i"))
-          (match_operand:DI 4 "register_operand" "r"))))]
-  "TARGET_64BIT && TARGET_XTHEADMEMIDX
-   && CONST_INT_P (operands[2])
-   && pow2p_hwi (INTVAL (operands[2]))
-   && IN_RANGE (exact_log2 (INTVAL (operands[2])), 1, 3)
-   && CONST_INT_P (operands[3])
-   && (INTVAL (operands[3]) >> exact_log2 (INTVAL (operands[2]))) == 0xffffffff"
-  "#"
-  "&& 1"
-  [(set (match_dup 0)
-        (mem:TH_M_NOEXTI (plus:DI
-          (match_dup 4)
-          (ashift:DI (zero_extend:DI (match_dup 1)) (match_dup 2)))))]
-  { operands[1] = gen_lowpart (SImode, operands[1]);
-    operands[2] = GEN_INT (exact_log2 (INTVAL (operands [2])));
-  }
-)
-
-(define_insn_and_split "*th_memidx_US_b"
-  [(set (match_operand:X 0 "register_operand" "=r")
-        (any_extend:X (mem:SUBX (plus:DI
-          (and:DI
-            (mult:DI (match_operand:DI 1 "register_operand" "r")
-                     (match_operand:QI 2 "immediate_operand" "i"))
-            (match_operand:DI 3 "immediate_operand" "i"))
-          (match_operand:DI 4 "register_operand" "r")))))]
-  "TARGET_64BIT && TARGET_XTHEADMEMIDX
-   && CONST_INT_P (operands[2])
-   && pow2p_hwi (INTVAL (operands[2]))
-   && IN_RANGE (exact_log2 (INTVAL (operands[2])), 1, 3)
-   && CONST_INT_P (operands[3])
-   && (INTVAL (operands[3]) >> exact_log2 (INTVAL (operands[2]))) == 0xffffffff"
-  "#"
-  "&& 1"
-  [(set (match_dup 0)
-        (any_extend:X (mem:SUBX (plus:DI
-          (match_dup 4)
-          (ashift:DI (zero_extend:DI (match_dup 1)) (match_dup 2))))))]
-  { operands[1] = gen_lowpart (SImode, operands[1]);
-    operands[2] = GEN_INT (exact_log2 (INTVAL (operands [2])));
-  }
-)
-
-(define_insn_and_split "*th_memidx_US_c"
-  [(set (mem:TH_M_ANYI (plus:DI
-          (and:DI
-            (mult:DI (match_operand:DI 1 "register_operand" "r")
-                     (match_operand:QI 2 "immediate_operand" "i"))
-            (match_operand:DI 3 "immediate_operand" "i"))
-          (match_operand:DI 4 "register_operand" "r")))
-        (match_operand:TH_M_ANYI 0 "register_operand" "r"))]
-  "TARGET_64BIT && TARGET_XTHEADMEMIDX
-   && CONST_INT_P (operands[2])
-   && pow2p_hwi (INTVAL (operands[2]))
-   && IN_RANGE (exact_log2 (INTVAL (operands[2])), 1, 3)
-   && CONST_INT_P (operands[3])
-   && (INTVAL (operands[3]) >> exact_log2 (INTVAL (operands[2]))) == 0xffffffff"
-  "#"
-  "&& 1"
-  [(set (mem:TH_M_ANYI (plus:DI
-          (match_dup 4)
-          (ashift:DI (zero_extend:DI (match_dup 1)) (match_dup 2))))
-        (match_dup 0))]
-  { operands[1] = gen_lowpart (SImode, operands[1]);
-    operands[2] = GEN_INT (exact_log2 (INTVAL (operands [2])));
-  }
-)
-
-(define_insn_and_split "*th_memidx_UZ_a"
-  [(set (match_operand:TH_M_NOEXTI 0 "register_operand" "=r")
-        (mem:TH_M_NOEXTI (plus:DI
-          (zero_extend:DI (match_operand:SI 1 "register_operand" "r"))
-          (match_operand:DI 2 "register_operand" "r"))))]
-  "TARGET_64BIT && TARGET_XTHEADMEMIDX"
-  "#"
-  "&& 1"
-  [(set (match_dup 0)
-        (mem:TH_M_NOEXTI (plus:DI
-          (match_dup 2)
-          (zero_extend:DI (match_dup 1)))))]
-)
-
-(define_insn_and_split "*th_memidx_UZ_b"
-  [(set (match_operand:X 0 "register_operand" "=r")
-        (any_extend:X (mem:SUBX (plus:DI
-          (zero_extend:DI (match_operand:SI 1 "register_operand" "r"))
-          (match_operand:DI 2 "register_operand" "r")))))]
-  "TARGET_64BIT && TARGET_XTHEADMEMIDX"
-  "#"
-  "&& 1"
-  [(set (match_dup 0)
-        (any_extend:X (mem:SUBX (plus:DI
-          (match_dup 2)
-          (zero_extend:DI (match_dup 1))))))]
-)
-
-(define_insn_and_split "*th_memidx_UZ_c"
-  [(set (mem:TH_M_ANYI (plus:DI
-          (zero_extend:DI (match_operand:SI 1 "register_operand" "r"))
-          (match_operand:DI 2 "register_operand" "r")))
-        (match_operand:TH_M_ANYI 0 "register_operand" "r"))]
-  "TARGET_64BIT && TARGET_XTHEADMEMIDX"
-  "#"
-  "&& 1"
-  [(set (mem:TH_M_ANYI (plus:DI
-          (match_dup 2)
-          (zero_extend:DI (match_dup 1))))
-        (match_dup 0))]
-)
-
-;; XTheadFMemIdx
-;; Note, that we might get GP registers in FP-mode (reg:DF a2)
-;; which cannot be handled by the XTheadFMemIdx instructions.
-;; This might even happen after register allocation.
-;; We could implement splitters that undo the combiner results
-;; if "after_reload && !HARDFP_REG_P (operands[0])", but this
-;; raises even more questions (e.g. split into what?).
-;; So let's solve this by simply requiring XTheadMemIdx
-;; which provides the necessary instructions to cover this case.
-
-(define_insn "*th_fmemidx_movsf_hardfloat"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,th_m_mir,f,th_m_miu")
-	(match_operand:SF 1 "move_operand"         " th_m_mir,f,th_m_miu,f"))]
-  "TARGET_HARD_FLOAT && TARGET_XTHEADFMEMIDX && TARGET_XTHEADMEMIDX
-   && (register_operand (operands[0], SFmode)
-       || reg_or_0_operand (operands[1], SFmode))"
-  { return riscv_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "fpload,fpstore,fpload,fpstore")
-   (set_attr "mode" "SF")])
-
-(define_insn "*th_fmemidx_movdf_hardfloat_rv64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,th_m_mir,f,th_m_miu")
-	(match_operand:DF 1 "move_operand"         " th_m_mir,f,th_m_miu,f"))]
-  "TARGET_64BIT && TARGET_DOUBLE_FLOAT && TARGET_XTHEADFMEMIDX
-   && TARGET_XTHEADMEMIDX
-   && (register_operand (operands[0], DFmode)
-       || reg_or_0_operand (operands[1], DFmode))"
-  { return riscv_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "fpload,fpstore,fpload,fpstore")
-   (set_attr "mode" "DF")])
-
-;; XTheadFMemIdx optimizations
-;; Similar like XTheadMemIdx optimizations, but less cases.
-
-(define_insn_and_split "*th_fmemidx_I_a"
-  [(set (match_operand:TH_M_NOEXTF 0 "register_operand" "=f")
-        (mem:TH_M_NOEXTF (plus:X
-          (mult:X (match_operand:X 1 "register_operand" "r")
-                  (match_operand:QI 2 "immediate_operand" "i"))
-          (match_operand:X 3 "register_operand" "r"))))]
-  "TARGET_XTHEADMEMIDX && TARGET_XTHEADFMEMIDX
-   && CONST_INT_P (operands[2])
-   && pow2p_hwi (INTVAL (operands[2]))
-   && IN_RANGE (exact_log2 (INTVAL (operands[2])), 1, 3)"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-        (mem:TH_M_NOEXTF (plus:X
-          (match_dup 3)
-          (ashift:X (match_dup 1) (match_dup 2)))))]
-  { operands[2] = GEN_INT (exact_log2 (INTVAL (operands [2])));
-  }
-  [(set_attr "move_type" "fpload")
-   (set_attr "mode" "<UNITMODE>")
-   (set_attr "type" "fmove")
-   (set (attr "length") (const_int 16))])
-
-(define_insn_and_split "*th_fmemidx_I_c"
-  [(set (mem:TH_M_ANYF (plus:X
-          (mult:X (match_operand:X 1 "register_operand" "r")
-                  (match_operand:QI 2 "immediate_operand" "i"))
-          (match_operand:X 3 "register_operand" "r")))
-        (match_operand:TH_M_ANYF 0 "register_operand" "f"))]
-  "TARGET_XTHEADMEMIDX && TARGET_XTHEADFMEMIDX
-   && CONST_INT_P (operands[2])
-   && pow2p_hwi (INTVAL (operands[2]))
-   && IN_RANGE (exact_log2 (INTVAL (operands[2])), 1, 3)"
-  "#"
-  "&& 1"
-  [(set (mem:TH_M_ANYF (plus:X
-          (match_dup 3)
-          (ashift:X (match_dup 1) (match_dup 2))))
-        (match_dup 0))]
-  { operands[2] = GEN_INT (exact_log2 (INTVAL (operands [2])));
-  }
-)
-
-(define_insn_and_split "*th_fmemidx_US_a"
-  [(set (match_operand:TH_M_NOEXTF 0 "register_operand" "=f")
-        (mem:TH_M_NOEXTF (plus:DI
-          (and:DI
-            (mult:DI (match_operand:DI 1 "register_operand" "r")
-                     (match_operand:QI 2 "immediate_operand" "i"))
-            (match_operand:DI 3 "immediate_operand" "i"))
-          (match_operand:DI 4 "register_operand" "r"))))]
-  "TARGET_64BIT && TARGET_XTHEADMEMIDX && TARGET_XTHEADFMEMIDX
-   && CONST_INT_P (operands[2])
-   && pow2p_hwi (INTVAL (operands[2]))
-   && IN_RANGE (exact_log2 (INTVAL (operands[2])), 1, 3)
-   && CONST_INT_P (operands[3])
-   && (INTVAL (operands[3]) >> exact_log2 (INTVAL (operands[2]))) == 0xffffffff"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-        (mem:TH_M_NOEXTF (plus:DI
-          (match_dup 4)
-          (ashift:DI (zero_extend:DI (match_dup 1)) (match_dup 2)))))]
-  { operands[1] = gen_lowpart (SImode, operands[1]);
-    operands[2] = GEN_INT (exact_log2 (INTVAL (operands [2])));
-  }
-  [(set_attr "move_type" "fpload")
-   (set_attr "mode" "<UNITMODE>")
-   (set_attr "type" "fmove")
-   (set (attr "length") (const_int 16))])
-
-(define_insn_and_split "*th_fmemidx_US_c"
-  [(set (mem:TH_M_ANYF (plus:DI
-          (and:DI
-            (mult:DI (match_operand:DI 1 "register_operand" "r")
-                     (match_operand:QI 2 "immediate_operand" "i"))
-            (match_operand:DI 3 "immediate_operand" "i"))
-          (match_operand:DI 4 "register_operand" "r")))
-        (match_operand:TH_M_ANYF 0 "register_operand" "f"))]
-  "TARGET_64BIT && TARGET_XTHEADMEMIDX && TARGET_XTHEADFMEMIDX
-   && CONST_INT_P (operands[2])
-   && pow2p_hwi (INTVAL (operands[2]))
-   && IN_RANGE (exact_log2 (INTVAL (operands[2])), 1, 3)
-   && CONST_INT_P (operands[3])
-   && (INTVAL (operands[3]) >> exact_log2 (INTVAL (operands[2]))) == 0xffffffff"
-  "#"
-  "&& 1"
-  [(set (mem:TH_M_ANYF (plus:DI
-          (match_dup 4)
-          (ashift:DI (zero_extend:DI (match_dup 1)) (match_dup 2))))
-        (match_dup 0))]
-  { operands[1] = gen_lowpart (SImode, operands[1]);
-    operands[2] = GEN_INT (exact_log2 (INTVAL (operands [2])));
-  }
-)
-
-(define_insn_and_split "*th_fmemidx_UZ_a"
-  [(set (match_operand:TH_M_NOEXTF 0 "register_operand" "=f")
-        (mem:TH_M_NOEXTF (plus:DI
-          (zero_extend:DI (match_operand:SI 1 "register_operand" "r"))
-          (match_operand:DI 2 "register_operand" "r"))))]
-  "TARGET_64BIT && TARGET_XTHEADMEMIDX && TARGET_XTHEADFMEMIDX
-   && (!HARD_REGISTER_NUM_P (REGNO (operands[0])) || HARDFP_REG_P (REGNO (operands[0])))"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-        (mem:TH_M_NOEXTF (plus:DI
-          (match_dup 2)
-          (zero_extend:DI (match_dup 1)))))]
-  ""
-  [(set_attr "move_type" "fpload")
-   (set_attr "mode" "<UNITMODE>")
-   (set_attr "type" "fmove")
-   (set (attr "length") (const_int 16))])
-
-(define_insn_and_split "*th_fmemidx_UZ_c"
-  [(set (mem:TH_M_ANYF (plus:DI
-          (zero_extend:DI (match_operand:SI 1 "register_operand" "r"))
-          (match_operand:DI 2 "register_operand" "r")))
-        (match_operand:TH_M_ANYF 0 "register_operand" "f"))]
-  "TARGET_64BIT && TARGET_XTHEADMEMIDX && TARGET_XTHEADFMEMIDX"
-  "#"
-  "&& 1"
-  [(set (mem:TH_M_ANYF (plus:DI
-          (match_dup 2)
-          (zero_extend:DI (match_dup 1))))
-        (match_dup 0))]
-)
-
 (include "thead-peephole.md")
diff --git a/gcc/testsuite/gcc.target/riscv/pr116131.c b/gcc/testsuite/gcc.target/riscv/pr116131.c
new file mode 100644
index 000000000000..4d644c37cde7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr116131.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-flto" "-O0" "-Og" "-Os" "-Oz" } } */
+/* { dg-options "-march=rv64gc_xtheadmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_xtheadmemidx" { target { rv32 } } } */
+
+volatile long long a;
+int b;
+int c[1];
+
+void d()
+{
+  c[a] = b;
+}
+
+/* { dg-final { scan-assembler "th.srw\t" } } */