[10/11] riscv: thead: Add support for the XTheadMemIdx ISA extension

Message ID 20230428062314.2995571-1-christoph.muellner@vrull.eu
State New
Series Improvements for XThead* support

Commit Message

Christoph Müllner April 28, 2023, 6:23 a.m. UTC
  From: Christoph Müllner <christoph.muellner@vrull.eu>

The XTheadMemIdx ISA extension provides additional load and store
instructions with new addressing modes.

The following memory access types are supported:
* ltype = [b,bu,h,hu,w,wu,d]
* stype = [b,h,w,d]

The following addressing modes are supported:
* immediate offset with PRE_MODIFY or POST_MODIFY (22 instructions):
  l<ltype>.ia, l<ltype>.ib, s<stype>.ia, s<stype>.ib
* register offset with additional immediate offset (11 instructions):
  lr<ltype>, sr<stype>
* zero-extended register offset with additional immediate offset
  (11 instructions): lur<ltype>, sur<stype>

The RISC-V base ISA does not support index registers, so the changes
are kept separate from the RISC-V standard support.
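
For illustration (not part of the patch), the register-offset form
targets code like the function below; th.lrw is the expected selection
at -O2 with XTheadMemIdx enabled, the actual code generation depends on
the options used:

  /* Expected to become a single th.lrw, i.e. a load from
     base + (idx << 2), instead of a shift/add/lw sequence.  */
  int
  load_indexed (int *base, long idx)
  {
    return base[idx];
  }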

Similar to other extensions (Zbb, XTheadBb), this patch needs to
prevent the conversion of sign-extensions/zero-extensions into
shift instructions. The case of the zero-extended register offset
addressing mode is handled by a new peephole pass.

Handling the different cases of extensions results in a couple of INSNs
that look redundant at first view, but they are just the equivalent
of what we already have for Zbb. The only difference is that we have
many more load instructions.

To fully utilize the capabilities of the instructions, there are
a few new peephole passes which fold shift amounts into the memory
address RTX if possible. The added tests ensure that this feature
won't regress without notice.
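
As a sketch of the zero-extended, scaled case that these peepholes
target (illustrative only; th.lurd is the expected result on RV64 with
XTheadMemIdx, the actual code generation may differ):

  /* The 32-bit index is zero-extended to 64 bits and scaled by 8; the
     peepholes try to fold both the extension and the shift into a
     single th.lurd (load with zero-extended, shifted register offset).  */
  long
  load_scaled_u32 (long *base, unsigned int idx)
  {
    return base[idx];
  }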

We already have a constraint named 'th_f_fmv'; the new constraints
therefore follow this naming pattern and have the same length, as
required ('th_m_mia', 'th_m_mib', 'th_m_mir', 'th_m_miu').

gcc/ChangeLog:

	* config/riscv/constraints.md (th_m_mia): New constraint.
	(th_m_mib): Likewise.
	(th_m_mir): Likewise.
	(th_m_miu): Likewise.
	* config/riscv/riscv-protos.h (enum riscv_address_type):
	Add new address types ADDRESS_REG_REG, ADDRESS_REG_UREG,
	and ADDRESS_REG_WB and their documentation.
	(struct riscv_address_info): Add new field 'shift' and
	document the field usage for the new address types.
	(riscv_valid_base_register_p): New prototype.
	(th_memidx_legitimate_modify_p): Likewise.
	(th_memidx_legitimate_index_p): Likewise.
	(th_classify_address): Likewise.
	(th_output_move): Likewise.
	(th_print_operand_address): Likewise.
	* config/riscv/riscv.cc (riscv_index_reg_class):
	Return GR_REGS for XTheadMemIdx.
	(riscv_regno_ok_for_index_p): Add support for XTheadMemIdx.
	(riscv_classify_address): Call th_classify_address() on top.
	(riscv_output_move): Call th_output_move() on top.
	(riscv_print_operand_address): Call th_print_operand_address()
	on top.
	* config/riscv/riscv.h (HAVE_POST_MODIFY_DISP): New macro.
	(HAVE_PRE_MODIFY_DISP): Likewise.
	* config/riscv/riscv.md (zero_extendqi<SUPERQI:mode>2): Disable
	for XTheadMemIdx.
	(*zero_extendqi<SUPERQI:mode>2_internal): Convert to expand,
	create INSN with same name and disable it for XTheadMemIdx.
	(extendsidi2): Likewise.
	(*extendsidi2_internal): Disable for XTheadMemIdx.
	* config/riscv/thead-peephole.md: Add helper peephole passes.
	* config/riscv/thead.cc (valid_signed_immediate): New helper
	function.
	(th_memidx_classify_address_modify): New function.
	(th_memidx_legitimate_modify_p): Likewise.
	(th_memidx_output_modify): Likewise.
	(is_memidx_mode): Likewise.
	(th_memidx_classify_address_index): Likewise.
	(th_memidx_legitimate_index_p): Likewise.
	(th_memidx_output_index): Likewise.
	(th_classify_address): Likewise.
	(th_output_move): Likewise.
	(th_print_operand_address): Likewise.
	* config/riscv/thead.md (*th_memidx_mov<mode>2):
	New INSN.
	(*th_memidx_zero_extendqi<SUPERQI:mode>2): Likewise.
	(*th_memidx_extendsidi2): Likewise.
	(*th_memidx_zero_extendsidi2): Likewise.
	(*th_memidx_zero_extendhi<GPR:mode>2): Likewise.
	(*th_memidx_extend<SHORT:mode><SUPERQI:mode>2): Likewise.
	(*th_memidx_bb_zero_extendsidi2): Likewise.
	(*th_memidx_bb_zero_extendhi<GPR:mode>2): Likewise.
	(*th_memidx_bb_extendhi<GPR:mode>2): Likewise.
	(*th_memidx_bb_extendqi<SUPERQI:mode>2): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/xtheadmemidx-helpers.h: New test.
	* gcc.target/riscv/xtheadmemidx-index-update.c: New test.
	* gcc.target/riscv/xtheadmemidx-index-xtheadbb-update.c: New test.
	* gcc.target/riscv/xtheadmemidx-index-xtheadbb.c: New test.
	* gcc.target/riscv/xtheadmemidx-index.c: New test.
	* gcc.target/riscv/xtheadmemidx-modify-xtheadbb.c: New test.
	* gcc.target/riscv/xtheadmemidx-modify.c: New test.
	* gcc.target/riscv/xtheadmemidx-uindex-update.c: New test.
	* gcc.target/riscv/xtheadmemidx-uindex-xtheadbb-update.c: New test.
	* gcc.target/riscv/xtheadmemidx-uindex-xtheadbb.c: New test.
	* gcc.target/riscv/xtheadmemidx-uindex.c: New test.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
 gcc/config/riscv/constraints.md               |  24 +
 gcc/config/riscv/riscv-protos.h               |  29 ++
 gcc/config/riscv/riscv.cc                     |  22 +-
 gcc/config/riscv/riscv.h                      |   3 +
 gcc/config/riscv/riscv.md                     |  26 +-
 gcc/config/riscv/thead-peephole.md            | 214 +++++++++
 gcc/config/riscv/thead.cc                     | 428 ++++++++++++++++++
 gcc/config/riscv/thead.md                     | 186 +++++++-
 .../gcc.target/riscv/xtheadmemidx-helpers.h   | 152 +++++++
 .../riscv/xtheadmemidx-index-update.c         |  27 ++
 .../xtheadmemidx-index-xtheadbb-update.c      |  27 ++
 .../riscv/xtheadmemidx-index-xtheadbb.c       |  36 ++
 .../gcc.target/riscv/xtheadmemidx-index.c     |  36 ++
 .../riscv/xtheadmemidx-modify-xtheadbb.c      |  74 +++
 .../gcc.target/riscv/xtheadmemidx-modify.c    |  74 +++
 .../riscv/xtheadmemidx-uindex-update.c        |  27 ++
 .../xtheadmemidx-uindex-xtheadbb-update.c     |  27 ++
 .../riscv/xtheadmemidx-uindex-xtheadbb.c      |  44 ++
 .../gcc.target/riscv/xtheadmemidx-uindex.c    |  44 ++
 19 files changed, 1489 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmemidx-index-update.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmemidx-index-xtheadbb-update.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmemidx-index-xtheadbb.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmemidx-index.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmemidx-modify-xtheadbb.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmemidx-modify.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmemidx-uindex-update.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmemidx-uindex-xtheadbb-update.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmemidx-uindex-xtheadbb.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmemidx-uindex.c
  

Comments

Jeff Law June 10, 2023, 5:53 p.m. UTC | #1
On 4/28/23 00:23, Christoph Muellner wrote:
> diff --git a/gcc/config/riscv/thead-peephole.md b/gcc/config/riscv/thead-peephole.md
> index 5b829b5b968..2a4c734a220 100644
> --- a/gcc/config/riscv/thead-peephole.md
> +++ b/gcc/config/riscv/thead-peephole.md
> @@ -72,3 +72,217 @@ (define_peephole2
>   {
>     th_mempair_order_operands (operands, true, SImode);
>   })
> +
> +;; All modes that are supported by XTheadMemIdx
> +(define_mode_iterator TH_M_ANY [QI HI SI (DI "TARGET_64BIT")])
> +
> +;; All non-extension modes that are supported by XTheadMemIdx
> +(define_mode_iterator TH_M_NOEXT [(SI "!TARGET_64BIT") (DI "TARGET_64BIT")])
> +
> +;; XTheadMemIdx overview:
> +;; All peephole passes attempt to improve the operand utilization of
> +;; XTheadMemIdx instructions, where one sign or zero extended
> +;; register-index-operand can be shifted left by a 2-bit immediate.
> +;;
> +;; The basic idea is the following optimization:
> +;; (set (reg 0) (op (reg 1) (imm 2)))
> +;; (set (reg 3) (mem (plus (reg 0) (reg 4)))
> +;; ==>
> +;; (set (reg 3) (mem (plus (reg 4) (op2 (reg 1) (imm 2))))
> +;; This optimization is only valid if (reg 0) has no further uses.
Couldn't this be done by combine if you created define_insn patterns 
rather than define_peephole2 patterns?  Similarly for the other cases 
handled here.

Do you need to define HAVE_{PRE,POST}_MODIFY?  I see 
HAVE_{PRE,POST}_MODIFY_DISP is defined.  If that's sufficient, great.
This stuff seems to have changed since I last looked at it.

So I have to ask.  Is the extension documented?  If so, we probably 
should have a link to it.  What's the status of hardware availability
with this extension?

Jeff
  
Christoph Müllner June 28, 2023, 12:39 p.m. UTC | #2
On Sat, Jun 10, 2023 at 7:53 PM Jeff Law <jeffreyalaw@gmail.com> wrote:
> > +;; XTheadMemIdx overview:
> > +;; All peephole passes attempt to improve the operand utilization of
> > +;; XTheadMemIdx instructions, where one sign or zero extended
> > +;; register-index-operand can be shifted left by a 2-bit immediate.
> > +;;
> > +;; The basic idea is the following optimization:
> > +;; (set (reg 0) (op (reg 1) (imm 2)))
> > +;; (set (reg 3) (mem (plus (reg 0) (reg 4)))
> > +;; ==>
> > +;; (set (reg 3) (mem (plus (reg 4) (op2 (reg 1) (imm 2))))
> > +;; This optimization is only valid if (reg 0) has no further uses.
> Couldn't this be done by combine if you created define_insn patterns
> rather than define_peephole2 patterns?  Similarly for the other cases
> handled here.

I was inspired by XTheadMemPair, which merges two memory accesses
into a mem-pair instruction (and which got inspiration from
gcc/config/aarch64/aarch64-ldpstp.md).

I don't see the benefit of using combine or peephole, but I can change
if necessary. At least for the provided test cases, the implementation
works quite well.

>
> Do you need to define HAVE_{PRE,POST}_MODIFY?  I see
> HAVE_{PRE,POST}_MODIFY_DISP is defined.  If that's sufficient, great.
> This stuff seems to have changed since I last looked at it.

There is no HAVE_{PRE,POST}_MODIFY, only HAVE_{PRE,POST}_MODIFY_REG (register)
and HAVE_{PRE,POST}_MODIFY_DISP (constant displacement).
The defined macros match the expectations of gcc/auto-inc-dec.cc
and also match the documentation.
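
(As a rough sketch of what this enables, not one of the added tests: a
pointer-walking loop like the one below is the kind of code where
gcc/auto-inc-dec.cc can now form POST_MODIFY addresses, which are then
expected to map to th.lwia.)

  int
  sum_words (int *p, int n)
  {
    int s = 0;
    for (int i = 0; i < n; i++)
      s += *p++;  /* load plus pointer increment: a post-modify candidate */
    return s;
  }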


> So I have to ask.  Is the extension documented?  If so, we probably
> should have a link to it.  What's the status of hardware availability
> with this extension?

The basic support for this extension is already merged.

The documentation can be found here:
  https://github.com/T-head-Semi/thead-extension-spec/tree/master

The extension's name and a link to the documentation have also been
registered here:
  https://github.com/riscv-non-isa/riscv-toolchain-conventions#list-of-vendor-extensions

The XTheadMemIdx extension is part of the T-Head C906 and C910 SoCs.
The C906 was launched in October 2021.

Thanks
Christoph
  
Jeff Law June 28, 2023, 6:23 p.m. UTC | #3
On 6/28/23 06:39, Christoph Müllner wrote:

>>> +;; XTheadMemIdx overview:
>>> +;; All peephole passes attempt to improve the operand utilization of
>>> +;; XTheadMemIdx instructions, where one sign or zero extended
>>> +;; register-index-operand can be shifted left by a 2-bit immediate.
>>> +;;
>>> +;; The basic idea is the following optimization:
>>> +;; (set (reg 0) (op (reg 1) (imm 2)))
>>> +;; (set (reg 3) (mem (plus (reg 0) (reg 4)))
>>> +;; ==>
>>> +;; (set (reg 3) (mem (plus (reg 4) (op2 (reg 1) (imm 2))))
>>> +;; This optimization is only valid if (reg 0) has no further uses.
>> Couldn't this be done by combine if you created define_insn patterns
>> rather than define_peephole2 patterns?  Similarly for the other cases
>> handled here.
> 
> I was inspired by XTheadMemPair, which merges two memory accesses
> into a mem-pair instruction (and which got inspiration from
> gcc/config/aarch64/aarch64-ldpstp.md).
Right.  I'm pretty familiar with those.  They cover a different case, 
specifically the two insns being optimized don't have a true data 
dependency between them.  ie, the first instruction does not produce a 
result used in the second insn.


In the case above there is a data dependency on reg0.  ie, the first 
instruction generates a result used in the second instruction.  combine 
is usually the best place to handle the data dependency case.


> 
> I don't see the benefit of using combine or peephole, but I can change
> if necessary. At least for the provided test cases, the implementation
> works quite well.
Peepholes require the instructions to be consecutive in the stream while 
combine relies on data dependence links and can thus find these 
opportunities even when the two insns we care about are separated by 
unrelated other insns.


Jeff
  
Christoph Müllner June 29, 2023, 7:39 a.m. UTC | #4
On Wed, Jun 28, 2023 at 8:23 PM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
> On 6/28/23 06:39, Christoph Müllner wrote:
>
> >>> +;; XTheadMemIdx overview:
> >>> +;; All peephole passes attempt to improve the operand utilization of
> >>> +;; XTheadMemIdx instructions, where one sign or zero extended
> >>> +;; register-index-operand can be shifted left by a 2-bit immediate.
> >>> +;;
> >>> +;; The basic idea is the following optimization:
> >>> +;; (set (reg 0) (op (reg 1) (imm 2)))
> >>> +;; (set (reg 3) (mem (plus (reg 0) (reg 4)))
> >>> +;; ==>
> >>> +;; (set (reg 3) (mem (plus (reg 4) (op2 (reg 1) (imm 2))))
> >>> +;; This optimization is only valid if (reg 0) has no further uses.
> >> Couldn't this be done by combine if you created define_insn patterns
> >> rather than define_peephole2 patterns?  Similarly for the other cases
> >> handled here.
> >
> > I was inspired by XTheadMemPair, which merges two memory accesses
> > into a mem-pair instruction (and which got inspiration from
> > gcc/config/aarch64/aarch64-ldpstp.md).
> Right.  I'm pretty familiar with those.  They cover a different case,
> specifically the two insns being optimized don't have a true data
> dependency between them.  ie, the first instruction does not produce a
> result used in the second insn.
>
>
> In the case above there is a data dependency on reg0.  ie, the first
> instruction generates a result used in the second instruction.  combine
> is usually the best place to handle the data dependency case.

Ok, understood.

It is a bit of a special case here, because the peephole is restricted
to those cases, where reg0 is not used elsewhere (peep2_reg_dead_p()).
I have not seen how to do this for combiner optimizations.

I found sh_remove_reg_dead_or_unused_notes(), which tests for reg notes
on a given rtx_insn. In our case we have a pattern that matches two insns,
where we have to test if one operand (reg0) is dead or unused after the second
insn. The first insn can be accessed with "curr_insn", but I did not see how to
access the second matching insn. Any ideas or hints?

Thanks,
Christoph



>
>
> >
> > I don't see the benefit of using combine or peephole, but I can change
> > if necessary. At least for the provided test cases, the implementation
> > works quite well.
> Peepholes require the instructions to be consecutive in the stream while
> combine relies on data dependence links and can thus find these
> opportunities even when the two insns we care about are separated by
> unrelated other insns.
>
>
> Jeff
  
Jeff Law June 29, 2023, 2:09 p.m. UTC | #5
On 6/29/23 01:39, Christoph Müllner wrote:
> On Wed, Jun 28, 2023 at 8:23 PM Jeff Law <jeffreyalaw@gmail.com> wrote:
>>
>>
>>
>> On 6/28/23 06:39, Christoph Müllner wrote:
>>
>>>>> +;; XTheadMemIdx overview:
>>>>> +;; All peephole passes attempt to improve the operand utilization of
>>>>> +;; XTheadMemIdx instructions, where one sign or zero extended
>>>>> +;; register-index-operand can be shifted left by a 2-bit immediate.
>>>>> +;;
>>>>> +;; The basic idea is the following optimization:
>>>>> +;; (set (reg 0) (op (reg 1) (imm 2)))
>>>>> +;; (set (reg 3) (mem (plus (reg 0) (reg 4)))
>>>>> +;; ==>
>>>>> +;; (set (reg 3) (mem (plus (reg 4) (op2 (reg 1) (imm 2))))
>>>>> +;; This optimization is only valid if (reg 0) has no further uses.
>>>> Couldn't this be done by combine if you created define_insn patterns
>>>> rather than define_peephole2 patterns?  Similarly for the other cases
>>>> handled here.
>>>
>>> I was inspired by XTheadMemPair, which merges two memory accesses
>>> into a mem-pair instruction (and which got inspiration from
>>> gcc/config/aarch64/aarch64-ldpstp.md).
>> Right.  I'm pretty familiar with those.  They cover a different case,
>> specifically the two insns being optimized don't have a true data
>> dependency between them.  ie, the first instruction does not produce a
>> result used in the second insn.
>>
>>
>> In the case above there is a data dependency on reg0.  ie, the first
>> instruction generates a result used in the second instruction.  combine
>> is usually the best place to handle the data dependency case.
> 
> Ok, understood.
> 
> It is a bit of a special case here, because the peephole is restricted
> to those cases, where reg0 is not used elsewhere (peep2_reg_dead_p()).
> I have not seen how to do this for combiner optimizations.
If the value is used elsewhere, then the combiner will generate a 
parallel with two sets.  If the value dies, then the combiner generates 
the one set.  ie given

(set (t) (op0 (a) (b)))
(set (r) (op1 (c) (t)))

If "t" is dead, then combine will present you with:

(set (r) (op1 (c) (op0 (a) (b))))

If "t" is used elsewhere, then combine will present you with:

(parallel
   [(set (r) (op1 (c) (op0 (a) (b))))
    (set (t) (op0 (a) (b)))])

Which makes perfect sense if you think about it for a while.  If you 
still need "t", then the first sequence simply isn't valid as it doesn't 
preserve that side effect.  Hence it tries to produce a sequence with 
the combined operation, but with the side effect of the first statement 
included as well.
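
(A rough source-level illustration of the two cases; the function names
are made up and the exact RTL depends on the target.  In f_dead the
temporary dies after the load, so combine can offer the single combined
set; in f_live it is also stored, so combine has to offer the parallel
form.)

  long
  f_dead (long *p, long i)
  {
    long *t = p + i;  /* t is not used again after the load */
    return *t;
  }

  long
  f_live (long *p, long i, long **q)
  {
    long *t = p + i;  /* t is still needed ...  */
    *q = t;           /* ... so its set must be kept in the parallel */
    return *t;
  }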


Jeff
  
Christoph Müllner July 6, 2023, 6:48 a.m. UTC | #6
On Thu, Jun 29, 2023 at 4:09 PM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
> On 6/29/23 01:39, Christoph Müllner wrote:
> > On Wed, Jun 28, 2023 at 8:23 PM Jeff Law <jeffreyalaw@gmail.com> wrote:
> >>
> >>
> >>
> >> On 6/28/23 06:39, Christoph Müllner wrote:
> >>
> >>>>> +;; XTheadMemIdx overview:
> >>>>> +;; All peephole passes attempt to improve the operand utilization of
> >>>>> +;; XTheadMemIdx instructions, where one sign or zero extended
> >>>>> +;; register-index-operand can be shifted left by a 2-bit immediate.
> >>>>> +;;
> >>>>> +;; The basic idea is the following optimization:
> >>>>> +;; (set (reg 0) (op (reg 1) (imm 2)))
> >>>>> +;; (set (reg 3) (mem (plus (reg 0) (reg 4)))
> >>>>> +;; ==>
> >>>>> +;; (set (reg 3) (mem (plus (reg 4) (op2 (reg 1) (imm 2))))
> >>>>> +;; This optimization is only valid if (reg 0) has no further uses.
> >>>> Couldn't this be done by combine if you created define_insn patterns
> >>>> rather than define_peephole2 patterns?  Similarly for the other cases
> >>>> handled here.
> >>>
> >>> I was inspired by XTheadMemPair, which merges two memory accesses
> >>> into a mem-pair instruction (and which got inspiration from
> >>> gcc/config/aarch64/aarch64-ldpstp.md).
> >> Right.  I'm pretty familiar with those.  They cover a different case,
> >> specifically the two insns being optimized don't have a true data
> >> dependency between them.  ie, the first instruction does not produce a
> >> result used in the second insn.
> >>
> >>
> >> In the case above there is a data dependency on reg0.  ie, the first
> >> instruction generates a result used in the second instruction.  combine
> >> is usually the best place to handle the data dependency case.
> >
> > Ok, understood.
> >
> > It is a bit of a special case here, because the peephole is restricted
> > to those cases, where reg0 is not used elsewhere (peep2_reg_dead_p()).
> > I have not seen how to do this for combiner optimizations.
> If the value is used elsewhere, then the combiner will generate a
> parallel with two sets.  If the value dies, then the combiner generates
> the one set.  ie given
>
> (set (t) (op0 (a) (b)))
> (set (r) (op1 (c) (t)))
>
> If "t" is dead, then combine will present you with:
>
> (set (r) (op1 (c) (op0 (a) (b))))
>
> If "t" is used elsewhere, then combine will present you with:
>
> (parallel
>    [(set (r) (op1 (c) (op0 (a) (b))))
>     (set (t) (op0 (a) (b)))])
>
> Which makes perfect sense if you think about it for a while.  If you
> still need "t", then the first sequence simply isn't valid as it doesn't
> preserve that side effect.  Hence it tries to produce a sequence with
> the combined operation, but with the side effect of the first statement
> included as well.

Thanks for this!
Of course I was "lucky" and ran into the issue that the patterns did not match,
because of unexpected MULT insns where ASHIFTs were expected.
But after reading enough of combine.cc I understood that this is on purpose
(for addresses) and I have to adjust my INSNs accordingly.
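
(For reference, an illustration of that canonicalization, not taken
from the patch: inside a MEM, scaling by a power of two is canonicalized
as MULT rather than ASHIFT, so a load like the one below is presented to
combine roughly as
  (mem (plus (mult (reg idx) (const_int 8)) (reg base)))
and the define_insn patterns have to match the MULT form.)

  long
  scaled_load (long *base, long idx)
  {
    return base[idx];  /* idx * 8 shows up as MULT inside the address */
  }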

I've changed the patches for XTheadMemIdx and XTheadFMemIdx and will
send out a new series.

Thanks,
Christoph
  
Jeff Law July 6, 2023, 3:28 p.m. UTC | #7
On 7/6/23 00:48, Christoph Müllner wrote:

> 
> Thanks for this!
> Of course I was "lucky" and ran into the issue that the patterns did not match,
> because of unexpected MULT insns where ASHIFTs were expected.
> But after reading enough of combine.cc I understood that this is on purpose
> (for addresses) and I have to adjust my INSNs accordingly.
Yea, it's a wart that the same operation has two different canonical 
forms depending on the context where it shows up :(  It's definitely a wart.

> 
> I've changed the patches for XTheadMemIdx and XTheadFMemIdx and will
> send out a new series.
Sounds good.

Jeff
  

Patch

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index c448e6b37e9..9cf83c0aa8f 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -188,3 +188,27 @@  (define_register_constraint "th_f_fmv" "TARGET_XTHEADFMV ? FP_REGS : NO_REGS"
 
 (define_register_constraint "th_r_fmv" "TARGET_XTHEADFMV ? GR_REGS : NO_REGS"
   "An integer register for XTheadFmv.")
+
+(define_memory_constraint "th_m_mia"
+  "@internal
+   A MEM with a valid address for th.[l|s]*ia instructions."
+  (and (match_code "mem")
+       (match_test "th_memidx_legitimate_modify_p (op, true)")))
+
+(define_memory_constraint "th_m_mib"
+  "@internal
+   A MEM with a valid address for th.[l|s]*ib instructions."
+  (and (match_code "mem")
+       (match_test "th_memidx_legitimate_modify_p (op, false)")))
+
+(define_memory_constraint "th_m_mir"
+  "@internal
+   A MEM with a valid address for th.[l|s]*r* instructions."
+  (and (match_code "mem")
+       (match_test "th_memidx_legitimate_index_p (op, false)")))
+
+(define_memory_constraint "th_m_miu"
+  "@internal
+   A MEM with a valid address for th.[l|s]*ur* instructions."
+  (and (match_code "mem")
+       (match_test "th_memidx_legitimate_index_p (op, true)")))
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index b7417e97d99..d9b3c964285 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -41,6 +41,15 @@  enum riscv_symbol_type {
        A natural register + offset address.  The register satisfies
        riscv_valid_base_register_p and the offset is a const_arith_operand.
 
+   ADDRESS_REG_REG
+       A base register indexed by (optionally scaled) register.
+
+   ADDRESS_REG_UREG
+       A base register indexed by (optionally scaled) zero-extended register.
+
+   ADDRESS_REG_WB
+       A base register indexed by immediate offset with writeback.
+
    ADDRESS_LO_SUM
        A LO_SUM rtx.  The first operand is a valid base register and
        the second operand is a symbolic address.
@@ -52,6 +61,9 @@  enum riscv_symbol_type {
        A constant symbolic address.  */
 enum riscv_address_type {
   ADDRESS_REG,
+  ADDRESS_REG_REG,
+  ADDRESS_REG_UREG,
+  ADDRESS_REG_WB,
   ADDRESS_LO_SUM,
   ADDRESS_CONST_INT,
   ADDRESS_SYMBOLIC
@@ -65,6 +77,13 @@  enum riscv_address_type {
    ADDRESS_REG
        REG is the base register and OFFSET is the constant offset.
 
+   ADDRESS_REG_REG and ADDRESS_REG_UREG
+       REG is the base register and OFFSET is the index register.
+
+   ADDRESS_REG_WB
+       REG is the base register, OFFSET is the constant offset, and
+       shift is the shift amount for the offset.
+
    ADDRESS_LO_SUM
        REG and OFFSET are the operands to the LO_SUM and SYMBOL_TYPE
        is the type of symbol it references.
@@ -76,12 +95,14 @@  struct riscv_address_info {
   rtx reg;
   rtx offset;
   enum riscv_symbol_type symbol_type;
+  int shift;
 };
 
 /* Routines implemented in riscv.cc.  */
 extern enum riscv_symbol_type riscv_classify_symbolic_expression (rtx);
 extern bool riscv_symbolic_constant_p (rtx, enum riscv_symbol_type *);
 extern int riscv_regno_mode_ok_for_base_p (int, machine_mode, bool);
+extern bool riscv_valid_base_register_p (rtx, machine_mode, bool);
 extern enum reg_class riscv_index_reg_class ();
 extern int riscv_regno_ok_for_index_p (int);
 extern int riscv_address_insns (rtx, machine_mode, bool);
@@ -280,6 +301,14 @@  extern void th_mempair_save_restore_regs (rtx[4], bool, machine_mode);
 #ifdef RTX_CODE
 extern const char*
 th_mempair_output_move (rtx[4], bool, machine_mode, RTX_CODE);
+extern bool th_memidx_legitimate_modify_p (rtx);
+extern bool th_memidx_legitimate_modify_p (rtx, bool);
+extern bool th_memidx_legitimate_index_p (rtx);
+extern bool th_memidx_legitimate_index_p (rtx, bool);
+extern bool th_classify_address (struct riscv_address_info *,
+					rtx, machine_mode, bool);
+extern const char *th_output_move (rtx, rtx);
+extern bool th_print_operand_address (FILE *, machine_mode, rtx);
 #endif
 
 #endif /* ! GCC_RISCV_PROTOS_H */
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a33f0fff8ea..1691ecf3a94 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -834,6 +834,9 @@  riscv_regno_mode_ok_for_base_p (int regno,
 enum reg_class
 riscv_index_reg_class ()
 {
+  if (TARGET_XTHEADMEMIDX)
+    return GR_REGS;
+
   return NO_REGS;
 }
 
@@ -844,13 +847,16 @@  riscv_index_reg_class ()
 int
 riscv_regno_ok_for_index_p (int regno)
 {
+  if (TARGET_XTHEADMEMIDX)
+    return riscv_regno_mode_ok_for_base_p (regno, VOIDmode, 1);
+
   return 0;
 }
 
 /* Return true if X is a valid base register for mode MODE.
    STRICT_P is true if REG_OK_STRICT is in effect.  */
 
-static bool
+bool
 riscv_valid_base_register_p (rtx x, machine_mode mode, bool strict_p)
 {
   if (!strict_p && GET_CODE (x) == SUBREG)
@@ -1022,6 +1028,9 @@  static bool
 riscv_classify_address (struct riscv_address_info *info, rtx x,
 			machine_mode mode, bool strict_p)
 {
+  if (th_classify_address (info, x, mode, strict_p))
+    return true;
+
   switch (GET_CODE (x))
     {
     case REG:
@@ -2813,6 +2822,10 @@  riscv_output_move (rtx dest, rtx src)
   machine_mode mode;
   bool dbl_p;
   unsigned width;
+  const char *insn;
+
+  if ((insn = th_output_move (dest, src)))
+    return insn;
 
   dest_code = GET_CODE (dest);
   src_code = GET_CODE (src);
@@ -4554,6 +4567,9 @@  riscv_print_operand_address (FILE *file, machine_mode mode ATTRIBUTE_UNUSED, rtx
 {
   struct riscv_address_info addr;
 
+  if (th_print_operand_address (file, mode, x))
+    return;
+
   if (riscv_classify_address (&addr, x, word_mode, true))
     switch (addr.type)
       {
@@ -4575,7 +4591,11 @@  riscv_print_operand_address (FILE *file, machine_mode mode ATTRIBUTE_UNUSED, rtx
       case ADDRESS_SYMBOLIC:
 	output_addr_const (file, riscv_strip_unspec_address (x));
 	return;
+
+      default:
+	gcc_unreachable ();
       }
+
   gcc_unreachable ();
 }
 
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 21b81c22dea..7b8cc6f6499 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -1103,4 +1103,7 @@  extern void riscv_remove_unneeded_save_restore_calls (void);
 #define DWARF_REG_TO_UNWIND_COLUMN(REGNO) \
   ((REGNO == RISCV_DWARF_VLENB) ? (FIRST_PSEUDO_REGISTER + 1) : REGNO)
 
+#define HAVE_POST_MODIFY_DISP TARGET_XTHEADMEMIDX
+#define HAVE_PRE_MODIFY_DISP  TARGET_XTHEADMEMIDX
+
 #endif /* ! GCC_RISCV_H */
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index f4cc99187ed..1d260ddacfa 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1368,7 +1368,7 @@  (define_insn_and_split "*zero_extendsidi2_internal"
   [(set (match_operand:DI     0 "register_operand"     "=r,r")
 	(zero_extend:DI
 	    (match_operand:SI 1 "nonimmediate_operand" " r,m")))]
-  "TARGET_64BIT && !TARGET_ZBA && !TARGET_XTHEADBB
+  "TARGET_64BIT && !TARGET_ZBA && !TARGET_XTHEADBB && !TARGET_XTHEADMEMIDX
    && !(register_operand (operands[1], SImode)
         && reg_or_subregno (operands[1]) == VL_REGNUM)"
   "@
@@ -1395,7 +1395,7 @@  (define_insn_and_split "*zero_extendhi<GPR:mode>2"
   [(set (match_operand:GPR    0 "register_operand"     "=r,r")
 	(zero_extend:GPR
 	    (match_operand:HI 1 "nonimmediate_operand" " r,m")))]
-  "!TARGET_ZBB && !TARGET_XTHEADBB"
+  "!TARGET_ZBB && !TARGET_XTHEADBB && !TARGET_XTHEADMEMIDX"
   "@
    #
    lhu\t%0,%1"
@@ -1413,11 +1413,17 @@  (define_insn_and_split "*zero_extendhi<GPR:mode>2"
   [(set_attr "move_type" "shift_shift,load")
    (set_attr "mode" "<GPR:MODE>")])
 
-(define_insn "zero_extendqi<SUPERQI:mode>2"
+(define_expand "zero_extendqi<SUPERQI:mode>2"
+  [(set (match_operand:SUPERQI    0 "register_operand")
+	(zero_extend:SUPERQI
+	    (match_operand:QI 1 "nonimmediate_operand")))]
+  "")
+
+(define_insn "*zero_extendqi<SUPERQI:mode>2_internal"
   [(set (match_operand:SUPERQI 0 "register_operand"    "=r,r")
 	(zero_extend:SUPERQI
 	    (match_operand:QI 1 "nonimmediate_operand" " r,m")))]
-  ""
+  "!TARGET_XTHEADMEMIDX"
   "@
    andi\t%0,%1,0xff
    lbu\t%0,%1"
@@ -1431,11 +1437,17 @@  (define_insn "zero_extendqi<SUPERQI:mode>2"
 ;;
 ;;  ....................
 
-(define_insn "extendsidi2"
+(define_expand "extendsidi2"
   [(set (match_operand:DI     0 "register_operand"     "=r,r")
 	(sign_extend:DI
 	    (match_operand:SI 1 "nonimmediate_operand" " r,m")))]
-  "TARGET_64BIT"
+  "TARGET_64BIT")
+
+(define_insn "*extendsidi2_internal"
+  [(set (match_operand:DI     0 "register_operand"     "=r,r")
+	(sign_extend:DI
+	    (match_operand:SI 1 "nonimmediate_operand" " r,m")))]
+  "TARGET_64BIT && !TARGET_XTHEADMEMIDX && !TARGET_XTHEADMEMIDX"
   "@
    sext.w\t%0,%1
    lw\t%0,%1"
@@ -1451,7 +1463,7 @@  (define_insn_and_split "*extend<SHORT:mode><SUPERQI:mode>2"
   [(set (match_operand:SUPERQI   0 "register_operand"     "=r,r")
 	(sign_extend:SUPERQI
 	    (match_operand:SHORT 1 "nonimmediate_operand" " r,m")))]
-  "!TARGET_ZBB && !TARGET_XTHEADBB"
+  "!TARGET_ZBB && !TARGET_XTHEADBB && !TARGET_XTHEADMEMIDX"
   "@
    #
    l<SHORT:size>\t%0,%1"
diff --git a/gcc/config/riscv/thead-peephole.md b/gcc/config/riscv/thead-peephole.md
index 5b829b5b968..2a4c734a220 100644
--- a/gcc/config/riscv/thead-peephole.md
+++ b/gcc/config/riscv/thead-peephole.md
@@ -72,3 +72,217 @@  (define_peephole2
 {
   th_mempair_order_operands (operands, true, SImode);
 })
+
+;; All modes that are supported by XTheadMemIdx
+(define_mode_iterator TH_M_ANY [QI HI SI (DI "TARGET_64BIT")])
+
+;; All non-extension modes that are supported by XTheadMemIdx
+(define_mode_iterator TH_M_NOEXT [(SI "!TARGET_64BIT") (DI "TARGET_64BIT")])
+
+;; XTheadMemIdx overview:
+;; All peephole passes attempt to improve the operand utilization of
+;; XTheadMemIdx instructions, where one sign or zero extended
+;; register-index-operand can be shifted left by a 2-bit immediate.
+;;
+;; The basic idea is the following optimization:
+;; (set (reg 0) (op (reg 1) (imm 2)))
+;; (set (reg 3) (mem (plus (reg 0) (reg 4)))
+;; ==>
+;; (set (reg 3) (mem (plus (reg 4) (op2 (reg 1) (imm 2))))
+;; This optimization is only valid if (reg 0) has no further uses.
+;;
+;; The three-instruction case is as follows:
+;; (set (reg 0) (op1 (reg 1) (imm 2)))
+;; (set (reg 3) (op2 (reg 0) (imm 4)))
+;; (set (reg 5) (mem (plus (reg 3) (reg 6)))
+;; ==>
+;; (set (reg 5) (mem (plus (reg 6) (op2 (reg 1) (imm 2/4)))))
+;; This optimization is only valid if (reg 0) and (reg 3) have no further uses.
+;;
+;; The optimization cases are:
+;; I) fold 2-bit ashift of register offset into mem-plus RTX
+;; US) fold 32-bit zero-extended (shift) offset into mem-plus
+;; UZ) fold 32-bit zero-extended (zext) offset into mem-plus
+;;
+;; The first optimization case is targeting the th.lr<MODE> instructions.
+;; The other optimization cases are targeting the th.lur<MODE> instructions
+;; and have to consider two forms of zero-extensions:
+;; - ashift-32 + lshiftrt-{29..32} if there are no zero-extension instructions.
+;;   Left-shift amounts of 29..31 indicate a left-shifted zero-extended value.
+;; - zero-extend32 if there are zero-extension instructions (XTheadBb or Zbb).
+;;
+;; We always have three peephole passes per optimization case:
+;; a) no-extended (X) word-load
+;; b) any-extend (SUBX) word-load
+;; c) store
+
+;; XTheadMemIdx I/a
+(define_peephole2
+  [(set (match_operand:X 0 "register_operand" "")
+        (ashift:X (match_operand:X 1 "register_operand" "")
+                  (match_operand:QI 2 "immediate_operand" "")))
+   (set (match_operand:TH_M_NOEXT 3 "register_operand" "")
+        (mem:TH_M_NOEXT (plus:X
+          (match_dup 0)
+          (match_operand:X 4 "register_operand" ""))))]
+  "TARGET_XTHEADMEMIDX
+   && peep2_reg_dead_p (2, operands[0])
+   && IN_RANGE (INTVAL (operands[2]), 0, 3)"
+  [(set (match_dup 3)
+        (mem:TH_M_NOEXT (plus:X
+          (match_dup 4)
+          (ashift:X (match_dup 1) (match_dup 2)))))]
+)
+
+;; XTheadMemIdx I/b
+(define_peephole2
+  [(set (match_operand:X 0 "register_operand" "")
+        (ashift:X (match_operand:X 1 "register_operand" "")
+                  (match_operand:QI 2 "immediate_operand" "")))
+   (set (match_operand:X 3 "register_operand" "")
+        (any_extend:X (mem:SUBX (plus:X
+          (match_dup 0)
+          (match_operand:X 4 "register_operand" "")))))]
+  "TARGET_XTHEADMEMIDX
+   && peep2_reg_dead_p (2, operands[0])
+   && IN_RANGE (INTVAL (operands[2]), 0, 3)"
+  [(set (match_dup 3)
+        (any_extend:X (mem:SUBX (plus:X
+          (match_dup 4)
+          (ashift:X (match_dup 1) (match_dup 2))))))]
+)
+
+;; XTheadMemIdx I/c
+(define_peephole2
+  [(set (match_operand:X 0 "register_operand" "")
+     (ashift:X (match_operand:X 1 "register_operand" "")
+               (match_operand:QI 2 "immediate_operand" "")))
+   (set (mem:TH_M_ANY (plus:X
+          (match_dup 0)
+          (match_operand:X 4 "register_operand" "")))
+        (match_operand:TH_M_ANY 3 "register_operand" ""))]
+  "TARGET_XTHEADMEMIDX
+   && peep2_reg_dead_p (2, operands[0])
+   && IN_RANGE (INTVAL (operands[2]), 0, 3)"
+  [(set (mem:TH_M_ANY (plus:X
+          (match_dup 4)
+          (ashift:X (match_dup 1) (match_dup 2))))
+        (match_dup 3))]
+)
+
+;; XTheadMemIdx US/a
+(define_peephole2
+  [(set (match_operand:DI 0 "register_operand" "")
+        (ashift:DI (match_operand:DI 1 "register_operand" "") (const_int 32)))
+   (set (match_operand:DI 3 "register_operand" "")
+        (lshiftrt:DI (match_dup 0) (match_operand:QI 4 "immediate_operand" "")))
+   (set (match_operand:TH_M_NOEXT 5 "register_operand" "")
+        (mem:TH_M_NOEXT (plus:DI
+          (match_dup 3)
+          (match_operand:DI 6 "register_operand" ""))))]
+  "TARGET_64BIT && TARGET_XTHEADMEMIDX
+   && peep2_reg_dead_p (3, operands[0])
+   && peep2_reg_dead_p (3, operands[3])
+   && IN_RANGE (INTVAL (operands[4]), 29, 32)"
+  [(set (match_dup 5)
+        (mem:TH_M_NOEXT (plus:DI
+          (match_dup 6)
+          (ashift:DI (zero_extend:DI (match_dup 1)) (match_dup 4)))))]
+  { operands[1] = gen_lowpart (SImode, operands[1]);
+    operands[4] = GEN_INT (32 - (INTVAL (operands [4])));
+  }
+)
+
+;; XTheadMemIdx US/b
+(define_peephole2
+  [(set (match_operand:DI 0 "register_operand" "")
+        (ashift:DI (match_operand:DI 1 "register_operand" "") (const_int 32)))
+   (set (match_operand:DI 3 "register_operand" "")
+        (lshiftrt:DI (match_dup 0) (match_operand:QI 4 "immediate_operand" "")))
+   (set (match_operand:X 5 "register_operand" "")
+        (any_extend:X (mem:SUBX (plus:DI
+          (match_dup 3)
+          (match_operand:DI 6 "register_operand" "")))))]
+  "TARGET_64BIT && TARGET_XTHEADMEMIDX
+   && peep2_reg_dead_p (3, operands[0])
+   && peep2_reg_dead_p (3, operands[3])
+   && IN_RANGE (INTVAL (operands[4]), 29, 32)"
+  [(set (match_dup 5)
+        (any_extend:X (mem:SUBX (plus:DI
+          (match_dup 6)
+          (ashift:DI (zero_extend:DI (match_dup 1)) (match_dup 4))))))]
+  { operands[1] = gen_lowpart (SImode, operands[1]);
+    operands[4] = GEN_INT (32 - (INTVAL (operands [4])));
+  }
+)
+
+;; XTheadMemIdx US/c
+(define_peephole2
+  [(set (match_operand:DI 0 "register_operand" "")
+        (ashift:DI (match_operand:DI 1 "register_operand" "") (const_int 32)))
+   (set (match_operand:DI 3 "register_operand" "")
+        (lshiftrt:DI (match_dup 0) (match_operand:QI 4 "immediate_operand" "")))
+   (set (mem:TH_M_ANY (plus:DI
+          (match_dup 3)
+          (match_operand:DI 6 "register_operand" "")))
+        (match_operand:TH_M_ANY 5 "register_operand" ""))]
+  "TARGET_64BIT && TARGET_XTHEADMEMIDX
+   && peep2_reg_dead_p (3, operands[0])
+   && peep2_reg_dead_p (3, operands[3])
+   && IN_RANGE (INTVAL (operands[4]), 29, 32)"
+  [(set (mem:TH_M_ANY (plus:DI
+          (match_dup 6)
+          (ashift:DI (zero_extend:DI (match_dup 1)) (match_dup 4))))
+        (match_dup 5))]
+  { operands[1] = gen_lowpart (SImode, operands[1]);
+    operands[4] = GEN_INT (32 - (INTVAL (operands [4])));
+  }
+)
+
+;; XTheadMemIdx UZ/a
+(define_peephole2
+  [(set (match_operand:DI 0 "register_operand" "")
+        (zero_extend:DI (match_operand:SI 1 "register_operand" "")))
+   (set (match_operand:TH_M_NOEXT 3 "register_operand" "")
+        (mem:TH_M_NOEXT (plus:DI
+          (match_dup 0)
+          (match_operand:DI 4 "register_operand" ""))))]
+  "TARGET_64BIT && TARGET_XTHEADMEMIDX
+   && peep2_reg_dead_p (2, operands[0])"
+  [(set (match_dup 3)
+        (mem:TH_M_NOEXT (plus:DI
+          (match_dup 4)
+          (zero_extend:DI (match_dup 1)))))]
+)
+
+;; XTheadMemIdx UZ/b
+(define_peephole2
+  [(set (match_operand:DI 0 "register_operand" "")
+        (zero_extend:DI (match_operand:SI 1 "register_operand" "")))
+   (set (match_operand:X 3 "register_operand" "")
+        (any_extend:X (mem:SUBX (plus:DI
+          (match_dup 0)
+          (match_operand:DI 4 "register_operand" "")))))]
+  "TARGET_64BIT && TARGET_XTHEADMEMIDX
+   && peep2_reg_dead_p (2, operands[0])"
+  [(set (match_dup 3)
+        (any_extend:X (mem:SUBX (plus:DI
+          (match_dup 4)
+          (zero_extend:DI (match_dup 1))))))]
+)
+
+;; XTheadMemIdx UZ/c
+(define_peephole2
+  [(set (match_operand:DI 0 "register_operand" "")
+        (zero_extend:DI (match_operand:SI 1 "register_operand" "")))
+   (set (mem:TH_M_ANY (plus:DI
+          (match_dup 0)
+          (match_operand:DI 4 "register_operand" "")))
+        (match_operand:TH_M_ANY 3 "register_operand" ""))]
+  "TARGET_64BIT && TARGET_XTHEADMEMIDX
+   && peep2_reg_dead_p (2, operands[0])"
+  [(set (mem:TH_M_ANY (plus:DI
+          (match_dup 4)
+          (zero_extend:DI (match_dup 1))))
+        (match_dup 3))]
+)
diff --git a/gcc/config/riscv/thead.cc b/gcc/config/riscv/thead.cc
index 507c912bc39..bfe035250af 100644
--- a/gcc/config/riscv/thead.cc
+++ b/gcc/config/riscv/thead.cc
@@ -25,11 +25,16 @@ 
 #include "coretypes.h"
 #include "target.h"
 #include "backend.h"
+#include "tree.h"
 #include "rtl.h"
+#include "explow.h"
 #include "memmodel.h"
 #include "emit-rtl.h"
+#include "optabs.h"
 #include "poly-int.h"
 #include "output.h"
+#include "regs.h"
+#include "riscv-protos.h"
 
 /* If MEM is in the form of "base+offset", extract the two parts
    of address and set to BASE and OFFSET, otherwise return false
@@ -429,3 +434,426 @@  th_mempair_save_restore_regs (rtx operands[4], bool load_p,
   else
     th_mempair_save_regs (operands);
 }
+
+/* Return true if X can be represented as signed immediate of NBITS bits.
+   The immediate is assumed to be shifted by LSHAMT bits left.  */
+
+static bool
+valid_signed_immediate (rtx x, unsigned nbits, unsigned lshamt)
+{
+  if (GET_CODE (x) != CONST_INT)
+    return false;
+
+  HOST_WIDE_INT v = INTVAL (x);
+
+  HOST_WIDE_INT vunshifted = v >> lshamt;
+
+  /* Make sure we did not shift out any bits.  */
+  if (vunshifted << lshamt != v)
+    return false;
+
+  unsigned HOST_WIDE_INT imm_reach = 1LL << nbits;
+  return ((unsigned HOST_WIDE_INT) vunshifted + imm_reach/2 < imm_reach);
+}
+
+/* Return true if X is a valid address for T-Head's memory addressing modes
+   with pre/post modification for machine mode MODE.
+   If it is, fill in INFO appropriately (if non-NULL).
+   If STRICT_P is true then REG_OK_STRICT is in effect.  */
+
+static bool
+th_memidx_classify_address_modify (struct riscv_address_info *info, rtx x,
+				   machine_mode mode, bool strict_p)
+{
+  if (!TARGET_XTHEADMEMIDX)
+    return false;
+
+  if (!TARGET_64BIT && mode == DImode)
+    return false;
+
+  if (!(INTEGRAL_MODE_P (mode) && GET_MODE_SIZE (mode).to_constant () <= 8))
+    return false;
+
+  if (GET_CODE (x) != POST_MODIFY
+      && GET_CODE (x) != PRE_MODIFY)
+    return false;
+
+  rtx reg = XEXP (x, 0);
+  rtx exp = XEXP (x, 1);
+  rtx expreg = XEXP (exp, 0);
+  rtx expoff = XEXP (exp, 1);
+
+  if (GET_CODE (exp) != PLUS
+      || !rtx_equal_p (expreg, reg)
+      || !CONST_INT_P (expoff)
+      || !riscv_valid_base_register_p (reg, mode, strict_p))
+    return false;
+
+  /* The offset is calculated as (sign_extend(imm5) << imm2)  */
+  const int shamt_bits = 2;
+  for (int shamt = 0; shamt < (1 << shamt_bits); shamt++)
+    {
+      const int nbits = 5;
+      if (valid_signed_immediate (expoff, nbits, shamt))
+	{
+	  if (info)
+	    {
+	      info->type = ADDRESS_REG_WB;
+	      info->reg = reg;
+	      info->offset = expoff;
+	      info->shift = shamt;
+	    }
+	  return true;
+	}
+    }
+
+  return false;
+}
+
+/* Return TRUE if X is a MEM with a legitimate modify address.  */
+
+bool
+th_memidx_legitimate_modify_p (rtx x)
+{
+  if (!MEM_P (x))
+    return false;
+
+  /* Get the mode from the MEM and unpack it.  */
+  machine_mode mode = GET_MODE (x);
+  x = XEXP (x, 0);
+
+  return th_memidx_classify_address_modify (NULL, x, mode, reload_completed);
+}
+
+/* Return TRUE if X is a MEM with a legitimate modify address
+   and the address is POST_MODIFY (if POST is true) or a PRE_MODIFY
+   (otherwise).  */
+
+bool
+th_memidx_legitimate_modify_p (rtx x, bool post)
+{
+  if (!th_memidx_legitimate_modify_p (x))
+    return false;
+
+  /* Unpack the MEM and check the code.  */
+  x = XEXP (x, 0);
+  if (post)
+    return GET_CODE (x) == POST_MODIFY;
+  else
+    return GET_CODE (x) == PRE_MODIFY;
+}
+
+/* Provide a buffer for a th.lXia/th.lXib/th.sXia/th.sXib instruction
+   for the given MODE. If LOAD is true, a load instruction will be
+   provided (otherwise, a store instruction). If X is not suitable
+   return NULL.  */
+
+static const char *
+th_memidx_output_modify (rtx x, machine_mode mode, bool load)
+{
+  static char buf[128] = {0};
+
+  /* Validate x.  */
+  if (!th_memidx_classify_address_modify (NULL, x, mode, reload_completed))
+    return NULL;
+
+  int index = exact_log2 (GET_MODE_SIZE (mode).to_constant ());
+  bool post = GET_CODE (x) == POST_MODIFY;
+
+  const char *const insn[][4] = {
+    {
+      "th.sbi%s\t%%z1,%%0",
+      "th.shi%s\t%%z1,%%0",
+      "th.swi%s\t%%z1,%%0",
+      "th.sdi%s\t%%z1,%%0"
+    },
+    {
+      "th.lbui%s\t%%0,%%1",
+      "th.lhui%s\t%%0,%%1",
+      "th.lwi%s\t%%0,%%1",
+      "th.ldi%s\t%%0,%%1"
+    }
+  };
+
+  snprintf (buf, sizeof (buf), insn[load][index], post ? "a" : "b");
+  return buf;
+}
+
+static bool
+is_memidx_mode (machine_mode mode)
+{
+  if (mode == QImode || mode == HImode || mode == SImode)
+    return true;
+
+  if (mode == DImode && TARGET_64BIT)
+    return true;
+
+  return false;
+}
+
+/* Return true if X is a valid address for T-Head's memory addressing modes
+   with scaled register offsets for machine mode MODE.
+   If it is, fill in INFO appropriately (if non-NULL).
+   If STRICT_P is true then REG_OK_STRICT is in effect.  */
+
+static bool
+th_memidx_classify_address_index (struct riscv_address_info *info, rtx x,
+				  machine_mode mode, bool strict_p)
+{
+  /* Ensure that the mode is supported.  */
+  if (!(TARGET_XTHEADMEMIDX && is_memidx_mode (mode)))
+    return false;
+
+  if (GET_CODE (x) != PLUS)
+    return false;
+
+  rtx reg = XEXP (x, 0);
+  enum riscv_address_type type;
+  rtx offset = XEXP (x, 1);
+  int shift;
+
+  if (!riscv_valid_base_register_p (reg, mode, strict_p))
+    return false;
+
+  /* (reg:X) */
+  if (REG_P (offset)
+      && GET_MODE (offset) == Xmode)
+    {
+      type = ADDRESS_REG_REG;
+      shift = 0;
+      offset = offset;
+    }
+  /* (zero_extend:DI (reg:SI)) */
+  else if (GET_CODE (offset) == ZERO_EXTEND
+	   && GET_MODE (offset) == DImode
+	   && GET_MODE (XEXP (offset, 0)) == SImode)
+    {
+      type = ADDRESS_REG_UREG;
+      shift = 0;
+      offset = XEXP (offset, 0);
+    }
+  /* (ashift:X (reg:X) (const_int shift)) */
+  else if (GET_CODE (offset) == ASHIFT
+	   && GET_MODE (offset) == Xmode
+	   && REG_P (XEXP (offset, 0))
+	   && GET_MODE (XEXP (offset, 0)) == Xmode
+	   && CONST_INT_P (XEXP (offset, 1))
+	   && IN_RANGE (INTVAL (XEXP (offset, 1)), 0, 3))
+    {
+      type = ADDRESS_REG_REG;
+      shift = INTVAL (XEXP (offset, 1));
+      offset = XEXP (offset, 0);
+    }
+  /* (ashift:DI (zero_extend:DI (reg:SI)) (const_int shift)) */
+  else if (GET_CODE (offset) == ASHIFT
+	   && GET_MODE (offset) == DImode
+	   && GET_CODE (XEXP (offset, 0)) == ZERO_EXTEND
+	   && GET_MODE (XEXP (offset, 0)) == DImode
+	   && REG_P (XEXP (XEXP (offset, 0), 0))
+	   && GET_MODE (XEXP (XEXP (offset, 0), 0)) == SImode
+	   && CONST_INT_P (XEXP (offset, 1))
+	   && IN_RANGE (INTVAL (XEXP (offset, 1)), 0, 3))
+    {
+      type = ADDRESS_REG_UREG;
+      shift = INTVAL (XEXP (offset, 1));
+      offset = XEXP (XEXP (offset, 0), 0);
+    }
+  else
+    return false;
+
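+  /* Before reload, the index may still be wrapped in a SUBREG of a
+     register whose mode fits into a general register; look through it
+     so the base-register check below sees the inner register.  */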
+  if (!strict_p
+      && GET_CODE (offset) == SUBREG
+      && contains_reg_of_mode[GENERAL_REGS][GET_MODE (SUBREG_REG (offset))])
+    offset = SUBREG_REG (offset);
+
+  if (riscv_valid_base_register_p (offset, mode, strict_p))
+    {
+      if (info)
+	{
+	  info->reg = reg;
+	  info->type = type;
+	  info->offset = offset;
+	  info->shift = shift;
+	}
+      return true;
+    }
+  return false;
+}
+
+/* Return TRUE if X is a MEM with a legitimate indexed address.  */
+
+bool
+th_memidx_legitimate_index_p (rtx x)
+{
+  if (!MEM_P (x))
+    return false;
+
+  /* Get the mode from the MEM and unpack it.  */
+  machine_mode mode = GET_MODE (x);
+  x = XEXP (x, 0);
+
+  return th_memidx_classify_address_index (NULL, x, mode, reload_completed);
+}
+
+/* Return TRUE if X is a MEM with a legitimate indexed address
+   and the offset register is zero-extended (if UINDEX is true)
+   or sign-extended (otherwise).  */
+
+bool
+th_memidx_legitimate_index_p (rtx x, bool uindex)
+{
+  if (!MEM_P (x))
+    return false;
+
+  /* Get the mode from the MEM and unpack it.  */
+  machine_mode mode = GET_MODE (x);
+  x = XEXP (x, 0);
+
+  struct riscv_address_info info;
+  if (!th_memidx_classify_address_index (&info, x, mode, reload_completed))
+    return false;
+
+  if (uindex)
+    return info.type == ADDRESS_REG_UREG;
+  else
+    return info.type == ADDRESS_REG_REG;
+}
+
+/* Return the assembler output string for a th.lrX/th.lurX/th.srX/th.surX
+   instruction for the given MODE.  If LOAD is true, a load instruction
+   is emitted (otherwise, a store instruction).  If X is not a suitable
+   address, return NULL.  */
+
+static const char *
+th_memidx_output_index (rtx x, machine_mode mode, bool load)
+{
+  struct riscv_address_info info;
+  static char buf[128] = {0};
+
+  /* Validate x.  */
+  if (!th_memidx_classify_address_index (&info, x, mode, reload_completed))
+    return NULL;
+
+  int index = exact_log2 (GET_MODE_SIZE (mode).to_constant ());
+  bool uindex = info.type == ADDRESS_REG_UREG;
+
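+  /* insn[0][.] holds the store templates, insn[1][.] the load templates;
+     the second index is the log2 of the access size in bytes.  The %s
+     placeholder becomes "u" if the index register is zero-extended
+     (ADDRESS_REG_UREG) and stays empty otherwise.  */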
+  const char *const insn[][4] = {
+    {
+      "th.s%srb\t%%z1,%%0",
+      "th.s%srh\t%%z1,%%0",
+      "th.s%srw\t%%z1,%%0",
+      "th.s%srd\t%%z1,%%0"
+    },
+    {
+      "th.l%srbu\t%%0,%%1",
+      "th.l%srhu\t%%0,%%1",
+      "th.l%srw\t%%0,%%1",
+      "th.l%srd\t%%0,%%1"
+    }
+  };
+
+  snprintf (buf, sizeof (buf), insn[load][index], uindex ? "u" : "");
+
+  return buf;
+}
+
+/* Return true if X is a valid address for T-Head's memory addressing modes
+   for machine mode MODE.  If it is, fill in INFO appropriately (if non-NULL).
+   If STRICT_P is true then REG_OK_STRICT is in effect.  */
+
+bool
+th_classify_address (struct riscv_address_info *info, rtx x,
+		     machine_mode mode, bool strict_p)
+{
+  switch (GET_CODE (x))
+    {
+    case PLUS:
+      if (th_memidx_classify_address_index (info, x, mode, strict_p))
+	return true;
+      break;
+
+    case POST_MODIFY:
+    case PRE_MODIFY:
+      if (th_memidx_classify_address_modify (info, x, mode, strict_p))
+	return true;
+      break;
+
+    default:
+      return false;
+    }
+
+  return false;
+}
+
+/* Return the XTheadMemIdx assembler output string for a move from SRC
+   to DEST, where one of the two operands is a MEM with an XTheadMemIdx
+   address.  Return NULL if the operands do not match any XTheadMemIdx
+   addressing mode.  */
+
+const char *
+th_output_move (rtx dest, rtx src)
+{
+  enum rtx_code dest_code, src_code;
+  machine_mode mode;
+  const char *insn = NULL;
+
+  dest_code = GET_CODE (dest);
+  src_code = GET_CODE (src);
+  mode = GET_MODE (dest);
+
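+  /* A load has a GP register destination and a MEM source; a store has
+     a MEM destination and a GP register (or zero) source.  */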
+  if (dest_code == REG && GP_REG_P (REGNO (dest)) && src_code == MEM)
+    {
+      rtx x = XEXP (src, 0);
+      mode = GET_MODE (src);
+
+      if ((insn = th_memidx_output_index (x, mode, true)))
+	return insn;
+
+      if ((insn = th_memidx_output_modify (x, mode, true)))
+	return insn;
+    }
+  else if (((src_code == REG && GP_REG_P (REGNO (src)))
+	    || (src == CONST0_RTX (mode)))
+	   && dest_code == MEM)
+    {
+      rtx x = XEXP (dest, 0);
+      mode = GET_MODE (dest);
+
+      if ((insn = th_memidx_output_index (x, mode, false)))
+	return insn;
+
+      if ((insn = th_memidx_output_modify (x, mode, false)))
+	return insn;
+    }
+
+  return NULL;
+}
+
+/* Implement TARGET_PRINT_OPERAND_ADDRESS for XTheadMemIdx.  */
+
+bool
+th_print_operand_address (FILE *file, machine_mode mode, rtx x)
+{
+  struct riscv_address_info addr;
+
+  if (!th_classify_address (&addr, x, mode, reload_completed))
+    return false;
+
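+  /* The formats below match the assembler syntax of the XTheadMemIdx
+     instructions: "rs1,rs2,imm2" for register-offset addresses and
+     "(rs1),imm5,imm2" for pre/post-modify addresses.  */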
+  switch (addr.type)
+    {
+    case ADDRESS_REG_REG:
+    case ADDRESS_REG_UREG:
+      fprintf (file, "%s,%s,%u", reg_names[REGNO (addr.reg)],
+	       reg_names[REGNO (addr.offset)], addr.shift);
+      return true;
+
+    case ADDRESS_REG_WB:
+      fprintf (file, "(%s),%ld,%u", reg_names[REGNO (addr.reg)],
+	       INTVAL (addr.offset) >> addr.shift, addr.shift);
+	return true;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  gcc_unreachable ();
+}
diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
index 1ac4dd9b462..5536500e4ff 100644
--- a/gcc/config/riscv/thead.md
+++ b/gcc/config/riscv/thead.md
@@ -62,7 +62,7 @@  (define_insn "*extend<SHORT:mode><SUPERQI:mode>2_th_ext"
   [(set (match_operand:SUPERQI 0 "register_operand" "=r,r")
 	(sign_extend:SUPERQI
 	    (match_operand:SHORT 1 "nonimmediate_operand" "r,m")))]
-  "TARGET_XTHEADBB"
+  "TARGET_XTHEADBB && !TARGET_XTHEADMEMIDX"
   "@
    th.ext\t%0,%1,15,0
    l<SHORT:size>\t%0,%1"
@@ -85,7 +85,7 @@  (define_insn "*th_extu<mode>4"
 (define_insn "*zero_extendsidi2_th_extu"
   [(set (match_operand:DI 0 "register_operand" "=r,r")
 	(zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,m")))]
-  "TARGET_64BIT && TARGET_XTHEADBB"
+  "TARGET_64BIT && TARGET_XTHEADBB && !TARGET_XTHEADMEMIDX"
   "@
    th.extu\t%0,%1,31,0
    lwu\t%0,%1"
@@ -95,7 +95,7 @@  (define_insn "*zero_extendsidi2_th_extu"
 (define_insn "*zero_extendhi<GPR:mode>2_th_extu"
   [(set (match_operand:GPR 0 "register_operand" "=r,r")
 	(zero_extend:GPR (match_operand:HI 1 "nonimmediate_operand" "r,m")))]
-  "TARGET_XTHEADBB"
+  "TARGET_XTHEADBB && !TARGET_XTHEADMEMIDX"
   "@
    th.extu\t%0,%1,15,0
    lhu\t%0,%1"
@@ -375,4 +375,184 @@  (define_insn "*th_mempair_load_zero_extendsidi2"
    (set_attr "mode" "DI")
    (set_attr "length" "8")])
 
+;; XTheadMemIdx
+
+(define_insn "*th_memidx_zero_extendqi<SUPERQI:mode>2"
+  [(set (match_operand:SUPERQI 0 "register_operand" "=r,r,r,r,r,r")
+	(zero_extend:SUPERQI
+	    (match_operand:QI 1 "nonimmediate_operand"
+         " r,th_m_mia,th_m_mib,th_m_mir,th_m_miu,m")))]
+  "TARGET_XTHEADMEMIDX"
+  "@
+   andi\t%0,%1,0xff
+   th.lbuia\t%0,%1
+   th.lbuib\t%0,%1
+   th.lrbu\t%0,%1
+   th.lurbu\t%0,%1
+   lbu\t%0,%1"
+  [(set_attr "move_type" "andi,load,load,load,load,load")
+   (set_attr "mode" "<SUPERQI:MODE>")])
+
+(define_insn "*th_memidx_extendsidi2"
+  [(set (match_operand:DI 0 "register_operand" "=r,r,r,r,r,r")
+	(sign_extend:DI
+	    (match_operand:SI 1 "nonimmediate_operand"
+         " r,th_m_mia,th_m_mib,th_m_mir,th_m_miu,m")))]
+  "TARGET_64BIT && TARGET_XTHEADMEMIDX"
+  "@
+   sext.w\t%0,%1
+   th.lwia\t%0,%1
+   th.lwib\t%0,%1
+   th.lrw\t%0,%1
+   th.lurw\t%0,%1
+   lw\t%0,%1"
+  [(set_attr "move_type" "move,load,load,load,load,load")
+   (set_attr "mode" "DI")])
+
+;; XTheadMemIdx (without XTheadBb)
+
+(define_insn_and_split "*th_memidx_zero_extendsidi2"
+  [(set (match_operand:DI 0 "register_operand" "=r,r,r,r,r,r")
+	(zero_extend:DI
+	    (match_operand:SI 1 "nonimmediate_operand"
+         " r,th_m_mia,th_m_mib,th_m_mir,th_m_miu,m")))]
+  "TARGET_64BIT && TARGET_XTHEADMEMIDX && !TARGET_XTHEADBB"
+  "@
+   #
+   th.lwuia\t%0,%1
+   th.lwuib\t%0,%1
+   th.lrwu\t%0,%1
+   th.lurwu\t%0,%1
+   lwu\t%0,%1"
+  "&& reload_completed
+   && REG_P (operands[1])
+   && !paradoxical_subreg_p (operands[0])"
+  [(set (match_dup 0)
+	(ashift:DI (match_dup 1) (const_int 32)))
+   (set (match_dup 0)
+	(lshiftrt:DI (match_dup 0) (const_int 32)))]
+  { operands[1] = gen_lowpart (DImode, operands[1]); }
+  [(set_attr "move_type" "shift_shift,load,load,load,load,load")
+   (set_attr "mode" "DI")])
+
+(define_insn_and_split "*th_memidx_zero_extendhi<GPR:mode>2"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r,r,r,r,r")
+	(zero_extend:GPR
+	    (match_operand:HI 1 "nonimmediate_operand"
+         " r,th_m_mia,th_m_mib,th_m_mir,th_m_miu,m")))]
+  "TARGET_XTHEADMEMIDX && !TARGET_XTHEADBB"
+  "@
+   #
+   th.lhuia\t%0,%1
+   th.lhuib\t%0,%1
+   th.lrhu\t%0,%1
+   th.lurhu\t%0,%1
+   lhu\t%0,%1"
+  "&& reload_completed
+   && REG_P (operands[1])
+   && !paradoxical_subreg_p (operands[0])"
+  [(set (match_dup 0)
+	(ashift:GPR (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+	(lshiftrt:GPR (match_dup 0) (match_dup 2)))]
+  {
+    operands[1] = gen_lowpart (<GPR:MODE>mode, operands[1]);
+    operands[2] = GEN_INT (GET_MODE_BITSIZE (<GPR:MODE>mode) - 16);
+  }
+  [(set_attr "move_type" "shift_shift,load,load,load,load,load")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn_and_split "*th_memidx_extend<SHORT:mode><SUPERQI:mode>2"
+  [(set (match_operand:SUPERQI 0 "register_operand" "=r,r,r,r,r,r")
+	(sign_extend:SUPERQI
+	    (match_operand:SHORT 1 "nonimmediate_operand"
+         " r,th_m_mia,th_m_mib,th_m_mir,th_m_miu,m")))]
+  "TARGET_XTHEADMEMIDX && !TARGET_XTHEADBB"
+  "@
+   #
+   th.l<SHORT:size>ia\t%0,%1
+   th.l<SHORT:size>ib\t%0,%1
+   th.lr<SHORT:size>\t%0,%1
+   th.lur<SHORT:size>\t%0,%1
+   l<SHORT:size>\t%0,%1"
+  "&& reload_completed
+   && REG_P (operands[1])
+   && !paradoxical_subreg_p (operands[0])"
+  [(set (match_dup 0) (ashift:SI (match_dup 1) (match_dup 2)))
+   (set (match_dup 0) (ashiftrt:SI (match_dup 0) (match_dup 2)))]
+{
+  operands[0] = gen_lowpart (SImode, operands[0]);
+  operands[1] = gen_lowpart (SImode, operands[1]);
+  operands[2] = GEN_INT (GET_MODE_BITSIZE (SImode)
+			 - GET_MODE_BITSIZE (<SHORT:MODE>mode));
+}
+  [(set_attr "move_type" "shift_shift,load,load,load,load,load")
+   (set_attr "mode" "SI")])
+
+;; XTheadMemIdx (with XTheadBb)
+
+(define_insn "*th_memidx_bb_zero_extendsidi2"
+  [(set (match_operand:DI 0 "register_operand" "=r,r,r,r,r,r")
+	(zero_extend:DI
+	    (match_operand:SI 1 "nonimmediate_operand"
+         " r,th_m_mia,th_m_mib,th_m_mir,th_m_miu,m")))]
+  "TARGET_64BIT && TARGET_XTHEADMEMIDX && TARGET_XTHEADBB"
+  "@
+   th.extu\t%0,%1,31,0
+   th.lwuia\t%0,%1
+   th.lwuib\t%0,%1
+   th.lrwu\t%0,%1
+   th.lurwu\t%0,%1
+   lwu\t%0,%1"
+  [(set_attr "move_type" "shift_shift,load,load,load,load,load")
+   (set_attr "mode" "DI")])
+
+(define_insn "*th_memidx_bb_zero_extendhi<GPR:mode>2"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r,r,r,r,r")
+	(zero_extend:GPR
+	    (match_operand:HI 1 "nonimmediate_operand"
+         " r,th_m_mia,th_m_mib,th_m_mir,th_m_miu,m")))]
+  "TARGET_XTHEADMEMIDX && TARGET_XTHEADBB"
+  "@
+   th.extu\t%0,%1,15,0
+   th.lhuia\t%0,%1
+   th.lhuib\t%0,%1
+   th.lrhu\t%0,%1
+   th.lurhu\t%0,%1
+   lhu\t%0,%1"
+  [(set_attr "move_type" "shift_shift,load,load,load,load,load")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "*th_memidx_bb_extendhi<GPR:mode>2"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r,r,r,r,r")
+	(sign_extend:GPR
+	    (match_operand:HI 1 "nonimmediate_operand"
+         " r,th_m_mia,th_m_mib,th_m_mir,th_m_miu,m")))]
+  "TARGET_XTHEADMEMIDX && TARGET_XTHEADBB"
+  "@
+   th.ext\t%0,%1,15,0
+   th.lhia\t%0,%1
+   th.lhib\t%0,%1
+   th.lrh\t%0,%1
+   th.lurh\t%0,%1
+   lh\t%0,%1"
+  [(set_attr "move_type" "shift_shift,load,load,load,load,load")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "*th_memidx_bb_extendqi<SUPERQI:mode>2"
+  [(set (match_operand:SUPERQI 0 "register_operand" "=r,r,r,r,r,r")
+	(sign_extend:SUPERQI
+	    (match_operand:QI 1 "nonimmediate_operand"
+         " r,th_m_mia,th_m_mib,th_m_mir,th_m_miu,m")))]
+  "TARGET_XTHEADMEMIDX && TARGET_XTHEADBB"
+  "@
+   th.ext\t%0,%1,7,0
+   th.lbia\t%0,%1
+   th.lbib\t%0,%1
+   th.lrb\t%0,%1
+   th.lurb\t%0,%1
+   lb\t%0,%1"
+  [(set_attr "move_type" "shift_shift,load,load,load,load,load")
+   (set_attr "mode" "<SUPERQI:MODE>")])
+
 (include "thead-peephole.md")
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h
new file mode 100644
index 00000000000..a97f08c5cc1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h
@@ -0,0 +1,152 @@ 
+#ifndef XTHEADMEMIDX_HELPERS_H
+#define XTHEADMEMIDX_HELPERS_H
+
+#include <stdint.h>
+
+#define intX_t long
+#define uintX_t unsigned long
+
+#define PRE_DEC_LOAD(T, N)						\
+  void									\
+  T ## _pre_dec_load_ ## N (T *p)					\
+  {									\
+    extern void f ## T ## N (T*, uintX_t);				\
+    p = p - N;								\
+    T x = *p;								\
+    f ## T ## N (p, x);							\
+  }
+
+#define PRE_INC_LOAD(T, N)						\
+  void									\
+  T ## _pre_inc_load_ ## N (T *p)					\
+  {									\
+    extern void f ## T ## N (T*, uintX_t);				\
+    p = p + N;								\
+    T x = *p;								\
+    f ## T ## N (p, x);							\
+  }
+
+#define POST_DEC_LOAD(T, N)						\
+  void									\
+  T ## _post_dec_load_ ## N (T *p)					\
+  {									\
+    extern void f ## T ## N (T*, uintX_t);				\
+    T x = *p;								\
+    p = p - N;								\
+    f ## T ## N (p, x);							\
+  }
+
+#define POST_INC_LOAD(T,N)						\
+  void									\
+  T ## _post_inc_load_ ## N (T *p)					\
+  {									\
+    extern void f ## T ## N (T*,uintX_t);				\
+    T x = *p;								\
+    p = p + N;								\
+    f ## T ## N (p, x);							\
+  }
+
+#define PRE_DEC_STORE(T, N)						\
+  T *									\
+  T ## _pre_dec_store_ ## N (T *p, T v)					\
+  {									\
+    p = p - N;								\
+    *p = v;								\
+    return p;								\
+  }
+
+#define PRE_INC_STORE(T, N)						\
+  T *									\
+  T ## _pre_inc_store_ ## N (T *p, T v)					\
+  {									\
+    p = p + N;								\
+    *p = v;								\
+    return p;								\
+  }
+
+#define POST_DEC_STORE(T, N)						\
+  T *									\
+  T ## _post_dec_store_ ## N (T *p, T v)				\
+  {									\
+    *p = v;								\
+    p = p - N;								\
+    return p;								\
+  }
+
+#define POST_INC_STORE(T, N)						\
+  T *									\
+  T ## _post_inc_store_ ## N (T *p, T v)				\
+  {									\
+    *p = v;								\
+    p = p + N;								\
+    return p;								\
+  }
+
+#define LR_REG_IMM(T, IMM)						\
+  intX_t								\
+  lr_reg_imm_ ## T ## _ ## IMM (intX_t rs1, intX_t rs2)			\
+  {									\
+    return *(T*)(rs1 + (rs2 << IMM));					\
+  }
+
+#define SR_REG_IMM(T, IMM)						\
+  void									\
+  sr_reg_imm_ ## T ## _ ## IMM (intX_t rs1, intX_t rs2, T val)		\
+  {									\
+    *(T*)(rs1 + (rs2 << IMM)) = val;					\
+  }
+
+#define LR_REG_IMM_UPD(T, IMM)						\
+  intX_t								\
+  lr_reg_imm_upd_ ## T ## _ ## IMM (intX_t *rs1, intX_t rs2)		\
+  {									\
+    *rs1 = *rs1 + (rs2 << IMM);						\
+    return *(T*)(*rs1);							\
+  }
+
+#define SR_REG_IMM_UPD(T, IMM)						\
+  void									\
+  sr_reg_imm_upd_ ## T ## _ ## IMM (intX_t *rs1, intX_t rs2, T val)	\
+  {									\
+    *rs1 = *rs1 + (rs2 << IMM);						\
+    *(T*)(*rs1) = val;	 						\
+  }
+
+#define LRU_REG_IMM(T, IMM)						\
+  intX_t								\
+  lru_reg_imm_ ## T ## _ ## IMM (intX_t rs1, intX_t rs2)		\
+  {									\
+    rs2 = (uint32_t)rs2;						\
+    return *(T*)(rs1 + (rs2 << IMM));					\
+  }
+
+#define SRU_REG_IMM(T, IMM)						\
+  void									\
+  sru_reg_imm_ ## T ## _ ## IMM (intX_t rs1, intX_t rs2, T val)		\
+  {									\
+    rs2 = (uint32_t)rs2;						\
+    *(T*)(rs1 + (rs2 << IMM)) = val;					\
+  }
+
+#define LRU_REG_IMM_UPD(T, IMM)						\
+  intX_t								\
+  lru_reg_imm_upd_ ## T ## _ ## IMM (intX_t rs1, intX_t *rs2)		\
+  {									\
+    uintX_t rs2_32 = (uint32_t)*rs2;					\
+    intX_t t = rs1 + (rs2_32 << IMM);					\
+    intX_t v = *(T*)t;							\
+    *rs2 = t;								\
+    return v;								\
+  }
+
+#define SRU_REG_IMM_UPD(T, IMM)						\
+  void									\
+  sru_reg_imm_upd_ ## T ## _ ## IMM (intX_t rs1, intX_t *rs2, T val)	\
+  {									\
+    uintX_t rs2_32 = (uint32_t)*rs2;					\
+    intX_t t = rs1 + (rs2_32 << IMM);					\
+    *(T*)t = val;							\
+    *rs2 = t;								\
+  }
+
+#endif /* XTHEADMEMIDX_HELPERS_H */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-index-update.c b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-index-update.c
new file mode 100644
index 00000000000..8237b3e62cc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-index-update.c
@@ -0,0 +1,27 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_xtheadmemidx" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+LR_REG_IMM_UPD(int8_t, 0)
+LR_REG_IMM_UPD(uint8_t, 1)
+LR_REG_IMM_UPD(int16_t, 2)
+LR_REG_IMM_UPD(uint16_t, 3)
+LR_REG_IMM_UPD(int32_t, 0)
+#if __riscv_xlen == 64
+LR_REG_IMM_UPD(uint32_t, 1)
+LR_REG_IMM_UPD(int64_t, 2)
+#endif
+
+SR_REG_IMM_UPD(int8_t, 3)
+SR_REG_IMM_UPD(int16_t, 0)
+SR_REG_IMM_UPD(int32_t, 1)
+#if __riscv_xlen == 64
+SR_REG_IMM_UPD(int64_t, 2)
+#endif
+
+/* If the shifted value is used later, we cannot eliminate it.  */
+/* { dg-final { scan-assembler-times "slli" 5 { target { rv32 } } } } */
+/* { dg-final { scan-assembler-times "slli" 8 { target { rv64 } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-index-xtheadbb-update.c b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-index-xtheadbb-update.c
new file mode 100644
index 00000000000..069c66d7ef6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-index-xtheadbb-update.c
@@ -0,0 +1,27 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadbb_xtheadmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_xtheadbb_xtheadmemidx" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+LR_REG_IMM_UPD(int8_t, 0)
+LR_REG_IMM_UPD(uint8_t, 1)
+LR_REG_IMM_UPD(int16_t, 2)
+LR_REG_IMM_UPD(uint16_t, 3)
+LR_REG_IMM_UPD(int32_t, 0)
+#if __riscv_xlen == 64
+LR_REG_IMM_UPD(uint32_t, 1)
+LR_REG_IMM_UPD(int64_t, 2)
+#endif
+
+SR_REG_IMM_UPD(int8_t, 3)
+SR_REG_IMM_UPD(int16_t, 0)
+SR_REG_IMM_UPD(int32_t, 1)
+#if __riscv_xlen == 64
+SR_REG_IMM_UPD(int64_t, 2)
+#endif
+
+/* If the shifted value is used later, we cannot eliminate it.  */
+/* { dg-final { scan-assembler-times "slli" 5 { target { rv32 } } } } */
+/* { dg-final { scan-assembler-times "slli" 8 { target { rv64 } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-index-xtheadbb.c b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-index-xtheadbb.c
new file mode 100644
index 00000000000..c9bf1505061
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-index-xtheadbb.c
@@ -0,0 +1,36 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadbb_xtheadmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_xtheadbb_xtheadmemidx" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+LR_REG_IMM(int8_t, 0)
+/* { dg-final { scan-assembler-times "th.lrb\t\[^\n\r\]*0" 1 } } */
+LR_REG_IMM(uint8_t, 1)
+/* { dg-final { scan-assembler-times "th.lrbu\t\[^\n\r\]*1" 1 } } */
+LR_REG_IMM(int16_t, 2)
+/* { dg-final { scan-assembler-times "th.lrh\t\[^\n\r\]*2" 1 } } */
+LR_REG_IMM(uint16_t, 3)
+/* { dg-final { scan-assembler-times "th.lrhu\t\[^\n\r\]*3" 1 } } */
+LR_REG_IMM(int32_t, 0)
+/* { dg-final { scan-assembler-times "th.lrw\t\[^\n\r\]*0" 1 } } */
+#if __riscv_xlen == 64
+LR_REG_IMM(uint32_t, 1)
+/* { dg-final { scan-assembler-times "th.lrwu\t\[^\n\r\]*1" 1 { target { rv64 } } } } */
+LR_REG_IMM(int64_t, 2)
+/* { dg-final { scan-assembler-times "th.lrd\t\[^\n\r\]*2" 1 { target { rv64 } } } } */
+#endif
+
+SR_REG_IMM(int8_t, 3)
+/* { dg-final { scan-assembler-times "th.srb\t\[^\n\r\]*3" 1 } } */
+SR_REG_IMM(int16_t, 0)
+/* { dg-final { scan-assembler-times "th.srh\t\[^\n\r\]*0" 1 } } */
+SR_REG_IMM(int32_t, 1)
+/* { dg-final { scan-assembler-times "th.srw\t\[^\n\r\]*1" 1 } } */
+#if __riscv_xlen == 64
+SR_REG_IMM(int64_t, 2)
+/* { dg-final { scan-assembler-times "th.srd\t\[^\n\r\]*2" 1 { target { rv64 } } } } */
+#endif
+
+/* { dg-final { scan-assembler-not "slli" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-index.c b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-index.c
new file mode 100644
index 00000000000..3fa0e8fa355
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-index.c
@@ -0,0 +1,36 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_xtheadmemidx" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+LR_REG_IMM(int8_t, 0)
+/* { dg-final { scan-assembler-times "th.lrb\t\[^\n\r\]*0" 1 } } */
+LR_REG_IMM(uint8_t, 1)
+/* { dg-final { scan-assembler-times "th.lrbu\t\[^\n\r\]*1" 1 } } */
+LR_REG_IMM(int16_t, 2)
+/* { dg-final { scan-assembler-times "th.lrh\t\[^\n\r\]*2" 1 } } */
+LR_REG_IMM(uint16_t, 3)
+/* { dg-final { scan-assembler-times "th.lrhu\t\[^\n\r\]*3" 1 } } */
+LR_REG_IMM(int32_t, 0)
+/* { dg-final { scan-assembler-times "th.lrw\t\[^\n\r\]*0" 1 } } */
+#if __riscv_xlen == 64
+LR_REG_IMM(uint32_t, 1)
+/* { dg-final { scan-assembler-times "th.lrwu\t\[^\n\r\]*1" 1 { target { rv64 } } } } */
+LR_REG_IMM(int64_t, 2)
+/* { dg-final { scan-assembler-times "th.lrd\t\[^\n\r\]*2" 1 { target { rv64 } } } } */
+#endif
+
+SR_REG_IMM(int8_t, 3)
+/* { dg-final { scan-assembler-times "th.srb\t\[^\n\r\]*3" 1 } } */
+SR_REG_IMM(int16_t, 0)
+/* { dg-final { scan-assembler-times "th.srh\t\[^\n\r\]*0" 1 } } */
+SR_REG_IMM(int32_t, 1)
+/* { dg-final { scan-assembler-times "th.srw\t\[^\n\r\]*1" 1 } } */
+#if __riscv_xlen == 64
+SR_REG_IMM(int64_t, 2)
+/* { dg-final { scan-assembler-times "th.srd\t\[^\n\r\]*2" 1 { target { rv64 } } } } */
+#endif
+
+/* { dg-final { scan-assembler-not "slli" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-modify-xtheadbb.c b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-modify-xtheadbb.c
new file mode 100644
index 00000000000..745a1be8a53
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-modify-xtheadbb.c
@@ -0,0 +1,74 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadbb_xtheadmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_xtheadbb_xtheadmemidx" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+/* The displacement is a simm5 shifted left by an imm2:
+   imm2  | range (simm5 << imm2)
+     0   |  -16..-1,0..15
+     1   |  -32..-2,0..30
+     2   |  -64..-4,0..60
+     3   | -128..-8,0..120  */
+
+POST_INC_LOAD(int8_t, 1)
+/* { dg-final { scan-assembler "th.lbia\[^\n\r\]*1,0" } } */
+PRE_DEC_LOAD(int8_t, 32)
+/* { dg-final { scan-assembler "th.lbib\[^\n\r\]*-16,1" } } */
+
+POST_DEC_LOAD(uint8_t, 1)
+/* { dg-final { scan-assembler "th.lbuia\[^\n\r\]*-1,0" } } */
+PRE_INC_LOAD(uint8_t, 32)
+/* { dg-final { scan-assembler "th.lbuib\[^\n\r\]*8,2" } } */
+
+POST_INC_LOAD(int16_t, 1)
+/* { dg-final { scan-assembler "th.lhia\[^\n\r\]*2,0" } } */
+POST_DEC_LOAD(int16_t, 64)
+/* { dg-final { scan-assembler "th.lhia\[^\n\r\]*-16,3" } } */
+
+POST_DEC_LOAD(uint16_t, 1)
+/* { dg-final { scan-assembler "th.lhuia\[^\n\r\]*-2,0" } } */
+POST_INC_LOAD(uint16_t, 60)
+/* { dg-final { scan-assembler "th.lhuia\[^\n\r\]*15,3" } } */
+
+POST_INC_LOAD(int32_t, 1)
+/* { dg-final { scan-assembler "th.lwia\[^\n\r\]*4,0" } } */
+PRE_DEC_LOAD(int32_t, 32)
+/* { dg-final { scan-assembler "th.lwib\[^\n\r\]*-16,3" } } */
+
+#if __riscv_xlen == 64
+POST_DEC_LOAD(uint32_t, 1)
+/* { dg-final { scan-assembler "th.lwuia\[^\n\r\]*-4,0" { target { rv64 } } } } */
+PRE_INC_LOAD(uint32_t, 15)
+/* { dg-final { scan-assembler "th.lwuib\[^\n\r\]*15,2" { target { rv64 } } } } */
+
+POST_INC_LOAD(int64_t, 1)
+/* { dg-final { scan-assembler "th.ldia\[^\n\r\]*8,0" { target { rv64 } } } } */
+PRE_DEC_LOAD(int64_t, 16)
+/* { dg-final { scan-assembler "th.ldib\[^\n\r\]*-16,3" { target { rv64 } } } } */
+#endif
+
+POST_DEC_STORE(int8_t, 1)
+/* { dg-final { scan-assembler "th.sbia\[^\n\r\]*-1,0" } } */
+PRE_INC_STORE(int8_t, 120)
+/* { dg-final { scan-assembler "th.sbib\[^\n\r\]*15,3" } } */
+
+POST_INC_STORE(int16_t, 1)
+/* { dg-final { scan-assembler "th.shia\[^\n\r\]*2,0" } } */
+PRE_DEC_STORE(int16_t, 64)
+/* { dg-final { scan-assembler "th.shib\[^\n\r\]*-16,3" } } */
+
+POST_DEC_STORE(int32_t, 1)
+/* { dg-final { scan-assembler "th.swia\[^\n\r\]*-4,0" } } */
+PRE_INC_STORE(int32_t, 2)
+/* { dg-final { scan-assembler "th.swib\[^\n\r\]*8,0" } } */
+
+#if __riscv_xlen == 64
+POST_INC_STORE(int64_t, 1)
+/* { dg-final { scan-assembler "th.sdia\[^\n\r\]*8,0" { target { rv64 } } } } */
+PRE_DEC_STORE(int64_t, 8)
+/* { dg-final { scan-assembler "th.sdib\[^\n\r\]*-16,2" { target { rv64 } } } } */
+#endif
+
+/* { dg-final { scan-assembler-not "\taddi" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-modify.c b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-modify.c
new file mode 100644
index 00000000000..e1ec0bdabd7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-modify.c
@@ -0,0 +1,74 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_xtheadmemidx" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+/* The displacement is a simm5 shifted left by an imm2:
+   imm2  | range (simm5 << imm2)
+     0   |  -16..-1,0..15
+     1   |  -32..-2,0..30
+     2   |  -64..-4,0..60
+     3   | -128..-8,0..120  */
+
+POST_INC_LOAD(int8_t, 1)
+/* { dg-final { scan-assembler "th.lbia\[^\n\r\]*1,0" } } */
+PRE_DEC_LOAD(int8_t, 32)
+/* { dg-final { scan-assembler "th.lbib\[^\n\r\]*-16,1" } } */
+
+POST_DEC_LOAD(uint8_t, 1)
+/* { dg-final { scan-assembler "th.lbuia\[^\n\r\]*-1,0" } } */
+PRE_INC_LOAD(uint8_t, 32)
+/* { dg-final { scan-assembler "th.lbuib\[^\n\r\]*8,2" } } */
+
+POST_INC_LOAD(int16_t, 1)
+/* { dg-final { scan-assembler "th.lhia\[^\n\r\]*2,0" } } */
+POST_DEC_LOAD(int16_t, 64)
+/* { dg-final { scan-assembler "th.lhia\[^\n\r\]*-16,3" } } */
+
+POST_DEC_LOAD(uint16_t, 1)
+/* { dg-final { scan-assembler "th.lhuia\[^\n\r\]*-2,0" } } */
+POST_INC_LOAD(uint16_t, 60)
+/* { dg-final { scan-assembler "th.lhuia\[^\n\r\]*15,3" } } */
+
+POST_INC_LOAD(int32_t, 1)
+/* { dg-final { scan-assembler "th.lwia\[^\n\r\]*4,0" } } */
+PRE_DEC_LOAD(int32_t, 32)
+/* { dg-final { scan-assembler "th.lwib\[^\n\r\]*-16,3" } } */
+
+#if __riscv_xlen == 64
+POST_DEC_LOAD(uint32_t, 1)
+/* { dg-final { scan-assembler "th.lwuia\[^\n\r\]*-4,0" { target { rv64 } } } } */
+PRE_INC_LOAD(uint32_t, 15)
+/* { dg-final { scan-assembler "th.lwuib\[^\n\r\]*15,2" { target { rv64 } } } } */
+
+POST_INC_LOAD(int64_t, 1)
+/* { dg-final { scan-assembler "th.ldia\[^\n\r\]*8,0" { target { rv64 } } } } */
+PRE_DEC_LOAD(int64_t, 16)
+/* { dg-final { scan-assembler "th.ldib\[^\n\r\]*-16,3" { target { rv64 } } } } */
+#endif
+
+POST_DEC_STORE(int8_t, 1)
+/* { dg-final { scan-assembler "th.sbia\[^\n\r\]*-1,0" } } */
+PRE_INC_STORE(int8_t, 120)
+/* { dg-final { scan-assembler "th.sbib\[^\n\r\]*15,3" } } */
+
+POST_INC_STORE(int16_t, 1)
+/* { dg-final { scan-assembler "th.shia\[^\n\r\]*2,0" } } */
+PRE_DEC_STORE(int16_t, 64)
+/* { dg-final { scan-assembler "th.shib\[^\n\r\]*-16,3" } } */
+
+POST_DEC_STORE(int32_t, 1)
+/* { dg-final { scan-assembler "th.swia\[^\n\r\]*-4,0" } } */
+PRE_INC_STORE(int32_t, 2)
+/* { dg-final { scan-assembler "th.swib\[^\n\r\]*8,0" } } */
+
+#if __riscv_xlen == 64
+POST_INC_STORE(int64_t, 1)
+/* { dg-final { scan-assembler "th.sdia\[^\n\r\]*8,0" { target { rv64 } } } } */
+PRE_DEC_STORE(int64_t, 8)
+/* { dg-final { scan-assembler "th.sdib\[^\n\r\]*-16,2" { target { rv64 } } } } */
+#endif
+
+/* { dg-final { scan-assembler-not "\taddi" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-uindex-update.c b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-uindex-update.c
new file mode 100644
index 00000000000..c46d7d1474d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-uindex-update.c
@@ -0,0 +1,27 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" "-Os" "-Oz" } } */
+/* { dg-options "-march=rv64gc_xtheadmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_xtheadmemidx" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+LRU_REG_IMM_UPD(int8_t, 0)
+LRU_REG_IMM_UPD(uint8_t, 1)
+LRU_REG_IMM_UPD(int16_t, 2)
+LRU_REG_IMM_UPD(uint16_t, 3)
+LRU_REG_IMM_UPD(int32_t, 0)
+#if __riscv_xlen == 64
+LRU_REG_IMM_UPD(uint32_t, 1)
+LRU_REG_IMM_UPD(int64_t, 2)
+#endif
+
+SRU_REG_IMM_UPD(int8_t, 3)
+SRU_REG_IMM_UPD(int16_t, 0)
+SRU_REG_IMM_UPD(int32_t, 1)
+#if __riscv_xlen == 64
+SRU_REG_IMM_UPD(int64_t, 2)
+#endif
+
+/* If the shifted value is used later, we cannot eliminate it.  */
+/* { dg-final { scan-assembler-times "slli" 5 { target { rv32 } } } } */
+/* { dg-final { scan-assembler-times "slli" 8 { target { rv64 } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-uindex-xtheadbb-update.c b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-uindex-xtheadbb-update.c
new file mode 100644
index 00000000000..dc69f228371
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-uindex-xtheadbb-update.c
@@ -0,0 +1,27 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadbb_xtheadmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_xtheadbb_xtheadmemidx" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+LRU_REG_IMM_UPD(int8_t, 0)
+LRU_REG_IMM_UPD(uint8_t, 1)
+LRU_REG_IMM_UPD(int16_t, 2)
+LRU_REG_IMM_UPD(uint16_t, 3)
+LRU_REG_IMM_UPD(int32_t, 0)
+#if __riscv_xlen == 64
+LRU_REG_IMM_UPD(uint32_t, 1)
+LRU_REG_IMM_UPD(int64_t, 2)
+#endif
+
+SRU_REG_IMM_UPD(int8_t, 3)
+SRU_REG_IMM_UPD(int16_t, 0)
+SRU_REG_IMM_UPD(int32_t, 1)
+#if __riscv_xlen == 64
+SRU_REG_IMM_UPD(int64_t, 2)
+#endif
+
+/* If the shifted value is used later, we cannot eliminate it.  */
+/* { dg-final { scan-assembler-times "slli" 5 { target { rv32 } } } } */
+/* { dg-final { scan-assembler-times "slli" 8 { target { rv64 } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-uindex-xtheadbb.c b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-uindex-xtheadbb.c
new file mode 100644
index 00000000000..6116762b839
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-uindex-xtheadbb.c
@@ -0,0 +1,44 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadbb_xtheadmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_xtheadbb_xtheadmemidx" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+LRU_REG_IMM(int8_t, 0)
+/* { dg-final { scan-assembler-times "th.lurb\t\[^\n\r\]*0" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.lrb\t\[^\n\r\]*0" 1 { target { rv32 } } } } */
+LRU_REG_IMM(uint8_t, 1)
+/* { dg-final { scan-assembler-times "th.lurbu\t\[^\n\r\]*1" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.lrbu\t\[^\n\r\]*1" 1 { target { rv32 } } } } */
+LRU_REG_IMM(int16_t, 2)
+/* { dg-final { scan-assembler-times "th.lurh\t\[^\n\r\]*2" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.lrh\t\[^\n\r\]*2" 1 { target { rv32 } } } } */
+LRU_REG_IMM(uint16_t, 3)
+/* { dg-final { scan-assembler-times "th.lurhu\t\[^\n\r\]*3" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.lrhu\t\[^\n\r\]*3" 1 { target { rv32 } } } } */
+LRU_REG_IMM(int32_t, 0)
+/* { dg-final { scan-assembler-times "th.lurw\t\[^\n\r\]*0" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.lrw\t\[^\n\r\]*0" 1 { target { rv32 } } } } */
+#if __riscv_xlen == 64
+LRU_REG_IMM(uint32_t, 1)
+/* { dg-final { scan-assembler-times "th.lurwu\t\[^\n\r\]*1" 1 { target { rv64 } } } } */
+LRU_REG_IMM(int64_t, 2)
+/* { dg-final { scan-assembler-times "th.lurd\t\[^\n\r\]*2" 1 { target { rv64 } } } } */
+#endif
+
+SRU_REG_IMM(int8_t, 3)
+/* { dg-final { scan-assembler-times "th.surb\t\[^\n\r\]*3" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.srb\t\[^\n\r\]*3" 1 { target { rv32 } } } } */
+SRU_REG_IMM(int16_t, 0)
+/* { dg-final { scan-assembler-times "th.surh\t\[^\n\r\]*0" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.srh\t\[^\n\r\]*0" 1 { target { rv32 } } } } */
+SRU_REG_IMM(int32_t, 1)
+/* { dg-final { scan-assembler-times "th.surw\t\[^\n\r\]*1" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.srw\t\[^\n\r\]*1" 1 { target { rv32 } } } } */
+#if __riscv_xlen == 64
+SRU_REG_IMM(int64_t, 2)
+/* { dg-final { scan-assembler-times "th.surd\t\[^\n\r\]*2" 1 { target { rv64 } } } } */
+#endif
+
+/* { dg-final { scan-assembler-not "slli" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-uindex.c b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-uindex.c
new file mode 100644
index 00000000000..b2ff4542347
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-uindex.c
@@ -0,0 +1,44 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_xtheadmemidx" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+LRU_REG_IMM(int8_t, 0)
+/* { dg-final { scan-assembler-times "th.lurb\t\[^\n\r\]*0" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.lrb\t\[^\n\r\]*0" 1 { target { rv32 } } } } */
+LRU_REG_IMM(uint8_t, 1)
+/* { dg-final { scan-assembler-times "th.lurbu\t\[^\n\r\]*1" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.lrbu\t\[^\n\r\]*1" 1 { target { rv32 } } } } */
+LRU_REG_IMM(int16_t, 2)
+/* { dg-final { scan-assembler-times "th.lurh\t\[^\n\r\]*2" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.lrh\t\[^\n\r\]*2" 1 { target { rv32 } } } } */
+LRU_REG_IMM(uint16_t, 3)
+/* { dg-final { scan-assembler-times "th.lurhu\t\[^\n\r\]*3" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.lrhu\t\[^\n\r\]*3" 1 { target { rv32 } } } } */
+LRU_REG_IMM(int32_t, 0)
+/* { dg-final { scan-assembler-times "th.lurw\t\[^\n\r\]*0" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.lrw\t\[^\n\r\]*0" 1 { target { rv32 } } } } */
+#if __riscv_xlen == 64
+LRU_REG_IMM(uint32_t, 1)
+/* { dg-final { scan-assembler-times "th.lurwu\t\[^\n\r\]*1" 1 { target { rv64 } } } } */
+LRU_REG_IMM(int64_t, 2)
+/* { dg-final { scan-assembler-times "th.lurd\t\[^\n\r\]*2" 1 { target { rv64 } } } } */
+#endif
+
+SRU_REG_IMM(int8_t, 3)
+/* { dg-final { scan-assembler-times "th.surb\t\[^\n\r\]*3" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.srb\t\[^\n\r\]*3" 1 { target { rv32 } } } } */
+SRU_REG_IMM(int16_t, 0)
+/* { dg-final { scan-assembler-times "th.surh\t\[^\n\r\]*0" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.srh\t\[^\n\r\]*0" 1 { target { rv32 } } } } */
+SRU_REG_IMM(int32_t, 1)
+/* { dg-final { scan-assembler-times "th.surw\t\[^\n\r\]*1" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.srw\t\[^\n\r\]*1" 1 { target { rv32 } } } } */
+#if __riscv_xlen == 64
+SRU_REG_IMM(int64_t, 2)
+/* { dg-final { scan-assembler-times "th.surd\t\[^\n\r\]*2" 1 { target { rv64 } } } } */
+#endif
+
+/* { dg-final { scan-assembler-not "slli" } } */