[6/6] Support Intel AMX-MOVRS

Message ID 20241113084435.1784546-7-haochen.jiang@intel.com
State New
Series Support Intel Diamond Rapids AMX instructions

Checks

Context Check Description
linaro-tcwg-bot/tcwg_binutils_build--master-aarch64 fail Patch failed to apply
linaro-tcwg-bot/tcwg_binutils_build--master-arm fail Patch failed to apply

Commit Message

Haochen Jiang Nov. 13, 2024, 8:44 a.m. UTC
  From: "Hu, Lin1" <lin1.hu@intel.com>

This patch adds support for the AMX-MOVRS feature. No special handling is
needed, since TMM pairs are already handled by AMX-TRANSPOSE.

gas/ChangeLog:

	* NEWS: Support Intel AMX-MOVRS.
	* config/tc-i386.c: Add amx_movrs.
	* doc/c-i386.texi: Document .amx_movrs.
	* testsuite/gas/i386/i386.exp: Run AMX-MOVRS tests.
	* testsuite/gas/i386/x86-64.exp: Ditto.
	* testsuite/gas/i386/amx-movrs-inval.l: New test.
	* testsuite/gas/i386/amx-movrs-inval.s: Ditto.
	* testsuite/gas/i386/x86-64-amx-movrs-intel.d: Ditto.
	* testsuite/gas/i386/x86-64-amx-movrs-inval.l: Ditto.
	* testsuite/gas/i386/x86-64-amx-movrs-inval.s: Ditto.
	* testsuite/gas/i386/x86-64-amx-movrs.d: Ditto.
	* testsuite/gas/i386/x86-64-amx-movrs.s: Ditto.

opcodes/ChangeLog:

	* i386-dis.c (MOD_VEX_0F384A_X86_64): New.
	(MOD_VEX_MAP5_F8_X86_64): Ditto.
	(MOD_VEX_MAP5_F9_X86_64): Ditto.
	(PREFIX_VEX_0F384A_X86_64_M_0_L_0_W_0): Ditto.
	(PREFIX_VEX_MAP5_F8_X86_64_M_0_L_0_W_0): Ditto.
	(PREFIX_VEX_MAP5_F9_X86_64_M_0_L_0_W_0): Ditto.
	(X86_64_VEX_0F384A): Ditto.
	(X86_64_VEX_MAP5_F8): Ditto.
	(X86_64_VEX_MAP5_F9): Ditto.
	(VEX_LEN_0F384A_X86_64_M_0): Ditto.
	(VEX_LEN_MAP5_F8_X86_64_M_0): Ditto.
	(VEX_LEN_MAP5_F9_X86_64_M_0): Ditto.
	(VEX_W_0F384A_X86_64_M_0_L_0): Ditto.
	(VEX_W_MAP5_F8_X86_64_M_0_L_0): Ditto.
	(VEX_W_MAP5_F9_X86_64_M_0_L_0): Ditto.
	(prefix_table): Add PREFIX_VEX_0F384A_X86_64_M_0_L_0_W_0,
	PREFIX_VEX_MAP5_F8_X86_64_M_0_L_0_W_0
	and PREFIX_VEX_MAP5_F9_X86_64_M_0_L_0_W_0.
	(x86_64_table): Add X86_64_VEX_0F384A,
	X86_64_VEX_MAP5_F8 and X86_64_VEX_MAP5_F9.
	(vex_table): Use X86_64_VEX_0F384A, X86_64_VEX_MAP5_F8
	and X86_64_VEX_MAP5_F9.
	(vex_len_table): Add VEX_LEN_0F384A_X86_64_M_0,
	VEX_LEN_MAP5_F8_X86_64_M_0 and VEX_LEN_MAP5_F9_X86_64_M_0.
	(vex_w_table): Add VEX_W_0F384A_X86_64_M_0_L_0,
	VEX_W_MAP5_F8_X86_64_M_0_L_0 and VEX_W_MAP5_F9_X86_64_M_0_L_0.
	(mod_table): Add MOD_VEX_0F384A_X86_64,
	MOD_VEX_MAP5_F8_X86_64 and MOD_VEX_MAP5_F9_X86_64.
	* i386-gen.c (isa_dependencies): Add AMX_MOVRS.
	(cpu_flags): Ditto.
	* i386-init.h: Regenerated.
	* i386-mnem.h: Ditto.
	* i386-opc.h (CpuAMX_MOVRS): New.
	(i386_cpu_flags): Add cpuamx_movrs.
	* i386-opc.tbl: Add AMX-MOVRS instructions.
	* i386-tbl.h: Regenerated.
---
 gas/NEWS                                      |    2 +
 gas/config/tc-i386.c                          |    1 +
 gas/doc/c-i386.texi                           |    4 +-
 gas/testsuite/gas/i386/amx-movrs-inval.l      |    7 +
 gas/testsuite/gas/i386/amx-movrs-inval.s      |   11 +
 gas/testsuite/gas/i386/i386.exp               |    1 +
 .../gas/i386/x86-64-amx-movrs-intel.d         |   23 +
 .../gas/i386/x86-64-amx-movrs-inval.l         |    5 +
 .../gas/i386/x86-64-amx-movrs-inval.s         |    9 +
 gas/testsuite/gas/i386/x86-64-amx-movrs.d     |   21 +
 gas/testsuite/gas/i386/x86-64-amx-movrs.s     |   31 +
 gas/testsuite/gas/i386/x86-64.exp             |    3 +
 opcodes/i386-dis.c                            |  100 +-
 opcodes/i386-gen.c                            |    3 +
 opcodes/i386-init.h                           |  688 +--
 opcodes/i386-mnem.h                           | 4362 +++++++++--------
 opcodes/i386-opc.h                            |    3 +
 opcodes/i386-opc.tbl                          |    7 +
 opcodes/i386-tbl.h                            |  287 +-
 19 files changed, 2942 insertions(+), 2626 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/amx-movrs-inval.l
 create mode 100644 gas/testsuite/gas/i386/amx-movrs-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-movrs-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-movrs-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-movrs-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-movrs.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-movrs.s
  

Comments

Jan Beulich Nov. 18, 2024, 3:39 p.m. UTC | #1
On 13.11.2024 09:44, Haochen Jiang wrote:
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-amx-movrs-inval.l
> @@ -0,0 +1,5 @@
> +.* Assembler messages:
> +.*:6: Error: `t2rpntlvwz0rs' is not supported on `x86_64.noamx_transpose'
> +.*:7: Error: `t2rpntlvwz0rst1' is not supported on `x86_64.noamx_transpose'
> +.*:8: Error: `t2rpntlvwz1rs' is not supported on `x86_64.noamx_transpose'
> +.*:9: Error: `t2rpntlvwz1rst1' is not supported on `x86_64.noamx_transpose'
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-amx-movrs-inval.s
> @@ -0,0 +1,9 @@
> +# Check Invalid 64bit AMX-MOVRS instructions
> +
> +	.text
> +	.arch .noamx_transpose
> +_start:
> +	t2rpntlvwz0rs	(%r9), %tmm3
> +	t2rpntlvwz0rst1	(%r9), %tmm3
> +	t2rpntlvwz1rs	(%r9), %tmm3
> +	t2rpntlvwz1rst1	(%r9), %tmm3

This is too little imo - the SIBMEM constraints also want checking.

> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -3197,6 +3197,11 @@ t2rpntlvwz0t1, 0x6f, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW
>  t2rpntlvwz1, 0x666e, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
>  t2rpntlvwz1t1, 0x666f, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
>  
> +t2rpntlvwz0rs, 0xf8, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
> +t2rpntlvwz0rst1, 0xf9, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
> +t2rpntlvwz1rs, 0x66f8, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
> +t2rpntlvwz1rst1, 0x66f9, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }

Judging from VMOVRS{B,W,D,Q}, shouldn't the RS infix move a little earlier,
ahead of the element width specifier: T2RPNTLVRSW{Z0,Z1}{,T1}?

> @@ -3230,6 +3235,8 @@ tdphf8ps, 0x66fd, AMX_FP8, Modrm|Vex128|xVexMap5|Src2VVVV|VexW0|NoSuf, { RegTMM,
>  
>  tileloadd, 0xf24b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
>  tileloaddt1, 0x664b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
> +tileloaddrs, 0xf24a, AMX_MOVRS, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
> +tileloaddrst1, 0x664a, AMX_MOVRS, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
>  tilemovrow, 0x664a, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
>  tilemovrow, 0x6607, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
>  tilestored, 0xf34b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { RegTMM, Unspecified|BaseIndex }

Same here: TILELOADRSD{,T1} would seem like a better match for VMOVRS{B,W,D,Q}.

For all of these: What about their APX forms (presumably simply re-encoded as
EVEX at the same position in the opcode map)?

Jan
  
Haochen Jiang Nov. 19, 2024, 5:46 a.m. UTC | #2
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Monday, November 18, 2024 11:39 PM
> To: Jiang, Haochen <haochen.jiang@intel.com>; Hu, Lin1 <lin1.hu@intel.com>
> Cc: binutils@sourceware.org; H.J. Lu <hjl.tools@gmail.com>
> Subject: Re: [PATCH 6/6] Support Intel AMX-MOVRS
>  
> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -3197,6 +3197,11 @@ t2rpntlvwz0t1, 0x6f, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW
> >  t2rpntlvwz1, 0x666e, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
> >  t2rpntlvwz1t1, 0x666f, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
> >
> > +t2rpntlvwz0rs, 0xf8, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
> > +t2rpntlvwz0rst1, 0xf9, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
> > +t2rpntlvwz1rs, 0x66f8, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
> > +t2rpntlvwz1rst1, 0x66f9, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
> 
> Judging from VMOVRS{B,W,D,Q}, shouldn't the RS infix move a little earlier,
> ahead of the element width specifier: T2RPNTLVRSW{Z0,Z1}{,T1}?
> ...
> Same here: TILELOADRSD{,T1} would seem like a better match for
> VMOVRS{B,W,D,Q}.

Per my understanding, D is the element size in tile, while RS hint is for whole
tile, not for the elements in tile. That is why RS will not appear before D.

Thx,
Haochen
  
Jan Beulich Nov. 19, 2024, 9:07 a.m. UTC | #3
On 19.11.2024 06:46, Jiang, Haochen wrote:
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: Monday, November 18, 2024 11:39 PM
>> To: Jiang, Haochen <haochen.jiang@intel.com>; Hu, Lin1 <lin1.hu@intel.com>
>> Cc: binutils@sourceware.org; H.J. Lu <hjl.tools@gmail.com>
>> Subject: Re: [PATCH 6/6] Support Intel AMX-MOVRS
>>  
>>> --- a/opcodes/i386-opc.tbl
>>> +++ b/opcodes/i386-opc.tbl
>>> @@ -3197,6 +3197,11 @@ t2rpntlvwz0t1, 0x6f, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW
>>>  t2rpntlvwz1, 0x666e, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
>>>  t2rpntlvwz1t1, 0x666f, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
>>>
>>> +t2rpntlvwz0rs, 0xf8, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
>>> +t2rpntlvwz0rst1, 0xf9, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
>>> +t2rpntlvwz1rs, 0x66f8, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
>>> +t2rpntlvwz1rst1, 0x66f9, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
>>
>> Judging from VMOVRS{B,W,D,Q}, shouldn't the RS infix move a little earlier,
>> ahead of the element width specifier: T2RPNTLVRSW{Z0,Z1}{,T1}?
>> ...
>> Same here: TILELOADRSD{,T1} would seem like a better match for
>> VMOVRS{B,W,D,Q}.
> 
> Per my understanding, D is the element size in tile, while RS hint is for whole
> tile, not for the elements in tile. That is why RS will not appear before D.

Yet the same can be said for a vector then, when it's VMOVRS{B,W,D,Q}. Of
course the question could also be put the other way around, requesting
VMOVRS{B,W,D,Q} to be changed. Yet then for all vector and (prior) tile
insns element size is last, with everything pertaining to the whole
vector/tile coming first (take e.g. VGATHERPF<n>{D,Q}P{S,D}). All I'm
- as usual - asking for is consistency. Which in this case may actually
mean moving W to the very end of the mnemonics, unlike I said in my
original reply: i.e. T2RPNTLVRS{Z0,Z1}W{,T1} or T2RPNTLVRS{Z0,Z1}{,T1}W.
I didn't spell out these forms because there's no precedent for Z<n>
(and maybe T<n>, albeit the S/G prefetches have some similarity there)
combined with {B,W,D,Q}, and hence their ordering doesn't (currently)
matter as far as consistency goes.

Jan
  
Haochen Jiang Nov. 21, 2024, 5:58 a.m. UTC | #4
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Tuesday, November 19, 2024 5:08 PM
> 
> On 19.11.2024 06:46, Jiang, Haochen wrote:
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: Monday, November 18, 2024 11:39 PM
> >> To: Jiang, Haochen <haochen.jiang@intel.com>; Hu, Lin1
> <lin1.hu@intel.com>
> >> Cc: binutils@sourceware.org; H.J. Lu <hjl.tools@gmail.com>
> >> Subject: Re: [PATCH 6/6] Support Intel AMX-MOVRS
> >>
> >> Judging from VMOVRS{B,W,D,Q}, shouldn't the RS infix move a little earlier,
> >> ahead of the element width specifier: T2RPNTLVRSW{Z0,Z1}{,T1}?
> >> ...
> >> Same here: TILELOADRSD{,T1} would seem like a better match for
> >> VMOVRS{B,W,D,Q}.
> >
> > Per my understanding, D is the element size in tile, while RS hint is for whole
> > tile, not for the elements in tile. That is why RS will not appear before D.
> 
> Yet the same can be said for a vector then, when it's VMOVRS{B,W,D,Q}. Of
> course the question could also be put the other way around, requesting
> VMOVRS{B,W,D,Q} to be changed. Yet then for all vector and (prior) tile
> insns element size is last, with everything pertaining to the whole
> vector/tile coming first (take e.g. VGATHERPF<n>{D,Q}P{S,D}). All I'm
> - as usual - asking for is consistency. Which in this case may actually
> mean moving W to the very end of the mnemonics, unlike I said in my
> original reply: i.e. T2RPNTLVRS{Z0,Z1}W{,T1} or T2RPNTLVRS{Z0,Z1}{,T1}W.
> I didn't spell out these forms because there's no precedent for Z<n>
> (and maybe T<n>, albeit the S/G prefetches have some similarity there)
> combined with {B,W,D,Q}, and hence their ordering doesn't (currently)
> matter as far as consistency goes.

An update on this: I am confirming this with HW.

Thx,
Haochen
  

Patch

diff --git a/gas/NEWS b/gas/NEWS
index ba63043002e..1cf6ca652b5 100644
--- a/gas/NEWS
+++ b/gas/NEWS
@@ -1,5 +1,7 @@ 
 -*- text -*-
 
+* Add support for Intel AMX-MOVRS instructions.
+
 * Add support for Intel AMX-FP8 instructions.
 
 * Add support for Intel AMX-TF32 instructions.
diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 504d1e099f4..588629751a2 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1186,6 +1186,7 @@  static const arch_entry cpu_arch[] =
   SUBARCH (amx_avx512, AMX_AVX512, ANY_AMX_AVX512, false),
   SUBARCH (amx_tf32, AMX_TF32, ANY_AMX_TF32, false),
   SUBARCH (amx_fp8, AMX_FP8, ANY_AMX_FP8, false),
+  SUBARCH (amx_movrs, AMX_MOVRS, ANY_AMX_MOVRS, false),
   SUBARCH (amx_tile, AMX_TILE, ANY_AMX_TILE, false),
   SUBARCH (movdiri, MOVDIRI, MOVDIRI, false),
   SUBARCH (movdir64b, MOVDIR64B, MOVDIR64B, false),
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
index bd2f585e1e3..caceb38025a 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -232,6 +232,7 @@  accept various extension mnemonics.  For example,
 @code{amx_avx512},
 @code{amx_tf32},
 @code{amx_fp8}
+@code{amx_movrs},
 @code{amx_tile},
 @code{vmx},
 @code{vmfunc},
@@ -1705,7 +1706,8 @@  supported on the CPU specified.  The choices for @var{cpu_type} are:
 @item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk}
 @item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16}
 @item @samp{.amx_complex} @tab @samp{.amx_transpose} @tab @samp{.amx_avx512}
-@item @samp{.amx_tf32} @tab @samp {.amx_fp8} @tab @samp{.amx_tile}
+@item @samp{.amx_tf32} @tab @samp{.amx_fp8} @tab @samp{.amx_movrs}
+@item @samp{.amx_tile}
 @item @samp{.kl} @tab @samp{.widekl} @tab @samp{.uintr} @tab @samp{.hreset}
 @item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5}
 @item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme}
diff --git a/gas/testsuite/gas/i386/amx-movrs-inval.l b/gas/testsuite/gas/i386/amx-movrs-inval.l
new file mode 100644
index 00000000000..6fa829849d5
--- /dev/null
+++ b/gas/testsuite/gas/i386/amx-movrs-inval.l
@@ -0,0 +1,7 @@ 
+.* Assembler messages:
+.*:6: Error: `t2rpntlvwz0rs' is only supported in 64-bit mode
+.*:7: Error: `t2rpntlvwz0rst1' is only supported in 64-bit mode
+.*:8: Error: `t2rpntlvwz1rs' is only supported in 64-bit mode
+.*:9: Error: `t2rpntlvwz1rst1' is only supported in 64-bit mode
+.*:10: Error: `tileloaddrs' is only supported in 64-bit mode
+.*:11: Error: `tileloaddrst1' is only supported in 64-bit mode
diff --git a/gas/testsuite/gas/i386/amx-movrs-inval.s b/gas/testsuite/gas/i386/amx-movrs-inval.s
new file mode 100644
index 00000000000..09778e790bd
--- /dev/null
+++ b/gas/testsuite/gas/i386/amx-movrs-inval.s
@@ -0,0 +1,11 @@ 
+# Check Illegal 32bit AMX-MOVRS instructions
+
+	.allow_index_reg
+	.text
+_start:
+	t2rpntlvwz0rs	0x10000000(%esp, %esi, 8), %tmm6
+	t2rpntlvwz0rst1	0x10000000(%esp, %esi, 8), %tmm6
+	t2rpntlvwz1rs	0x10000000(%esp, %esi, 8), %tmm6
+	t2rpntlvwz1rst1	0x10000000(%esp, %esi, 8), %tmm6
+	tileloaddrs	0x10000000(%esp, %esi, 8), %tmm6
+	tileloaddrst1	0x10000000(%esp, %esi, 8), %tmm6
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index df0b7752ab6..acc50eafd33 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -550,6 +550,7 @@  if [gas_32_check] then {
     run_list_test "amx-avx512-inval"
     run_list_test "amx-tf32-inval"
     run_list_test "amx-fp8-inval"
+    run_list_test "amx-movrs-inval"
     run_list_test "sg"
     run_dump_test "clzero"
     run_dump_test "invlpgb"
diff --git a/gas/testsuite/gas/i386/x86-64-amx-movrs-intel.d b/gas/testsuite/gas/i386/x86-64-amx-movrs-intel.d
new file mode 100644
index 00000000000..f4cd0bd0911
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-movrs-intel.d
@@ -0,0 +1,23 @@ 
+#objdump: -dw -Mintel
+#name: x86_64 AMX-MOVRS insns (Intel disassembly)
+#source: x86-64-amx-movrs.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+#...
+[a-f0-9]+ <_intel>:
+\s*[a-f0-9]+:\s*c4 a5 78 f8 b4 f5 00 00 00 10\s+t2rpntlvwz0rs tmm6,\[rbp\+r14\*8\+0x10000000\]
+\s*[a-f0-9]+:\s*c4 c5 78 f8 14 21\s+t2rpntlvwz0rs tmm2,\[r9\+riz\*1\]
+\s*[a-f0-9]+:\s*c4 a5 78 f9 b4 f5 00 00 00 10\s+t2rpntlvwz0rst1 tmm6,\[rbp\+r14\*8\+0x10000000\]
+\s*[a-f0-9]+:\s*c4 c5 78 f9 14 21\s+t2rpntlvwz0rst1 tmm2,\[r9\+riz\*1\]
+\s*[a-f0-9]+:\s*c4 a5 79 f8 b4 f5 00 00 00 10\s+t2rpntlvwz1rs tmm6,\[rbp\+r14\*8\+0x10000000\]
+\s*[a-f0-9]+:\s*c4 c5 79 f8 14 21\s+t2rpntlvwz1rs tmm2,\[r9\+riz\*1\]
+\s*[a-f0-9]+:\s*c4 a5 79 f9 b4 f5 00 00 00 10\s+t2rpntlvwz1rst1 tmm6,\[rbp\+r14\*8\+0x10000000\]
+\s*[a-f0-9]+:\s*c4 c5 79 f9 14 21\s+t2rpntlvwz1rst1 tmm2,\[r9\+riz\*1\]
+\s*[a-f0-9]+:\s*c4 a2 7b 4a b4 f5 00 00 00 10\s+tileloaddrs tmm6,\[rbp\+r14\*8\+0x10000000\]
+\s*[a-f0-9]+:\s*c4 c2 7b 4a 1c 21\s+tileloaddrs tmm3,\[r9\+riz\*1\]
+\s*[a-f0-9]+:\s*c4 a2 79 4a b4 f5 00 00 00 10\s+tileloaddrst1 tmm6,\[rbp\+r14\*8\+0x10000000\]
+\s*[a-f0-9]+:\s*c4 c2 79 4a 1c 21\s+tileloaddrst1 tmm3,\[r9\+riz\*1\]
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-amx-movrs-inval.l b/gas/testsuite/gas/i386/x86-64-amx-movrs-inval.l
new file mode 100644
index 00000000000..0f7d72e73a9
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-movrs-inval.l
@@ -0,0 +1,5 @@ 
+.* Assembler messages:
+.*:6: Error: `t2rpntlvwz0rs' is not supported on `x86_64.noamx_transpose'
+.*:7: Error: `t2rpntlvwz0rst1' is not supported on `x86_64.noamx_transpose'
+.*:8: Error: `t2rpntlvwz1rs' is not supported on `x86_64.noamx_transpose'
+.*:9: Error: `t2rpntlvwz1rst1' is not supported on `x86_64.noamx_transpose'
diff --git a/gas/testsuite/gas/i386/x86-64-amx-movrs-inval.s b/gas/testsuite/gas/i386/x86-64-amx-movrs-inval.s
new file mode 100644
index 00000000000..b6499c81c05
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-movrs-inval.s
@@ -0,0 +1,9 @@ 
+# Check Invalid 64bit AMX-MOVRS instructions
+
+	.text
+	.arch .noamx_transpose
+_start:
+	t2rpntlvwz0rs	(%r9), %tmm3
+	t2rpntlvwz0rst1	(%r9), %tmm3
+	t2rpntlvwz1rs	(%r9), %tmm3
+	t2rpntlvwz1rst1	(%r9), %tmm3
diff --git a/gas/testsuite/gas/i386/x86-64-amx-movrs.d b/gas/testsuite/gas/i386/x86-64-amx-movrs.d
new file mode 100644
index 00000000000..b0bc77e8f15
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-movrs.d
@@ -0,0 +1,21 @@ 
+#objdump: -dw
+#name: x86_64 AMX-MOVRS insns
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*c4 a5 78 f8 b4 f5 00 00 00 10\s+t2rpntlvwz0rs 0x10000000\(%rbp,%r14,8\),%tmm6
+\s*[a-f0-9]+:\s*c4 c5 78 f8 14 21\s+t2rpntlvwz0rs \(%r9,%riz,1\),%tmm2
+\s*[a-f0-9]+:\s*c4 a5 78 f9 b4 f5 00 00 00 10\s+t2rpntlvwz0rst1 0x10000000\(%rbp,%r14,8\),%tmm6
+\s*[a-f0-9]+:\s*c4 c5 78 f9 14 21\s+t2rpntlvwz0rst1 \(%r9,%riz,1\),%tmm2
+\s*[a-f0-9]+:\s*c4 a5 79 f8 b4 f5 00 00 00 10\s+t2rpntlvwz1rs 0x10000000\(%rbp,%r14,8\),%tmm6
+\s*[a-f0-9]+:\s*c4 c5 79 f8 14 21\s+t2rpntlvwz1rs \(%r9,%riz,1\),%tmm2
+\s*[a-f0-9]+:\s*c4 a5 79 f9 b4 f5 00 00 00 10\s+t2rpntlvwz1rst1 0x10000000\(%rbp,%r14,8\),%tmm6
+\s*[a-f0-9]+:\s*c4 c5 79 f9 14 21\s+t2rpntlvwz1rst1 \(%r9,%riz,1\),%tmm2
+\s*[a-f0-9]+:\s*c4 a2 7b 4a b4 f5 00 00 00 10\s+tileloaddrs 0x10000000\(%rbp,%r14,8\),%tmm6
+\s*[a-f0-9]+:\s*c4 c2 7b 4a 1c 21\s+tileloaddrs \(%r9,%riz,1\),%tmm3
+\s*[a-f0-9]+:\s*c4 a2 79 4a b4 f5 00 00 00 10\s+tileloaddrst1 0x10000000\(%rbp,%r14,8\),%tmm6
+\s*[a-f0-9]+:\s*c4 c2 79 4a 1c 21\s+tileloaddrst1 \(%r9,%riz,1\),%tmm3
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-amx-movrs.s b/gas/testsuite/gas/i386/x86-64-amx-movrs.s
new file mode 100644
index 00000000000..10bc9bc1299
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-movrs.s
@@ -0,0 +1,31 @@ 
+# Check 64bit AMX-MOVRS instructions
+
+	.text
+_start:
+	t2rpntlvwz0rs	0x10000000(%rbp, %r14, 8), %tmm6
+	t2rpntlvwz0rs	(%r9), %tmm3
+	t2rpntlvwz0rst1	0x10000000(%rbp, %r14, 8), %tmm6
+	t2rpntlvwz0rst1	(%r9), %tmm3
+	t2rpntlvwz1rs	0x10000000(%rbp, %r14, 8), %tmm6
+	t2rpntlvwz1rs	(%r9), %tmm3
+	t2rpntlvwz1rst1	0x10000000(%rbp, %r14, 8), %tmm6
+	t2rpntlvwz1rst1	(%r9), %tmm3
+	tileloaddrs	0x10000000(%rbp, %r14, 8), %tmm6
+	tileloaddrs	(%r9), %tmm3
+	tileloaddrst1	0x10000000(%rbp, %r14, 8), %tmm6
+	tileloaddrst1	(%r9), %tmm3
+
+_intel:
+	.intel_syntax noprefix
+	t2rpntlvwz0rs	tmm6, [rbp+r14*8+0x10000000]
+	t2rpntlvwz0rs	tmm3, [r9]
+	t2rpntlvwz0rst1	tmm6, [rbp+r14*8+0x10000000]
+	t2rpntlvwz0rst1	tmm3, [r9]
+	t2rpntlvwz1rs	tmm6, [rbp+r14*8+0x10000000]
+	t2rpntlvwz1rs	tmm3, [r9]
+	t2rpntlvwz1rst1	tmm6, [rbp+r14*8+0x10000000]
+	t2rpntlvwz1rst1	tmm3, [r9]
+	tileloaddrs	tmm6, [rbp+r14*8+0x10000000]
+	tileloaddrs	tmm3, [r9]
+	tileloaddrst1	tmm6, [rbp+r14*8+0x10000000]
+	tileloaddrst1	tmm3, [r9]
diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
index 7d3f7ebe2b1..751c376ea84 100644
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -535,6 +535,9 @@  run_list_test "x86-64-amx-tf32-inval"
 run_dump_test "x86-64-amx-fp8"
 run_dump_test "x86-64-amx-fp8-intel"
 run_list_test "x86-64-amx-fp8-inval"
+run_dump_test "x86-64-amx-movrs"
+run_dump_test "x86-64-amx-movrs-intel"
+run_list_test "x86-64-amx-movrs-inval"
 run_dump_test "x86-64-clzero"
 run_dump_test "x86-64-mwaitx-bdver4"
 run_list_test "x86-64-mwaitx-reg"
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index 0fe2a5b48ec..f859bcce853 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -964,8 +964,11 @@  enum
   MOD_0F38F8,
 
   MOD_VEX_0F3849_X86_64_L_0_W_0,
+  MOD_VEX_0F384A_X86_64,
   MOD_VEX_0F386E_X86_64,
   MOD_VEX_0F386F_X86_64,
+  MOD_VEX_MAP5_F8_X86_64,
+  MOD_VEX_MAP5_F9_X86_64,
 
   MOD_EVEX_MAP4_60,
   MOD_EVEX_MAP4_61,
@@ -1135,6 +1138,7 @@  enum
   PREFIX_VEX_0F3848_X86_64_L_0_W_0,
   PREFIX_VEX_0F3849_X86_64_L_0_W_0_M_0,
   PREFIX_VEX_0F3849_X86_64_L_0_W_0_M_1,
+  PREFIX_VEX_0F384A_X86_64_M_0_L_0_W_0,
   PREFIX_VEX_0F384B_X86_64_L_0_W_0,
   PREFIX_VEX_0F3850_W_0,
   PREFIX_VEX_0F3851_W_0,
@@ -1160,6 +1164,8 @@  enum
   PREFIX_VEX_0F38F6_L_0,
   PREFIX_VEX_0F38F7_L_0,
   PREFIX_VEX_0F3AF0_L_0,
+  PREFIX_VEX_MAP5_F8_X86_64_M_0_L_0_W_0,
+  PREFIX_VEX_MAP5_F9_X86_64_M_0_L_0_W_0,
   PREFIX_VEX_MAP5_FD_X86_64_L_0_W_0,
   PREFIX_VEX_MAP7_F6_L_0_W_0_R_0_X86_64,
   PREFIX_VEX_MAP7_F8_L_0_W_0_R_0_X86_64,
@@ -1358,6 +1364,7 @@  enum
 
   X86_64_VEX_0F3848,
   X86_64_VEX_0F3849,
+  X86_64_VEX_0F384A,
   X86_64_VEX_0F384B,
   X86_64_VEX_0F385C,
   X86_64_VEX_0F385E,
@@ -1368,6 +1375,8 @@  enum
   X86_64_VEX_0F386F,
   X86_64_VEX_0F38Ex,
 
+  X86_64_VEX_MAP5_F8,
+  X86_64_VEX_MAP5_F9,
   X86_64_VEX_MAP5_FD,
   X86_64_VEX_MAP7_F6_L_0_W_0_R_0,
   X86_64_VEX_MAP7_F8_L_0_W_0_R_0,
@@ -1453,6 +1462,7 @@  enum
   VEX_LEN_0F3841,
   VEX_LEN_0F3848_X86_64,
   VEX_LEN_0F3849_X86_64,
+  VEX_LEN_0F384A_X86_64_M_0,
   VEX_LEN_0F384B_X86_64,
   VEX_LEN_0F385A,
   VEX_LEN_0F385C_X86_64,
@@ -1500,6 +1510,8 @@  enum
   VEX_LEN_0F3ADE_W_0,
   VEX_LEN_0F3ADF,
   VEX_LEN_0F3AF0,
+  VEX_LEN_MAP5_F8_X86_64_M_0,
+  VEX_LEN_MAP5_F9_X86_64_M_0,
   VEX_LEN_MAP5_FD_X86_64,
   VEX_LEN_MAP7_F6,
   VEX_LEN_MAP7_F8,
@@ -1630,6 +1642,7 @@  enum
   VEX_W_0F3846,
   VEX_W_0F3848_X86_64_L_0,
   VEX_W_0F3849_X86_64_L_0,
+  VEX_W_0F384A_X86_64_M_0_L_0,
   VEX_W_0F384B_X86_64_L_0,
   VEX_W_0F3850,
   VEX_W_0F3851,
@@ -1677,6 +1690,8 @@  enum
   VEX_W_0F3ACE,
   VEX_W_0F3ACF,
   VEX_W_0F3ADE,
+  VEX_W_MAP5_F8_X86_64_M_0_L_0,
+  VEX_W_MAP5_F9_X86_64_M_0_L_0,
   VEX_W_MAP5_FD_X86_64_L_0,
   VEX_W_MAP7_F6_L_0,
   VEX_W_MAP7_F8_L_0,
@@ -4118,6 +4133,14 @@  static const struct dis386 prefix_table[][4] = {
     { RM_TABLE (RM_VEX_0F3849_X86_64_L_0_W_0_M_1_P_3) },
   },
 
+  /* PREFIX_VEX_0F384A_X86_64_M_0_L_0_W_0 */
+  {
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { "tileloaddrst1",	{ TMM, MVexSIBMEM }, 0 },
+    { "tileloaddrs",	{ TMM, MVexSIBMEM }, 0 },
+  },
+
   /* PREFIX_VEX_0F384B_X86_64_L_0_W_0 */
   {
     { Bad_Opcode },
@@ -4302,6 +4325,20 @@  static const struct dis386 prefix_table[][4] = {
     { "%XErorxS",		{ Gdq, Edq, Ib }, 0 },
   },
 
+  /* PREFIX_VEX_MAP5_F8_X86_64_M_0_L_0_W_0 */
+  {
+    { "t2rpntlvwz0rs",	{ TMM, MVexSIBMEM }, 0 },
+    { Bad_Opcode },
+    { "t2rpntlvwz1rs",	{ TMM, MVexSIBMEM }, 0 },
+  },
+
+  /* PREFIX_VEX_MAP5_F9_X86_64_M_0_L_0_W_0 */
+  {
+    { "t2rpntlvwz0rst1",	{ TMM, MVexSIBMEM }, 0 },
+    { Bad_Opcode },
+    { "t2rpntlvwz1rst1",	{ TMM, MVexSIBMEM }, 0 },
+  },
+
   /* PREFIX_VEX_MAP5_FD_X86_64_L_0_W_0 */
   {
     { "tdpbf8ps",	{ TMM, Rtmm, VexTmm }, 0 },
@@ -4658,6 +4695,12 @@  static const struct dis386 x86_64_table[][2] = {
     { VEX_LEN_TABLE (VEX_LEN_0F3849_X86_64) },
   },
 
+  /* X86_64_VEX_0F384A */
+  {
+    { Bad_Opcode },
+    { MOD_TABLE (MOD_VEX_0F384A_X86_64) },
+  },
+
   /* X86_64_VEX_0F384B */
   {
     { Bad_Opcode },
@@ -4712,6 +4755,18 @@  static const struct dis386 x86_64_table[][2] = {
     { "%XEcmp%CCxadd", { Mdq, Gdq, VexGdq }, PREFIX_DATA },
   },
 
+  /* X86_64_VEX_MAP5_F8 */
+  {
+    { Bad_Opcode },
+    { MOD_TABLE (MOD_VEX_MAP5_F8_X86_64) },
+  },
+
+  /* X86_64_VEX_MAP5_F9 */
+  {
+    { Bad_Opcode },
+    { MOD_TABLE (MOD_VEX_MAP5_F9_X86_64) },
+  },
+
   /* X86_64_VEX_MAP5_FD */
   {
     { Bad_Opcode },
@@ -6573,7 +6628,7 @@  static const struct dis386 vex_table[][256] = {
     /* 48 */
     { X86_64_TABLE (X86_64_VEX_0F3848) },
     { X86_64_TABLE (X86_64_VEX_0F3849) },
-    { Bad_Opcode },
+    { X86_64_TABLE (X86_64_VEX_0F384A) },
     { X86_64_TABLE (X86_64_VEX_0F384B) },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -7351,8 +7406,8 @@  static const struct dis386 vex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* f8 */
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_TABLE (X86_64_VEX_MAP5_F8) },
+    { X86_64_TABLE (X86_64_VEX_MAP5_F9) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -7552,6 +7607,11 @@  static const struct dis386 vex_len_table[][2] = {
     { VEX_W_TABLE (VEX_W_0F3849_X86_64_L_0) },
   },
 
+  /* VEX_LEN_0F384A_X86_64_M_0 */
+  {
+    { VEX_W_TABLE (VEX_W_0F384A_X86_64_M_0_L_0) },
+  },
+
   /* VEX_LEN_0F384B_X86_64 */
   {
     { VEX_W_TABLE (VEX_W_0F384B_X86_64_L_0) },
@@ -7799,6 +7859,16 @@  static const struct dis386 vex_len_table[][2] = {
     { PREFIX_TABLE (PREFIX_VEX_0F3AF0_L_0) },
   },
 
+  /* VEX_LEN_MAP5_F8_X86_64_M_0 */
+  {
+    { VEX_W_TABLE (VEX_W_MAP5_F8_X86_64_M_0_L_0) },
+  },
+
+  /* VEX_LEN_MAP5_F9_X86_64_M_0 */
+  {
+    { VEX_W_TABLE (VEX_W_MAP5_F9_X86_64_M_0_L_0) },
+  },
+
   /* VEX_LEN_MAP5_FD_X86_64 */
   {
     { VEX_W_TABLE (VEX_W_MAP5_FD_X86_64_L_0) },
@@ -8246,6 +8316,10 @@  static const struct dis386 vex_w_table[][2] = {
     /* VEX_W_0F3849_X86_64_L_0 */
     { MOD_TABLE (MOD_VEX_0F3849_X86_64_L_0_W_0) },
   },
+  {
+    /* VEX_W_0F384A_X86_64_M_0_L_0 */
+    { PREFIX_TABLE (PREFIX_VEX_0F384A_X86_64_M_0_L_0_W_0) },
+  },
   {
     /* VEX_W_0F384B_X86_64_L_0 */
     { PREFIX_TABLE (PREFIX_VEX_0F384B_X86_64_L_0_W_0) },
@@ -8440,6 +8514,14 @@  static const struct dis386 vex_w_table[][2] = {
     /* VEX_W_0F3ADE */
     { VEX_LEN_TABLE (VEX_LEN_0F3ADE_W_0) },
   },
+  {
+    /* VEX_W_MAP5_F8_X86_64_M_0_L_0 */
+    { PREFIX_TABLE (PREFIX_VEX_MAP5_F8_X86_64_M_0_L_0_W_0) },
+  },
+  {
+    /* VEX_W_MAP5_F9_X86_64_M_0_L_0 */
+    { PREFIX_TABLE (PREFIX_VEX_MAP5_F9_X86_64_M_0_L_0_W_0) },
+  },
   {
     /* VEX_W_MAP5_FD_X86_64 */
     { PREFIX_TABLE (PREFIX_VEX_MAP5_FD_X86_64_L_0_W_0) },
@@ -8804,6 +8886,10 @@  static const struct dis386 mod_table[][2] = {
     { PREFIX_TABLE (PREFIX_VEX_0F3849_X86_64_L_0_W_0_M_0) },
     { PREFIX_TABLE (PREFIX_VEX_0F3849_X86_64_L_0_W_0_M_1) },
   },
+  {
+    /* MOD_VEX_0F384A_X86_64 */
+    { VEX_LEN_TABLE (VEX_LEN_0F384A_X86_64_M_0) },
+  },
   {
     /* MOD_VEX_0F386E_X86_64 */
     { VEX_LEN_TABLE (VEX_LEN_0F386E_X86_64_M_0) },
@@ -8812,6 +8898,14 @@  static const struct dis386 mod_table[][2] = {
     /* MOD_VEX_0F386F_X86_64 */
     { VEX_LEN_TABLE (VEX_LEN_0F386F_X86_64_M_0) },
   },
+  {
+    /* MOD_VEX_MAP5_F8_X86_64 */
+    { VEX_LEN_TABLE (VEX_LEN_MAP5_F8_X86_64_M_0) },
+  },
+  {
+    /* MOD_VEX_MAP5_F9_X86_64 */
+    { VEX_LEN_TABLE (VEX_LEN_MAP5_F9_X86_64_M_0) },
+  },
 
 #include "i386-dis-evex-mod.h"
 };
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index 965eca2e640..17073bec401 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -271,6 +271,8 @@  static const dependency isa_dependencies[] =
     "AMX_TILE" },
   { "AMX_FP8",
     "AMX_TILE" },
+  { "AMX_MOVRS",
+    "AMX_TILE" },
   { "KL",
     "SSE2" },
   { "WIDEKL",
@@ -441,6 +443,7 @@  static bitfield cpu_flags[] =
   BITFIELD (AMX_AVX512),
   BITFIELD (AMX_TF32),
   BITFIELD (AMX_FP8),
+  BITFIELD (AMX_MOVRS),
   BITFIELD (AMX_TILE),
   BITFIELD (MOVDIRI),
   BITFIELD (MOVDIR64B),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index a16ec56d355..27de44908cc 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -258,6 +258,8 @@  enum i386_cpu
   CpuAMX_TF32,
   /* AMX-FP8 instructions required */
   CpuAMX_FP8,
+  /* Intel AMX-MOVRS Instructions support required.  */
+  CpuAMX_MOVRS,
   /* AMX-TILE instructions required */
   CpuAMX_TILE,
   /* GFNI instructions required */
@@ -509,6 +511,7 @@  typedef union i386_cpu_flags
       unsigned int cpuamx_avx512:1;
       unsigned int cpuamx_tf32:1;
       unsigned int cpuamx_fp8:1;
+      unsigned int cpuamx_movrs:1;
       unsigned int cpuamx_tile:1;
       unsigned int cpugfni:1;
       unsigned int cpuvaes:1;
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 22c5653c313..4dc08e1e105 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -3197,6 +3197,11 @@  t2rpntlvwz0t1, 0x6f, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW
 t2rpntlvwz1, 0x666e, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
 t2rpntlvwz1t1, 0x666f, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
 
+t2rpntlvwz0rs, 0xf8, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+t2rpntlvwz0rst1, 0xf9, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+t2rpntlvwz1rs, 0x66f8, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+t2rpntlvwz1rst1, 0x66f9, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+
 tcmmimfp16ps, 0x666c, AMX_COMPLEX, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM }
 tcmmrlfp16ps, 0x6c, AMX_COMPLEX, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM }
 
@@ -3230,6 +3235,8 @@  tdphf8ps, 0x66fd, AMX_FP8, Modrm|Vex128|xVexMap5|Src2VVVV|VexW0|NoSuf, { RegTMM,
 
 tileloadd, 0xf24b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
 tileloaddt1, 0x664b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+tileloaddrs, 0xf24a, AMX_MOVRS, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+tileloaddrst1, 0x664a, AMX_MOVRS, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
 tilemovrow, 0x664a, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
 tilemovrow, 0x6607, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
 tilestored, 0xf34b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { RegTMM, Unspecified|BaseIndex }