[v2] Support Intel AMX-AVX512

Message ID 20250103025029.1909253-1-haochen.jiang@intel.com
State New

Checks

Context Check Description
linaro-tcwg-bot/tcwg_binutils_build--master-arm fail Patch failed to apply
linaro-tcwg-bot/tcwg_binutils_build--master-aarch64 fail Patch failed to apply

Commit Message

Jiang, Haochen Jan. 3, 2025, 2:50 a.m. UTC
  Hi all,

Although there are still open issues in the AMX-AVX512 encoding, I would
like to send out the v2 patch with the current encodings first, since
other parts of this patch also need to be reviewed.

The patch description and changes are embedded below.

The encoding issue I mentioned previously is on tcvtrowps2[bf16,ph][h,l].
The Reg32 forms are fine. The Imm8 forms, however, are split across two
opcodes under the current HW design, which is not ideal. Due to the
Christmas/New Year holidays, the answer on whether this can be changed is
still pending. I will send an update as soon as I get the answer.
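
To make the split concrete, here is a minimal sketch (not part of the patch) that restates the opcode assignments from the prefix tables further down as plain lookup tables. The type and function names (`opcode_row`, `lookup_form`, `opcode_for`) are hypothetical and exist only for this illustration; the table contents are copied from the i386-dis-evex-prefix.h hunks for the 512-bit forms.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mandatory-prefix slots in the order the disassembler's prefix tables
   use them: none, F3, 66, F2.  */
enum pfx_slot { PFX_NONE, PFX_F3, PFX_66, PFX_F2, PFX_SLOTS };

/* One row per opcode byte; a NULL entry is an unassigned slot.
   Hypothetical structure, contents mirror the tables in this patch.  */
struct opcode_row
{
  unsigned char opcode;
  const char *mnemonic[PFX_SLOTS];
};

/* Reg32 forms, map 0F38.  */
static const struct opcode_row reg32_forms[] = {
  { 0x4a, { NULL, "tcvtrowd2ps", "tilemovrow", NULL } },
  { 0x6d, { "tcvtrowps2phh", "tcvtrowps2bf16l",
            "tcvtrowps2phl", "tcvtrowps2bf16h" } },
};

/* Imm8 forms, map 0F3A.  */
static const struct opcode_row imm8_forms[] = {
  { 0x07, { "tcvtrowps2phh", "tcvtrowd2ps",
            "tilemovrow", "tcvtrowps2bf16h" } },
  { 0x77, { NULL, "tcvtrowps2bf16l", NULL, "tcvtrowps2phl" } },
};

/* Return the mnemonic for OPCODE with prefix slot PFX, or NULL.  */
static const char *
lookup_form (const struct opcode_row *rows, size_t n_rows,
             unsigned char opcode, enum pfx_slot pfx)
{
  assert (pfx < PFX_SLOTS);
  for (size_t i = 0; i < n_rows; i++)
    if (rows[i].opcode == opcode)
      return rows[i].mnemonic[pfx];
  return NULL;
}

/* Return the opcode byte carrying MNEMONIC, or -1 if it is not listed.  */
static int
opcode_for (const struct opcode_row *rows, size_t n_rows,
            const char *mnemonic)
{
  for (size_t i = 0; i < n_rows; i++)
    for (int p = 0; p < PFX_SLOTS; p++)
      if (rows[i].mnemonic[p] != NULL
          && strcmp (rows[i].mnemonic[p], mnemonic) == 0)
        return rows[i].opcode;
  return -1;
}
```

With these tables, all four tcvtrowps2* register forms resolve to opcode 0x6d, while the Imm8 forms resolve to 0x07 (phh, bf16h) and 0x77 (phl, bf16l).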

Thx,
Haochen

---

Changes in v2:

  - Pull all GPR modes out of the VEX length switch in OP_VEX to make
    it more general.
  - Remove invalid test for 32-bit.
  - Reuse VexGdq for operands.
  - Update the mnemonics from tcvtrowps2pbf16[h,l] to tcvtrowps2bf16[h,l]
    according to ISE056.

---

This patch adds support for AMX-AVX512. In the disassembler, all GPR
modes are pulled out of the VEX length switch in OP_VEX to make it more
general.

---

gas/ChangeLog:

	* config/tc-i386.c: Add amx_avx512.
	* doc/c-i386.texi: Document .amx_avx512.
	* testsuite/gas/i386/x86-64.exp: Run AMX-AVX512 tests.
	* testsuite/gas/i386/x86-64-amx-avx512-intel.d: New test.
	* testsuite/gas/i386/x86-64-amx-avx512.d: Ditto.
	* testsuite/gas/i386/x86-64-amx-avx512.s: Ditto.

opcodes/ChangeLog:

	* i386-dis-evex-len.h: Add EVEX_LEN_0F384A_X86_64_W_0,
	EVEX_LEN_0F386D_X86_64_W_0, EVEX_LEN_0F3A07_X86_64_W_0,
	EVEX_LEN_0F3A77_X86_64_W_0.
	* i386-dis-evex-prefix.h: Add PREFIX_EVEX_0F384A_X86_64_W_0_L_2,
	PREFIX_EVEX_0F386D_X86_64_W_0_L_2, PREFIX_EVEX_0F3A07_X86_64_W_0_L_2,
	PREFIX_EVEX_0F3A77_X86_64_W_0_L_2.
	* i386-dis-evex-w.h: Add EVEX_W_0F384A_X86_64, EVEX_W_0F386D_X86_64,
	EVEX_W_0F3A07_X86_64, EVEX_W_0F3A77_X86_64.
	* i386-dis-evex-x86-64.h: Add X86_64_EVEX_0F384A, X86_64_EVEX_0F386D,
	X86_64_EVEX_0F3A07, X86_64_EVEX_0F3A77.
	* i386-dis-evex.h: Ditto.
	* i386-dis.c (EVEX_LEN_0F384A_X86_64_W_0): New.
	(EVEX_LEN_0F386D_X86_64_W_0): Ditto.
	(EVEX_LEN_0F3A07_X86_64_W_0): Ditto.
	(EVEX_LEN_0F3A77_X86_64_W_0): Ditto.
	(MOD_EVEX_0F384A_X86_64_W_0): Ditto.
	(MOD_EVEX_0F386D_X86_64_W_0): Ditto.
	(MOD_EVEX_0F3A07_X86_64_W_0): Ditto.
	(MOD_EVEX_0F3A77_X86_64_W_0): Ditto.
	(PREFIX_EVEX_0F384A_X86_64_W_0_L_2): Ditto.
	(PREFIX_EVEX_0F386D_X86_64_W_0_L_2): Ditto.
	(PREFIX_EVEX_0F3A07_X86_64_W_0_L_2): Ditto.
	(PREFIX_EVEX_0F3A77_X86_64_W_0_L_2): Ditto.
	(EVEX_W_0F384A_X86_64): Ditto.
	(EVEX_W_0F386D_X86_64): Ditto.
	(EVEX_W_0F3A07_X86_64): Ditto.
	(EVEX_W_0F3A77_X86_64): Ditto.
	(X86_64_EVEX_0F384A): Ditto.
	(X86_64_EVEX_0F386D): Ditto.
	(X86_64_EVEX_0F3A07): Ditto.
	(X86_64_EVEX_0F3A77): Ditto.
	(OP_VEX): Pull all GPR modes out of the vector length switch.
	* i386-gen.c (isa_dependencies): Add AMX-AVX512.
	(cpu_flags): Ditto.
	* i386-init.h: Regenerated.
	* i386-mnem.h: Ditto.
	* i386-opc.h (CpuAMX_AVX512): New.
	(i386_cpu_flags): Add cpuamx_avx512.
	* i386-opc.tbl: Add AMX-AVX512 instructions.
	* i386-tbl.h: Regenerated.
---
 gas/config/tc-i386.c                          |    1 +
 gas/doc/c-i386.texi                           |    4 +-
 .../gas/i386/x86-64-amx-avx512-intel.d        |   35 +
 gas/testsuite/gas/i386/x86-64-amx-avx512.d    |   34 +
 gas/testsuite/gas/i386/x86-64-amx-avx512.s    |   55 +
 gas/testsuite/gas/i386/x86-64.exp             |    2 +
 opcodes/i386-dis-evex-len.h                   |   23 +
 opcodes/i386-dis-evex-prefix.h                |   27 +
 opcodes/i386-dis-evex-w.h                     |   12 +
 opcodes/i386-dis-evex-x86-64.h                |   15 +
 opcodes/i386-dis-evex.h                       |    6 +-
 opcodes/i386-dis.c                            |   52 +-
 opcodes/i386-gen.c                            |    3 +
 opcodes/i386-init.h                           |  718 ++---
 opcodes/i386-mnem.h                           | 2576 +++++++++--------
 opcodes/i386-opc.h                            |    3 +
 opcodes/i386-opc.tbl                          |   15 +
 opcodes/i386-tbl.h                            |  415 ++-
 18 files changed, 2210 insertions(+), 1786 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-avx512-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-avx512.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-avx512.s
  

Comments

Jan Beulich Jan. 6, 2025, 4:29 p.m. UTC | #1
On 03.01.2025 03:50, Haochen Jiang wrote:
> Hi all,
> 
> Although there are still open issues in the AMX-AVX512 encoding, I would like
> to first send out the v2 patch with the current encodings since in
> this patch, there are also other parts need to be reviewed.
> 
> Patch description and changes are embedded following.
> 
> The encoding issue I mentioned previously is on tcvtrowps2[bf16,ph][h,l].
> For Reg32 part, it is ok. However, for Imm8 part, under current HW design,
> it is split to two opcodes. It is not an ideal design. Due to Christmas/
> New Year Holiday, the answer for whether it could be changed is still
> delayed for now. I will update that as soon as I get the answer.
> 
> Thx,
> Haochen
> 
> ---
> 
> Changes in v2:
> 
>   - Pull out all GPR mode out of vex length switch in OP_VEX to make
>     it more general.
>   - Remove invalid test for 32-bit.
>   - Reuse VexGdq for operands.
>   - Update the mnemonics from tcvtrowps2pbf16[h,l] to tcvtrowps2bf16[h,l]
>     according to ISE056.
> 
> ---
> 
> This patch will support AMX-AVX512. In disassembler, we pull out all
> GPR mode out of the vex length switch to make it more general.
> 
> ---
> 
> gas/ChangeLog:
> 
> 	* config/tc-i386.c: Add amx_avx512.
> 	* doc/c-i386.texi: Document .amx_avx512.
> 	* testsuite/gas/i386/x86-64.exp: Run AMX-AVX512 tests.
> 	* testsuite/gas/i386/x86-64-amx-avx512-intel.d: New test.
> 	* testsuite/gas/i386/x86-64-amx-avx512.d: Ditto.
> 	* testsuite/gas/i386/x86-64-amx-avx512.s: Ditto.
> 
> opcodes/ChangeLog:
> 
> 	* i386-dis-evex-len.h: Add EVEX_LEN_0F384A_X86_64_W_0,
> 	EVEX_LEN_0F386D_X86_64_W_0, EVEX_LEN_0F3A07_X86_64_W_0,
> 	EVEX_LEN_0F3A77_X86_64_W_0.
> 	* i386-dis-evex-prefix.h: Add PREFIX_EVEX_0F384A_W_0_L_2,
> 	PREFIX_EVEX_0F386D_W_0_L_2, PREFIX_EVEX_0F3A07_W_0_L_2,
> 	PREFIX_EVEX_0F3A77_W_0_L_2.
> 	* i386-dis-evex-w.h: Add EVEX_W_0F384A_X86_64, EVEX_W_0F386D_X86_64,
> 	EVEX_W_0F3A07_X86_64, EVEX_W_0F3A77_X86_64.
> 	* i386-dis-evex-x86-64.h: Add X86_64_EVEX_0F384A, X86_64_EVEX_0F386D,
> 	X86_64_EVEX_0F3A07, X86_64_EVEX_0F3A77.
> 	* i386-dis-evex.h: Ditto.
> 	* i386-dis.c (EVEX_LEN_0F384A_X86_64_W_0): New.
> 	(EVEX_LEN_0F386D_X86_64_W_0): Ditto.
> 	(EVEX_LEN_0F3A07_X86_64_W_0): Ditto.
> 	(EVEX_LEN_0F3A77_X86_64_W_0): Ditto.
> 	(MOD_EVEX_0F384A_X86_64_W_0): Ditto.
> 	(MOD_EVEX_0F386D_X86_64_W_0): Ditto.
> 	(MOD_EVEX_0F3A07_X86_64_W_0): Ditto.
> 	(MOD_EVEX_0F3A77_X86_64_W_0): Ditto.
> 	(PREFIX_EVEX_0F384A_W_0_L_2): Ditto.
> 	(PREFIX_EVEX_0F386D_W_0_L_2): Ditto.
> 	(PREFIX_EVEX_0F3A07_W_0_L_2): Ditto.
> 	(PREFIX_EVEX_0F3A77_W_0_L_2): Ditto.
> 	(EVEX_W_0F384A_X86_64): Ditto.
> 	(EVEX_W_0F386D_X86_64): Ditto.
> 	(EVEX_W_0F3A07_X86_64): Ditto.
> 	(EVEX_W_0F3A77_X86_64): Ditto.
> 	(X86_64_EVEX_0F384A): Ditto.
> 	(X86_64_EVEX_0F386D): Ditto.
> 	(X86_64_EVEX_0F3A07): Ditto.
> 	(X86_64_EVEX_0F3A77): Ditto.
> 	(OP_VEX): Pull out all GPR mode out of the vector length switch.
> 	* i386-gen.c (isa_dependencies): Add AMX-AVX512.
> 	(cpu_flags): Ditto.
> 	* i386-init.h: Regenerated.
> 	* i386-mnem.h: Ditto.
> 	* i386-opc.h (CpuAMX_AVX512): New.
> 	(i386_cpu_flags): Add cpuamx_avx512.
> 	* i386-opc.tbl: Add AMX-AVX512 instructions.
> 	* i386-tbl.h: Regenerated.
> ---
>  gas/config/tc-i386.c                          |    1 +
>  gas/doc/c-i386.texi                           |    4 +-
>  .../gas/i386/x86-64-amx-avx512-intel.d        |   35 +
>  gas/testsuite/gas/i386/x86-64-amx-avx512.d    |   34 +
>  gas/testsuite/gas/i386/x86-64-amx-avx512.s    |   55 +
>  gas/testsuite/gas/i386/x86-64.exp             |    2 +
>  opcodes/i386-dis-evex-len.h                   |   23 +
>  opcodes/i386-dis-evex-prefix.h                |   27 +
>  opcodes/i386-dis-evex-w.h                     |   12 +
>  opcodes/i386-dis-evex-x86-64.h                |   15 +
>  opcodes/i386-dis-evex.h                       |    6 +-
>  opcodes/i386-dis.c                            |   52 +-
>  opcodes/i386-gen.c                            |    3 +
>  opcodes/i386-init.h                           |  718 ++---
>  opcodes/i386-mnem.h                           | 2576 +++++++++--------
>  opcodes/i386-opc.h                            |    3 +
>  opcodes/i386-opc.tbl                          |   15 +
>  opcodes/i386-tbl.h                            |  415 ++-
>  18 files changed, 2210 insertions(+), 1786 deletions(-)
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-avx512-intel.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-avx512.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-amx-avx512.s

It is again nowhere in the patch metadata that you put down what other
non-upstream patch(es) this one goes on top of. This is important info
for a reviewer. Since this isn't the first time, let me make it quite
clear: Going forward I may simply refuse to review (reject) patches
with unclear dependencies.

> @@ -14070,6 +14083,29 @@ OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
>        return true;
>      }
>  
> +  switch (bytemode)
> +    {
> +      case v_mode:
> +      case dq_mode:
> +	if (ins->rex & REX_W)
> +	  names = att_names64;
> +	else if (bytemode == v_mode
> +		  && !(sizeflag & DFLAG))
> +	  names = att_names16;
> +	else
> +	  names = att_names32;
> +	oappend_register (ins, names[reg]);
> +	return true;
> +      case b_mode:
> +	names = att_names8rex;
> +	oappend_register (ins, names[reg]);
> +	return true;
> +      case q_mode:
> +	names = att_names64;
> +	oappend_register (ins, names[reg]);
> +	return true;
> +    }

I think there are two ways to improve legibility here: Either pull out the
call to oappend_register() (and the return), or avoid using the "names"
local var in the latter two cases.
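
A minimal sketch of the first option (one shared lookup and return after the switch), assuming stand-in definitions: the mode constants, the four-entry name tables, and the helper `gpr_name` are hypothetical and only mimic the shape of the real ones in i386-dis.c. It returns the name instead of calling oappend_register() so that it can be compiled in isolation.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the real definitions in i386-dis.c (assumed values).  */
enum { b_mode, v_mode, dq_mode, q_mode, x_mode };
#define REX_W 8
#define DFLAG 1

/* Truncated name tables, for illustration only.  */
static const char *const names64[] = { "%rax", "%rcx", "%rdx", "%rbx" };
static const char *const names32[] = { "%eax", "%ecx", "%edx", "%ebx" };
static const char *const names16[] = { "%ax", "%cx", "%dx", "%bx" };
static const char *const names8rex[] = { "%al", "%cl", "%dl", "%bl" };

/* Pick the register name for a GPR bytemode.  Returns NULL when BYTEMODE
   is not a GPR mode, in which case the caller would fall through to the
   vector length switch.  */
static const char *
gpr_name (int bytemode, unsigned int rex, int sizeflag, unsigned int reg)
{
  const char *const *names;

  assert (reg < 4);   /* The tables above only hold four entries.  */

  switch (bytemode)
    {
    case v_mode:
    case dq_mode:
      if (rex & REX_W)
        names = names64;
      else if (bytemode == v_mode && !(sizeflag & DFLAG))
        names = names16;
      else
        names = names32;
      break;
    case b_mode:
      names = names8rex;
      break;
    case q_mode:
      names = names64;
      break;
    default:
      return NULL;
    }

  /* Single exit for all GPR modes; in OP_VEX this is where the one
     oappend_register() call and "return true" would go.  */
  return names[reg];
}
```

The second option would instead keep the three returns but index att_names8rex and att_names64 directly in the b_mode and q_mode cases, without going through the local variable.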

> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -3243,6 +3243,21 @@ t2rpntlvw<z>rs<loc>, 0x<z:pfx>f8 | <loc:opc>, AMX_TRANSPOSE&APX_F(AMX_MOVRS), Si
>  tileloaddrs, 0xf24a, APX_F(AMX_MOVRS), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
>  tileloaddrst1, 0x664a, APX_F(AMX_MOVRS), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
>  
> +tcvtrowd2ps, 0xf34a, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tcvtrowd2ps, 0xf307, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
> +
> +tcvtrowps2bf16h, 0xf26d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tcvtrowps2bf16h, 0xf207, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
> +tcvtrowps2bf16l, 0xf36d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tcvtrowps2bf16l, 0xf377, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
> +tcvtrowps2phh, 0x6d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tcvtrowps2phh, 0x07, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
> +tcvtrowps2phl, 0x666d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tcvtrowps2phl, 0xf277, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
> +
> +tilemovrow, 0x664a, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> +tilemovrow, 0x6607, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }

Just to double check: The AVX10.2/256 case really is of no interest for this
feature / these insns, and your designers would rather introduce yet another
CPUID flag in case it turned out desirable later on?

Jan
  
Jiang, Haochen Jan. 7, 2025, 2:50 a.m. UTC | #2
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Tuesday, January 7, 2025 12:29 AM
> 
> On 03.01.2025 03:50, Haochen Jiang wrote:
> > Hi all,
> >
> It is again nowhere in the patch metadata that you put down what other
> non-upstream patch(es) this one goes on top of. This is important info
> for a reviewer. Since this isn't the first time, let me make it quite
> clear: Going forward I may simply refuse to review (reject) patches
> with unclear dependencies.

Let me clarify why it was sent out this way. The patch was originally part
of the same series as the previous patches (some of which are upstream, some
of which are not yet approved). Due to the encoding issue, it got split out
of the series during review.

Maybe I should state the dependencies explicitly to avoid that inconvenience,
or send the patches as one series.

> 
> > @@ -14070,6 +14083,29 @@ OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
> >        return true;
> >      }
> >
> > +  switch (bytemode)
> > +    {
> > +      case v_mode:
> > +      case dq_mode:
> > +	if (ins->rex & REX_W)
> > +	  names = att_names64;
> > +	else if (bytemode == v_mode
> > +		  && !(sizeflag & DFLAG))
> > +	  names = att_names16;
> > +	else
> > +	  names = att_names32;
> > +	oappend_register (ins, names[reg]);
> > +	return true;
> > +      case b_mode:
> > +	names = att_names8rex;
> > +	oappend_register (ins, names[reg]);
> > +	return true;
> > +      case q_mode:
> > +	names = att_names64;
> > +	oappend_register (ins, names[reg]);
> > +	return true;
> > +    }
> 
> I think there are two ways to improve legibility here: Either pull out the call to
> oappend_register() (and the return), or avoid using the "names"
> local var in the latter two cases.

Let me have a try.

> 
> > +tilemovrow, 0x664a, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
> > +tilemovrow, 0x6607, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
> 
> Just to double check: The AVX10.2/256 case really is of no interest for this
> feature / these insns, and your designers would rather introduce yet another
> CPUID flag in case it turned out desirable later on?

Yes, they have nothing to do with AVX10.2/256.

Thx,
Haochen
  

Patch

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index fb347ced16d..2d4fe94e4ec 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1186,6 +1186,7 @@  static const arch_entry cpu_arch[] =
   SUBARCH (amx_tf32, AMX_TF32, ANY_AMX_TF32, false),
   SUBARCH (amx_fp8, AMX_FP8, ANY_AMX_FP8, false),
   SUBARCH (amx_movrs, AMX_MOVRS, ANY_AMX_MOVRS, false),
+  SUBARCH (amx_avx512, AMX_AVX512, ANY_AMX_AVX512, false),
   SUBARCH (amx_tile, AMX_TILE, ANY_AMX_TILE, false),
   SUBARCH (movdiri, MOVDIRI, MOVDIRI, false),
   SUBARCH (movdir64b, MOVDIR64B, MOVDIR64B, false),
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
index b4b17b96ab2..bfd15b1ac10 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -233,6 +233,7 @@  accept various extension mnemonics.  For example,
 @code{amx_tf32},
 @code{amx_fp8}
 @code{amx_movrs},
+@code{amx_avx512},
 @code{amx_tile},
 @code{vmx},
 @code{vmfunc},
@@ -1707,7 +1708,8 @@  supported on the CPU specified.  The choices for @var{cpu_type} are:
 @item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk}
 @item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16}
 @item @samp{.amx_complex} @tab @samp{.amx_transpose} @tab @samp{.amx_tf32}
-@item @samp{.amx_fp8} @tab @samp{.amx_movrs} @tab @samp{.amx_tile}
+@item @samp{.amx_fp8} @tab @samp{.amx_movrs} @tab @samp{.amx_avx512}
+@item @samp{.amx_tile}
 @item @samp{.kl} @tab @samp{.widekl} @tab @samp{.uintr} @tab @samp{.hreset}
 @item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5}
 @item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme}
diff --git a/gas/testsuite/gas/i386/x86-64-amx-avx512-intel.d b/gas/testsuite/gas/i386/x86-64-amx-avx512-intel.d
new file mode 100644
index 00000000000..33e6d01a558
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-avx512-intel.d
@@ -0,0 +1,35 @@ 
+#objdump: -dw -Mintel
+#name: x86_64 AMX-AVX512 insns (Intel disassembly)
+#source: x86-64-amx-avx512.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+#...
+[a-f0-9]+ <_intel>:
+\s*[a-f0-9]+:\s*62 62 6e 48 4a f5\s+tcvtrowd2ps zmm30,tmm5,edx
+\s*[a-f0-9]+:\s*62 62 6e 48 4a f2\s+tcvtrowd2ps zmm30,tmm2,edx
+\s*[a-f0-9]+:\s*62 63 7e 48 07 f5 7b\s+tcvtrowd2ps zmm30,tmm5,0x7b
+\s*[a-f0-9]+:\s*62 63 7e 48 07 f2 7b\s+tcvtrowd2ps zmm30,tmm2,0x7b
+\s*[a-f0-9]+:\s*62 62 6f 48 6d f5\s+tcvtrowps2bf16h zmm30,tmm5,edx
+\s*[a-f0-9]+:\s*62 62 6f 48 6d f2\s+tcvtrowps2bf16h zmm30,tmm2,edx
+\s*[a-f0-9]+:\s*62 63 7f 48 07 f5 7b\s+tcvtrowps2bf16h zmm30,tmm5,0x7b
+\s*[a-f0-9]+:\s*62 63 7f 48 07 f2 7b\s+tcvtrowps2bf16h zmm30,tmm2,0x7b
+\s*[a-f0-9]+:\s*62 62 6e 48 6d f5\s+tcvtrowps2bf16l zmm30,tmm5,edx
+\s*[a-f0-9]+:\s*62 62 6e 48 6d f2\s+tcvtrowps2bf16l zmm30,tmm2,edx
+\s*[a-f0-9]+:\s*62 63 7e 48 77 f5 7b\s+tcvtrowps2bf16l zmm30,tmm5,0x7b
+\s*[a-f0-9]+:\s*62 63 7e 48 77 f2 7b\s+tcvtrowps2bf16l zmm30,tmm2,0x7b
+\s*[a-f0-9]+:\s*62 62 6c 48 6d f5\s+tcvtrowps2phh zmm30,tmm5,edx
+\s*[a-f0-9]+:\s*62 62 6c 48 6d f2\s+tcvtrowps2phh zmm30,tmm2,edx
+\s*[a-f0-9]+:\s*62 63 7c 48 07 f5 7b\s+tcvtrowps2phh zmm30,tmm5,0x7b
+\s*[a-f0-9]+:\s*62 63 7c 48 07 f2 7b\s+tcvtrowps2phh zmm30,tmm2,0x7b
+\s*[a-f0-9]+:\s*62 62 6d 48 6d f5\s+tcvtrowps2phl zmm30,tmm5,edx
+\s*[a-f0-9]+:\s*62 62 6d 48 6d f2\s+tcvtrowps2phl zmm30,tmm2,edx
+\s*[a-f0-9]+:\s*62 63 7f 48 77 f5 7b\s+tcvtrowps2phl zmm30,tmm5,0x7b
+\s*[a-f0-9]+:\s*62 63 7f 48 77 f2 7b\s+tcvtrowps2phl zmm30,tmm2,0x7b
+\s*[a-f0-9]+:\s*62 62 6d 48 4a f5\s+tilemovrow zmm30,tmm5,edx
+\s*[a-f0-9]+:\s*62 62 6d 48 4a f2\s+tilemovrow zmm30,tmm2,edx
+\s*[a-f0-9]+:\s*62 63 7d 48 07 f5 7b\s+tilemovrow zmm30,tmm5,0x7b
+\s*[a-f0-9]+:\s*62 63 7d 48 07 f2 7b\s+tilemovrow zmm30,tmm2,0x7b
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-amx-avx512.d b/gas/testsuite/gas/i386/x86-64-amx-avx512.d
new file mode 100644
index 00000000000..d2f8ac6e51e
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-avx512.d
@@ -0,0 +1,34 @@ 
+#objdump: -dw
+#name: x86_64 AMX-AVX512 insns
+#source: x86-64-amx-avx512.s
+
+.*: +file format .*
+
+Disassembly of section \.text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*62 62 6e 48 4a f5\s+tcvtrowd2ps %edx,%tmm5,%zmm30
+\s*[a-f0-9]+:\s*62 62 6e 48 4a f2\s+tcvtrowd2ps %edx,%tmm2,%zmm30
+\s*[a-f0-9]+:\s*62 63 7e 48 07 f5 7b\s+tcvtrowd2ps \$0x7b,%tmm5,%zmm30
+\s*[a-f0-9]+:\s*62 63 7e 48 07 f2 7b\s+tcvtrowd2ps \$0x7b,%tmm2,%zmm30
+\s*[a-f0-9]+:\s*62 62 6f 48 6d f5\s+tcvtrowps2bf16h %edx,%tmm5,%zmm30
+\s*[a-f0-9]+:\s*62 62 6f 48 6d f2\s+tcvtrowps2bf16h %edx,%tmm2,%zmm30
+\s*[a-f0-9]+:\s*62 63 7f 48 07 f5 7b\s+tcvtrowps2bf16h \$0x7b,%tmm5,%zmm30
+\s*[a-f0-9]+:\s*62 63 7f 48 07 f2 7b\s+tcvtrowps2bf16h \$0x7b,%tmm2,%zmm30
+\s*[a-f0-9]+:\s*62 62 6e 48 6d f5\s+tcvtrowps2bf16l %edx,%tmm5,%zmm30
+\s*[a-f0-9]+:\s*62 62 6e 48 6d f2\s+tcvtrowps2bf16l %edx,%tmm2,%zmm30
+\s*[a-f0-9]+:\s*62 63 7e 48 77 f5 7b\s+tcvtrowps2bf16l \$0x7b,%tmm5,%zmm30
+\s*[a-f0-9]+:\s*62 63 7e 48 77 f2 7b\s+tcvtrowps2bf16l \$0x7b,%tmm2,%zmm30
+\s*[a-f0-9]+:\s*62 62 6c 48 6d f5\s+tcvtrowps2phh %edx,%tmm5,%zmm30
+\s*[a-f0-9]+:\s*62 62 6c 48 6d f2\s+tcvtrowps2phh %edx,%tmm2,%zmm30
+\s*[a-f0-9]+:\s*62 63 7c 48 07 f5 7b\s+tcvtrowps2phh \$0x7b,%tmm5,%zmm30
+\s*[a-f0-9]+:\s*62 63 7c 48 07 f2 7b\s+tcvtrowps2phh \$0x7b,%tmm2,%zmm30
+\s*[a-f0-9]+:\s*62 62 6d 48 6d f5\s+tcvtrowps2phl %edx,%tmm5,%zmm30
+\s*[a-f0-9]+:\s*62 62 6d 48 6d f2\s+tcvtrowps2phl %edx,%tmm2,%zmm30
+\s*[a-f0-9]+:\s*62 63 7f 48 77 f5 7b\s+tcvtrowps2phl \$0x7b,%tmm5,%zmm30
+\s*[a-f0-9]+:\s*62 63 7f 48 77 f2 7b\s+tcvtrowps2phl \$0x7b,%tmm2,%zmm30
+\s*[a-f0-9]+:\s*62 62 6d 48 4a f5\s+tilemovrow %edx,%tmm5,%zmm30
+\s*[a-f0-9]+:\s*62 62 6d 48 4a f2\s+tilemovrow %edx,%tmm2,%zmm30
+\s*[a-f0-9]+:\s*62 63 7d 48 07 f5 7b\s+tilemovrow \$0x7b,%tmm5,%zmm30
+\s*[a-f0-9]+:\s*62 63 7d 48 07 f2 7b\s+tilemovrow \$0x7b,%tmm2,%zmm30
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-amx-avx512.s b/gas/testsuite/gas/i386/x86-64-amx-avx512.s
new file mode 100644
index 00000000000..6df493430a0
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-avx512.s
@@ -0,0 +1,55 @@ 
+# Check 64bit AMX-AVX512 instructions
+
+	.text
+_start:
+	tcvtrowd2ps	%edx, %tmm5, %zmm30
+	tcvtrowd2ps	%edx, %tmm2, %zmm30
+	tcvtrowd2ps	$123, %tmm5, %zmm30
+	tcvtrowd2ps	$123, %tmm2, %zmm30
+	tcvtrowps2bf16h	%edx, %tmm5, %zmm30
+	tcvtrowps2bf16h	%edx, %tmm2, %zmm30
+	tcvtrowps2bf16h	$123, %tmm5, %zmm30
+	tcvtrowps2bf16h	$123, %tmm2, %zmm30
+	tcvtrowps2bf16l	%edx, %tmm5, %zmm30
+	tcvtrowps2bf16l	%edx, %tmm2, %zmm30
+	tcvtrowps2bf16l	$123, %tmm5, %zmm30
+	tcvtrowps2bf16l	$123, %tmm2, %zmm30
+	tcvtrowps2phh	%edx, %tmm5, %zmm30
+	tcvtrowps2phh	%edx, %tmm2, %zmm30
+	tcvtrowps2phh	$123, %tmm5, %zmm30
+	tcvtrowps2phh	$123, %tmm2, %zmm30
+	tcvtrowps2phl	%edx, %tmm5, %zmm30
+	tcvtrowps2phl	%edx, %tmm2, %zmm30
+	tcvtrowps2phl	$123, %tmm5, %zmm30
+	tcvtrowps2phl	$123, %tmm2, %zmm30
+	tilemovrow	%edx, %tmm5, %zmm30
+	tilemovrow	%edx, %tmm2, %zmm30
+	tilemovrow	$123, %tmm5, %zmm30
+	tilemovrow	$123, %tmm2, %zmm30
+
+_intel:
+	.intel_syntax noprefix
+	tcvtrowd2ps	zmm30, tmm5, edx
+	tcvtrowd2ps	zmm30, tmm2, edx
+	tcvtrowd2ps	zmm30, tmm5, 123
+	tcvtrowd2ps	zmm30, tmm2, 123
+	tcvtrowps2bf16h	zmm30, tmm5, edx
+	tcvtrowps2bf16h	zmm30, tmm2, edx
+	tcvtrowps2bf16h	zmm30, tmm5, 123
+	tcvtrowps2bf16h	zmm30, tmm2, 123
+	tcvtrowps2bf16l	zmm30, tmm5, edx
+	tcvtrowps2bf16l	zmm30, tmm2, edx
+	tcvtrowps2bf16l	zmm30, tmm5, 123
+	tcvtrowps2bf16l	zmm30, tmm2, 123
+	tcvtrowps2phh	zmm30, tmm5, edx
+	tcvtrowps2phh	zmm30, tmm2, edx
+	tcvtrowps2phh	zmm30, tmm5, 123
+	tcvtrowps2phh	zmm30, tmm2, 123
+	tcvtrowps2phl	zmm30, tmm5, edx
+	tcvtrowps2phl	zmm30, tmm2, edx
+	tcvtrowps2phl	zmm30, tmm5, 123
+	tcvtrowps2phl	zmm30, tmm2, 123
+	tilemovrow	zmm30, tmm5, edx
+	tilemovrow	zmm30, tmm2, edx
+	tilemovrow	zmm30, tmm5, 123
+	tilemovrow	zmm30, tmm2, 123
diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
index 44a8d7c8260..edacbaa0f20 100644
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -538,6 +538,8 @@  run_dump_test "x86-64-amx-fp8-bad"
 run_dump_test "x86-64-amx-movrs"
 run_dump_test "x86-64-amx-movrs-intel"
 run_list_test "x86-64-amx-movrs-inval"
+run_dump_test "x86-64-amx-avx512"
+run_dump_test "x86-64-amx-avx512-intel"
 run_dump_test "x86-64-movrs"
 run_dump_test "x86-64-movrs-intel"
 run_dump_test "x86-64-movrs-avx10_2-512"
diff --git a/opcodes/i386-dis-evex-len.h b/opcodes/i386-dis-evex-len.h
index 2b4361f7ae6..434e051bb63 100644
--- a/opcodes/i386-dis-evex-len.h
+++ b/opcodes/i386-dis-evex-len.h
@@ -47,6 +47,8 @@  static const struct dis386 evex_len_table[][3] = {
   /* EVEX_LEN_0F384A_X86_64_W_0 */
   {
     { X86_64_EVEX_PFX_TABLE (PREFIX_VEX_0F384A_X86_64_W_0_L_0) },
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_EVEX_0F384A_X86_64_W_0_L_2) },
   },
 
   /* EVEX_LEN_0F385A */
@@ -63,6 +65,13 @@  static const struct dis386 evex_len_table[][3] = {
     { VEX_W_TABLE (EVEX_W_0F385B_L_2) },
   },
 
+  /* EVEX_LEN_0F386D_X86_64_W_0 */
+  {
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_EVEX_0F386D_X86_64_W_0_L_2) },
+  },
+
   /* EVEX_LEN_0F38C6 */
   {
     { Bad_Opcode },
@@ -91,6 +100,13 @@  static const struct dis386 evex_len_table[][3] = {
     { VEX_W_TABLE (VEX_W_0F3A01_L_1) },
   },
 
+  /* EVEX_LEN_0F3A07_X86_64_W_0 */
+  {
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_EVEX_0F3A07_X86_64_W_0_L_2) },
+  },
+
   /* EVEX_LEN_0F3A18 */
   {
     { Bad_Opcode },
@@ -161,6 +177,13 @@  static const struct dis386 evex_len_table[][3] = {
     { VEX_W_TABLE (EVEX_W_0F3A43_L_n) },
   },
 
+  /* EVEX_LEN_0F3A77_X86_64_W_0 */
+  {
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { PREFIX_TABLE (PREFIX_EVEX_0F3A77_X86_64_W_0_L_2) },
+  },
+
   /* EVEX_LEN_MAP5_6E */
   {
     { PREFIX_TABLE (PREFIX_EVEX_MAP5_6E_L_0) },
diff --git a/opcodes/i386-dis-evex-prefix.h b/opcodes/i386-dis-evex-prefix.h
index f4c65b3c06d..4683a347b50 100644
--- a/opcodes/i386-dis-evex-prefix.h
+++ b/opcodes/i386-dis-evex-prefix.h
@@ -243,6 +243,12 @@ 
     { VEX_W_TABLE (EVEX_W_0F383A_P_1) },
     { "%XEvpminuw",	{ XM, Vex, EXx }, 0 },
   },
+  /* PREFIX_EVEX_0F384A_X86_64_W_0_L_2 */
+  {
+    { Bad_Opcode },
+    { "tcvtrowd2ps",	{ XM, Rtmm, VexGdq }, 0 },
+    { "tilemovrow",	{ XM, Rtmm, VexGdq }, 0 },
+  },
   /* PREFIX_EVEX_0F3852 */
   {
     { "vdpphp%XS",	{ XM, Vex, EXx }, 0 },
@@ -264,6 +270,13 @@ 
     { Bad_Opcode },
     { "vp2intersectY%DQ", { MaskG, Vex, EXx, EXxEVexS }, 0 },
   },
+  /* PREFIX_EVEX_0F386D_X86_64_W_0_L_2 */
+  {
+    { "tcvtrowps2phh",	{ XM, Rtmm, VexGdq }, 0 },
+    { "tcvtrowps2bf16l",	{ XM, Rtmm, VexGdq }, 0 },
+    { "tcvtrowps2phl",	{ XM, Rtmm, VexGdq }, 0 },
+    { "tcvtrowps2bf16h",	{ XM, Rtmm, VexGdq }, 0 },
+  },
   /* PREFIX_EVEX_0F3872 */
   {
     { Bad_Opcode },
@@ -306,6 +319,13 @@ 
     { "%XEvfmsub213s%XW",	{ XMScalar, VexScalar, EXdq, EXxEVexR }, 0 },
     { "v4fnmadds%XS",	{ XMScalar, VexScalar, Mxmm }, 0 },
   },
+  /* PREFIX_EVEX_0F3A07_X86_64_W_0_L_2 */
+  {
+    { "tcvtrowps2phh",	{ XM, Rtmm, Ib }, 0 },
+    { "tcvtrowd2ps",	{ XM, Rtmm, Ib }, 0 },
+    { "tilemovrow",	{ XM, Rtmm, Ib }, 0 },
+    { "tcvtrowps2bf16h",	{ XM, Rtmm, Ib }, 0 },
+  },
   /* PREFIX_EVEX_0F3A08 */
   {
     { "vrndscalep%XH",  { XM, EXxh, EXxEVexS, Ib }, 0 },
@@ -377,6 +397,13 @@ 
     { Bad_Opcode },
     { "vfpclasss%XW",	{ MaskG, EXdq, Ib }, 0 },
   },
+  /* PREFIX_EVEX_0F3A77_X86_64_W_0_L_2 */
+  {
+    { Bad_Opcode },
+    { "tcvtrowps2bf16l",	{ XM, Rtmm, Ib }, 0 },
+    { Bad_Opcode },
+    { "tcvtrowps2phl",	{ XM, Rtmm, Ib }, 0 },
+  },
   /* PREFIX_EVEX_0F3AC2 */
   {
     { "vcmpp%XH", { MaskG, Vex, EXxh, EXxEVexS, CMP }, 0 },
diff --git a/opcodes/i386-dis-evex-w.h b/opcodes/i386-dis-evex-w.h
index 4ca3664e1ad..74628d4b138 100644
--- a/opcodes/i386-dis-evex-w.h
+++ b/opcodes/i386-dis-evex-w.h
@@ -365,6 +365,10 @@ 
     { "vbroadcasti32x8",	{ XM, Mymm }, PREFIX_DATA },
     { "vbroadcasti64x4",	{ XM, Mymm }, PREFIX_DATA },
   },
+  /* EVEX_W_0F386D_X86_64 */
+  {
+    { EVEX_LEN_TABLE (EVEX_LEN_0F386D_X86_64_W_0) },
+  },
   /* EVEX_W_0F3870 */
   {
     { Bad_Opcode },
@@ -388,6 +392,10 @@ 
     { Bad_Opcode },
     { "vpmultishiftqb",	{ XM, Vex, EXx }, PREFIX_DATA },
   },
+  /* EVEX_W_0F3A07_X86_64 */
+  {
+    { EVEX_LEN_TABLE (EVEX_LEN_0F3A07_X86_64_W_0) },
+  },
   /* EVEX_W_0F3A18_L_n */
   {
     { "vinsertf32x4",	{ XM, Vex, EXxmm, Ib }, PREFIX_DATA },
@@ -456,6 +464,10 @@ 
     { Bad_Opcode },
     { "vpshrdw",   { XM, Vex, EXx, Ib }, 0 },
   },
+  /* EVEX_W_0F3A77_X86_64 */
+  {
+    { EVEX_LEN_TABLE (EVEX_LEN_0F3A77_X86_64_W_0) },
+  },
   /* EVEX_W_MAP4_8A */
   {
     { MOD_TABLE (MOD_EVEX_MAP4_8A_W_0) },
diff --git a/opcodes/i386-dis-evex-x86-64.h b/opcodes/i386-dis-evex-x86-64.h
index 056a479536f..f7b61bce12d 100644
--- a/opcodes/i386-dis-evex-x86-64.h
+++ b/opcodes/i386-dis-evex-x86-64.h
@@ -3,6 +3,21 @@ 
     { Bad_Opcode },
     { VEX_W_TABLE (EVEX_W_0F384A_X86_64) },
   },
+  /* X86_64_EVEX_0F386D */
+  {
+    { Bad_Opcode },
+    { VEX_W_TABLE (EVEX_W_0F386D_X86_64) },
+  },
+  /* X86_64_EVEX_0F3A07 */
+  {
+    { Bad_Opcode },
+    { VEX_W_TABLE (EVEX_W_0F3A07_X86_64) },
+  },
+  /* X86_64_EVEX_0F3A77 */
+  {
+    { Bad_Opcode },
+    { VEX_W_TABLE (EVEX_W_0F3A77_X86_64) },
+  },
   /* X86_64_EVEX_MAP5_6F_M_0 */
   {
     { Bad_Opcode },
diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h
index 86cef4bd07b..bf15e41163d 100644
--- a/opcodes/i386-dis-evex.h
+++ b/opcodes/i386-dis-evex.h
@@ -415,7 +415,7 @@  static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_TABLE (X86_64_EVEX_0F386D) },
     { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F386E) },
     { X86_64_EVEX_FROM_VEX_TABLE (X86_64_VEX_0F386F) },
     /* 70 */
@@ -591,7 +591,7 @@  static const struct dis386 evex_table[][256] = {
     { VEX_W_TABLE (VEX_W_0F3A04) },
     { "%XEvpermilp%XD", { XM, EXx, Ib }, PREFIX_DATA },
     { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_TABLE (X86_64_EVEX_0F3A07) },
     /* 08 */
     { PREFIX_TABLE (PREFIX_EVEX_0F3A08) },
     { "vrndscalep%XD", { XM, EXx, EXxEVexS, Ib }, PREFIX_DATA },
@@ -717,7 +717,7 @@  static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_TABLE (X86_64_EVEX_0F3A77) },
     /* 78 */
     { Bad_Opcode },
     { Bad_Opcode },
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index 6f77f38460a..c22d812bd19 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -1214,9 +1214,11 @@  enum
   PREFIX_EVEX_0F3838,
   PREFIX_EVEX_0F3839,
   PREFIX_EVEX_0F383A,
+  PREFIX_EVEX_0F384A_X86_64_W_0_L_2,
   PREFIX_EVEX_0F3852,
   PREFIX_EVEX_0F3853,
   PREFIX_EVEX_0F3868,
+  PREFIX_EVEX_0F386D_X86_64_W_0_L_2,
   PREFIX_EVEX_0F3872,
   PREFIX_EVEX_0F3874,
   PREFIX_EVEX_0F389A,
@@ -1224,6 +1226,7 @@  enum
   PREFIX_EVEX_0F38AA,
   PREFIX_EVEX_0F38AB,
 
+  PREFIX_EVEX_0F3A07_X86_64_W_0_L_2,
   PREFIX_EVEX_0F3A08,
   PREFIX_EVEX_0F3A0A,
   PREFIX_EVEX_0F3A26,
@@ -1235,6 +1238,7 @@  enum
   PREFIX_EVEX_0F3A57,
   PREFIX_EVEX_0F3A66,
   PREFIX_EVEX_0F3A67,
+  PREFIX_EVEX_0F3A77_X86_64_W_0_L_2,
   PREFIX_EVEX_0F3AC2,
 
   PREFIX_EVEX_MAP4_4x,
@@ -1384,6 +1388,9 @@  enum
   X86_64_VEX_MAP7_F8_L_0_W_0_R_0,
 
   X86_64_EVEX_0F384A,
+  X86_64_EVEX_0F386D,
+  X86_64_EVEX_0F3A07,
+  X86_64_EVEX_0F3A77,
 
   X86_64_EVEX_MAP5_6F_M_0,
 };
@@ -1583,10 +1590,12 @@  enum
   EVEX_LEN_0F384A_X86_64_W_0,
   EVEX_LEN_0F385A,
   EVEX_LEN_0F385B,
+  EVEX_LEN_0F386D_X86_64_W_0,
   EVEX_LEN_0F38C6,
   EVEX_LEN_0F38C7,
   EVEX_LEN_0F3A00,
   EVEX_LEN_0F3A01,
+  EVEX_LEN_0F3A07_X86_64_W_0,
   EVEX_LEN_0F3A18,
   EVEX_LEN_0F3A19,
   EVEX_LEN_0F3A1A,
@@ -1597,6 +1606,7 @@  enum
   EVEX_LEN_0F3A3A,
   EVEX_LEN_0F3A3B,
   EVEX_LEN_0F3A43,
+  EVEX_LEN_0F3A77_X86_64_W_0,
 
   EVEX_LEN_MAP5_6E,
   EVEX_LEN_MAP5_7E,
@@ -1816,12 +1826,14 @@  enum
   EVEX_W_0F3859,
   EVEX_W_0F385A_L_n,
   EVEX_W_0F385B_L_2,
+  EVEX_W_0F386D_X86_64,
   EVEX_W_0F3870,
   EVEX_W_0F3872_P_2,
   EVEX_W_0F387A,
   EVEX_W_0F387B,
   EVEX_W_0F3883,
 
+  EVEX_W_0F3A07_X86_64,
   EVEX_W_0F3A18_L_n,
   EVEX_W_0F3A19_L_n,
   EVEX_W_0F3A1A_L_2,
@@ -1836,6 +1848,7 @@  enum
   EVEX_W_0F3A43_L_n,
   EVEX_W_0F3A70,
   EVEX_W_0F3A72,
+  EVEX_W_0F3A77_X86_64,
 
   EVEX_W_MAP4_8A,
   EVEX_W_MAP4_8F_R_0,
@@ -14070,6 +14083,29 @@  OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
       return true;
     }
 
+  switch (bytemode)
+    {
+      case v_mode:
+      case dq_mode:
+	if (ins->rex & REX_W)
+	  names = att_names64;
+	else if (bytemode == v_mode
+		  && !(sizeflag & DFLAG))
+	  names = att_names16;
+	else
+	  names = att_names32;
+	oappend_register (ins, names[reg]);
+	return true;
+      case b_mode:
+	names = att_names8rex;
+	oappend_register (ins, names[reg]);
+	return true;
+      case q_mode:
+	names = att_names64;
+	oappend_register (ins, names[reg]);
+	return true;
+    }
+
   switch (ins->vex.length)
     {
     case 128:
@@ -14079,22 +14115,6 @@  OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
 	  names = att_names_xmm;
 	  ins->evex_used |= EVEX_len_used;
 	  break;
-	case v_mode:
-	case dq_mode:
-	  if (ins->rex & REX_W)
-	    names = att_names64;
-	  else if (bytemode == v_mode
-		   && !(sizeflag & DFLAG))
-	    names = att_names16;
-	  else
-	    names = att_names32;
-	  break;
-	case b_mode:
-	  names = att_names8rex;
-	  break;
-	case q_mode:
-	  names = att_names64;
-	  break;
 	case mask_bd_mode:
 	case mask_mode:
 	  if (reg > 0x7)
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index bc4d64bced7..0b08eb11e52 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -273,6 +273,8 @@  static const dependency isa_dependencies[] =
     "AMX_TILE" },
   { "AMX_MOVRS",
     "AMX_TILE" },
+  { "AMX_AVX512",
+    "AMX_TILE|AVX10_2" },
   { "KL",
     "SSE2" },
   { "WIDEKL",
@@ -444,6 +446,7 @@  static bitfield cpu_flags[] =
   BITFIELD (AMX_TF32),
   BITFIELD (AMX_FP8),
   BITFIELD (AMX_MOVRS),
+  BITFIELD (AMX_AVX512),
   BITFIELD (AMX_TILE),
   BITFIELD (MOVDIRI),
   BITFIELD (MOVDIR64B),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index 65bccdbfdb7..9ca9471df39 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -258,6 +258,8 @@  enum i386_cpu
   CpuAMX_FP8,
   /* AMX-MOVRS Instructions support required.  */
   CpuAMX_MOVRS,
+  /* AMX-AVX512 Instructions support required.  */
+  CpuAMX_AVX512,
   /* AMX-TILE instructions required */
   CpuAMX_TILE,
   /* GFNI instructions required */
@@ -512,6 +514,7 @@  typedef union i386_cpu_flags
       unsigned int cpuamx_tf32:1;
       unsigned int cpuamx_fp8:1;
       unsigned int cpuamx_movrs:1;
+      unsigned int cpuamx_avx512:1;
       unsigned int cpuamx_tile:1;
       unsigned int cpugfni:1;
       unsigned int cpuvaes:1;
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 31359328de6..c7510b342b4 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -3243,6 +3243,21 @@  t2rpntlvw<z>rs<loc>, 0x<z:pfx>f8 | <loc:opc>, AMX_TRANSPOSE&APX_F(AMX_MOVRS), Si
 tileloaddrs, 0xf24a, APX_F(AMX_MOVRS), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
 tileloaddrst1, 0x664a, APX_F(AMX_MOVRS), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
 
+tcvtrowd2ps, 0xf34a, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
+tcvtrowd2ps, 0xf307, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
+
+tcvtrowps2bf16h, 0xf26d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
+tcvtrowps2bf16h, 0xf207, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
+tcvtrowps2bf16l, 0xf36d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
+tcvtrowps2bf16l, 0xf377, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
+tcvtrowps2phh, 0x6d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
+tcvtrowps2phh, 0x07, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
+tcvtrowps2phl, 0x666d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
+tcvtrowps2phl, 0xf277, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
+
+tilemovrow, 0x664a, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
+tilemovrow, 0x6607, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
+
 // AMX instructions end.
 
 // KEYLOCKER instructions.
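
As a cross-check of the expected byte sequences in the new tests against the i386-opc.tbl entries, the sketch below (not part of the patch) pulls the relevant fields out of a register-form EVEX encoding. It is deliberately simplified: it assumes a 4-byte EVEX prefix followed by one opcode byte and a register-form ModRM byte, and it ignores the X, B, V', z, b and aaa bits. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct evex_fields
{
  unsigned map;     /* P0[2:0]: 1 = 0F, 2 = 0F38, 3 = 0F3A.  */
  unsigned pp;      /* P1[1:0]: 0 = none, 1 = 66, 2 = F3, 3 = F2.  */
  unsigned w;       /* P1[7].  */
  unsigned vvvv;    /* P1[6:3], stored inverted; un-inverted here.  */
  unsigned ll;      /* P2[6:5]: 0 = 128, 1 = 256, 2 = 512.  */
  unsigned opcode;  /* Opcode byte following the prefix.  */
  unsigned reg;     /* ModRM.reg extended by the inverted R and R' bits.  */
  unsigned rm;      /* ModRM.rm, low three bits only.  */
};

/* Decode 62 P0 P1 P2 opcode modrm.  Simplified: no bounds checking and
   no handling of memory operands.  */
static struct evex_fields
decode_evex (const uint8_t *b)
{
  struct evex_fields f;
  uint8_t p0 = b[1], p1 = b[2], p2 = b[3], modrm = b[5];

  assert (b[0] == 0x62);

  f.map = p0 & 0x7;
  f.pp = p1 & 0x3;
  f.w = (p1 >> 7) & 1;
  f.vvvv = ((p1 >> 3) & 0xf) ^ 0xf;
  f.ll = (p2 >> 5) & 0x3;
  f.opcode = b[4];
  /* R is P0[7] and R' is P0[4]; both are stored inverted.  */
  f.reg = ((modrm >> 3) & 0x7)
          | ((((p0 >> 7) & 1) ^ 1) << 3)
          | ((((p0 >> 4) & 1) ^ 1) << 4);
  f.rm = modrm & 0x7;
  return f;
}
```

For the first test line, `62 62 6e 48 4a f5`, this yields map 0F38, prefix F3, W0, vvvv = 2 (%edx), 512-bit length, opcode 0x4a, reg = 30 (%zmm30) and rm = 5 (%tmm5), which lines up with the `tcvtrowd2ps, 0xf34a, ... EVex512|Space0F38|Src2VVVV|VexW0` entry.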