From patchwork Wed Nov 13 08:44:30 2024
X-Patchwork-Submitter: Haochen Jiang
X-Patchwork-Id: 100951
From: Haochen Jiang
To: binutils@sourceware.org
Cc: jbeulich@suse.com, hjl.tools@intel.com, "Hu, Lin1"
Subject: [PATCH 1/6] Support Intel AMX-TRANSPOSE
Date: Wed, 13 Nov 2024 16:44:30 +0800
Message-Id: <20241113084435.1784546-2-haochen.jiang@intel.com>
In-Reply-To: <20241113084435.1784546-1-haochen.jiang@intel.com>
References: <20241113084435.1784546-1-haochen.jiang@intel.com>

This patch adds support for AMX-TRANSPOSE. Since AMX-TRANSPOSE will very
often be used together with other CPUIDs, we put it into CPU_FLAGS_COMMON.

To implement TMM pairs, we introduce TMMPairOperand in
i386_opcode_modifier. We allocate three bits for it, indicating whether and
which tmm register operand is a tmm pair register, to keep it flexible for
future use.

In the assembler, the tmm register number of a tmm pair register needs to
be adjusted to an even number.

gas/ChangeLog:

	* NEWS: Support Intel AMX-TRANSPOSE.
	* config/tc-i386.c: Add amx_transpose.
	* doc/c-i386.texi: Document .amx_transpose.
	* testsuite/gas/i386/i386.exp: Run AMX-TRANSPOSE tests.
	* testsuite/gas/i386/x86-64.exp: Ditto.
	* testsuite/gas/i386/amx-transpose-inval.l: New test.
	* testsuite/gas/i386/amx-transpose-inval.s: Ditto.
	* testsuite/gas/i386/x86-64-amx-transpose-intel.d: Ditto.
	* testsuite/gas/i386/x86-64-amx-transpose-inval.l: Ditto.
	* testsuite/gas/i386/x86-64-amx-transpose-inval.s: Ditto.
	* testsuite/gas/i386/x86-64-amx-transpose.d: Ditto.
	* testsuite/gas/i386/x86-64-amx-transpose.s: Ditto.

opcodes/ChangeLog:

	* i386-dis.c (MOD_VEX_0F386E_X86_64_W_0): New.
	(MOD_VEX_0F386F_X86_64_W_0): Ditto.
	(PREFIX_VEX_0F385F_X86_64_W_0_L_0): Ditto.
	(PREFIX_VEX_0F386B_X86_64_W_0_L_0): Ditto.
	(PREFIX_VEX_0F386E_X86_64_W_0_M_0_L_0): Ditto.
	(PREFIX_VEX_0F386F_X86_64_W_0_M_0_L_0): Ditto.
	(X86_64_VEX_0F385F): Ditto.
	(X86_64_VEX_0F386B): Ditto.
	(X86_64_VEX_0F386E): Ditto.
	(X86_64_VEX_0F386F): Ditto.
	(VEX_LEN_0F385F_X86_64_W_0): Ditto.
	(VEX_LEN_0F386B_X86_64_W_0): Ditto.
	(VEX_LEN_0F386E_X86_64_W_0_M_0): Ditto.
	(VEX_LEN_0F386F_X86_64_W_0_M_0): Ditto.
	(VEX_W_0F385F_X86_64): Ditto.
	(VEX_W_0F386B_X86_64): Ditto.
	(VEX_W_0F386E_X86_64): Ditto.
	(VEX_W_0F386F_X86_64): Ditto.
	(mod_table): Add MOD_VEX_0F386E_X86_64_W_0,
	MOD_VEX_0F386F_X86_64_W_0.
	(prefix_table): Add PREFIX_VEX_0F386E_X86_64_W_0_M_0_L_0,
	PREFIX_VEX_0F386F_X86_64_W_0_M_0_L_0.
	Add new instructions for PREFIX_VEX_0F386C_X86_64_W_0_L_0.
	(x86_64_table): Add X86_64_VEX_0F385F, X86_64_VEX_0F386B,
	X86_64_VEX_0F386E, X86_64_VEX_0F386F.
	(vex_len_table): Add VEX_LEN_0F385F_X86_64_W_0,
	VEX_LEN_0F386B_X86_64_W_0, VEX_LEN_0F386E_X86_64_W_0_M_0,
	VEX_LEN_0F386F_X86_64_W_0_M_0.
	(vex_w_table): Add VEX_W_0F385F_X86_64, VEX_W_0F386B_X86_64,
	VEX_W_0F386E_X86_64, VEX_W_0F386F_X86_64.
	* i386-gen.c (cpu_flag_init): Add AMX_TRANSPOSE.
	(cpu_flags): Add CpuAMX_TRANSPOSE.
	* i386-init.h: Regenerated.
	* i386-mnem.h: Ditto.
	* i386-opc.h (CpuAMX_TRANSPOSE): New.
	(i386_cpu): Add cpuamx_transpose.
	* i386-opc.tbl: Add AMX-TRANSPOSE instructions.
	* i386-tbl.h: Regenerated.

Co-authored-by: Hu, Lin1
---
 gas/NEWS                                      |     2 +
 gas/config/tc-i386.c                          |    18 +-
 gas/doc/c-i386.texi                           |     3 +-
 gas/testsuite/gas/i386/amx-transpose-inval.l  |    12 +
 gas/testsuite/gas/i386/amx-transpose-inval.s  |    16 +
 gas/testsuite/gas/i386/i386.exp               |     1 +
 .../gas/i386/x86-64-amx-transpose-intel.d     |    33 +
 .../gas/i386/x86-64-amx-transpose-inval.l     |    11 +
 .../gas/i386/x86-64-amx-transpose-inval.s     |    15 +
 gas/testsuite/gas/i386/x86-64-amx-transpose.d |    31 +
 gas/testsuite/gas/i386/x86-64-amx-transpose.s |    51 +
 gas/testsuite/gas/i386/x86-64.exp             |     3 +
 opcodes/i386-dis.c                            |   127 +-
 opcodes/i386-gen.c                            |     4 +
 opcodes/i386-init.h                           |   540 +-
 opcodes/i386-mnem.h                           |  4369 +--
 opcodes/i386-opc.h                            |     7 +
 opcodes/i386-opc.tbl                          |    21 +
 opcodes/i386-tbl.h                            | 27139 ++++++++--------
 19 files changed, 16451 insertions(+), 15952 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/amx-transpose-inval.l
 create mode 100644 gas/testsuite/gas/i386/amx-transpose-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-transpose-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-transpose-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-transpose-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-transpose.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-transpose.s

diff --git a/gas/NEWS b/gas/NEWS index
264b790eb66..2a8f79b6360 100644 --- a/gas/NEWS +++ b/gas/NEWS @@ -1,5 +1,7 @@ -*- text -*- +* Add support for Intel AMX-TRANSPOSE instructions. + * Add support for Intel MSR_IMM instructions. * Add support for Intel AVX10.2 instructions. diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index 3991221b17f..c28d98109d3 100644 --- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -1182,6 +1182,7 @@ static const arch_entry cpu_arch[] = SUBARCH (amx_bf16, AMX_BF16, ANY_AMX_BF16, false), SUBARCH (amx_fp16, AMX_FP16, ANY_AMX_FP16, false), SUBARCH (amx_complex, AMX_COMPLEX, ANY_AMX_COMPLEX, false), + SUBARCH (amx_transpose, AMX_TRANSPOSE, ANY_AMX_TRANSPOSE, false), SUBARCH (amx_tile, AMX_TILE, ANY_AMX_TILE, false), SUBARCH (movdiri, MOVDIRI, MOVDIRI, false), SUBARCH (movdir64b, MOVDIR64B, MOVDIR64B, false), @@ -1858,6 +1859,7 @@ _is_cpu (const i386_cpu_attr *a, enum i386_cpu cpu) case CpuAVX512F: return a->bitfield.cpuavx512f; case CpuAVX512VL: return a->bitfield.cpuavx512vl; case CpuAPX_F: return a->bitfield.cpuapx_f; + case CpuAMX_TRANSPOSE: return a->bitfield.cpuamx_transpose; case Cpu64: return a->bitfield.cpu64; case CpuNo64: return a->bitfield.cpuno64; default: @@ -10977,7 +10979,7 @@ build_modrm_byte (void) { if (i.mem_operands) { - unsigned int fake_zero_displacement = 0; + unsigned int fake_zero_displacement = 0, tmmpair = 0, pos = 0; gas_assert (i.flags[op] & Operand_Mem); @@ -11009,6 +11011,20 @@ build_modrm_byte (void) i.sib.index = i.index_reg->reg_num; set_rex_vrex (i.index_reg, REX_X, false); } + + /* Since some amx instructions uses tmm pairs, which will + automatically change tmm with odd number to even number. + So we will handle this here. 
*/ + tmmpair = i.tm.opcode_modifier.tmmpairoperand & 7; + while (tmmpair) + { + if (tmmpair % 2 == 1 + && i.op[pos].regs->reg_num % 2 == 1) + i.op[pos].regs--; + tmmpair >>= 1; + pos++; + } + } default_seg = reg_ds; diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi index f74b84f55f0..84bb875791d 100644 --- a/gas/doc/c-i386.texi +++ b/gas/doc/c-i386.texi @@ -228,6 +228,7 @@ accept various extension mnemonics. For example, @code{amx_bf16}, @code{amx_fp16}, @code{amx_complex}, +@code{amx_transpose}, @code{amx_tile}, @code{vmx}, @code{vmfunc}, @@ -1700,7 +1701,7 @@ supported on the CPU specified. The choices for @var{cpu_type} are: @item @samp{.shstk} @tab @samp{.gfni} @tab @samp{.vaes} @tab @samp{.vpclmulqdq} @item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk} @item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16} -@item @samp{.amx_complex} @tab @samp{.amx_tile} +@item @samp{.amx_complex} @tab @samp{.amx_transpose} @tab @samp{.amx_tile} @item @samp{.kl} @tab @samp{.widekl} @tab @samp{.uintr} @tab @samp{.hreset} @item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5} @item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme} diff --git a/gas/testsuite/gas/i386/amx-transpose-inval.l b/gas/testsuite/gas/i386/amx-transpose-inval.l new file mode 100644 index 00000000000..8e6fc155878 --- /dev/null +++ b/gas/testsuite/gas/i386/amx-transpose-inval.l @@ -0,0 +1,12 @@ +.* Assembler messages: +.*:6: Error: `ttdpbf16ps' is only supported in 64-bit mode +.*:7: Error: `ttdpfp16ps' is only supported in 64-bit mode +.*:8: Error: `ttransposed' is only supported in 64-bit mode +.*:9: Error: `t2rpntlvwz0' is only supported in 64-bit mode +.*:10: Error: `t2rpntlvwz0t1' is only supported in 64-bit mode +.*:11: Error: `t2rpntlvwz1' is only supported in 64-bit mode +.*:12: Error: `t2rpntlvwz1t1' is only supported in 64-bit mode +.*:13: Error: `tconjtcmmimfp16ps' is only supported in 64-bit mode +.*:14: Error: 
`tconjtfp16' is only supported in 64-bit mode +.*:15: Error: `ttcmmimfp16ps' is only supported in 64-bit mode +.*:16: Error: `ttcmmrlfp16ps' is only supported in 64-bit mode diff --git a/gas/testsuite/gas/i386/amx-transpose-inval.s b/gas/testsuite/gas/i386/amx-transpose-inval.s new file mode 100644 index 00000000000..e6daa5fde76 --- /dev/null +++ b/gas/testsuite/gas/i386/amx-transpose-inval.s @@ -0,0 +1,16 @@ +# Check Illegal AMX-TRANSPOSE instructions + + .allow_index_reg + .text +_start: + ttdpbf16ps %tmm1, %tmm2, %tmm3 + ttdpfp16ps %tmm1, %tmm2, %tmm3 + ttransposed %tmm2, %tmm3 + t2rpntlvwz0 (%r9), %tmm3 + t2rpntlvwz0t1 (%r9), %tmm3 + t2rpntlvwz1 (%r9), %tmm3 + t2rpntlvwz1t1 (%r9), %tmm3 + tconjtcmmimfp16ps %tmm1, %tmm2, %tmm3 + tconjtfp16 %tmm5, %tmm6 + ttcmmimfp16ps %tmm4, %tmm5, %tmm6 + ttcmmrlfp16ps %tmm4, %tmm5, %tmm6 diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp index 77aec65e7ad..71abb0fc66c 100644 --- a/gas/testsuite/gas/i386/i386.exp +++ b/gas/testsuite/gas/i386/i386.exp @@ -546,6 +546,7 @@ if [gas_32_check] then { run_dump_test "avx10_2-256-miscs" run_dump_test "avx10_2-256-miscs-intel" run_list_test "msr_imm-inval" + run_list_test "amx-transpose-inval" run_list_test "sg" run_dump_test "clzero" run_dump_test "invlpgb" diff --git a/gas/testsuite/gas/i386/x86-64-amx-transpose-intel.d b/gas/testsuite/gas/i386/x86-64-amx-transpose-intel.d new file mode 100644 index 00000000000..e50f09436c8 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-transpose-intel.d @@ -0,0 +1,33 @@ +#objdump: -dw -Mintel +#name: x86_64 AMX-TRANSPOSE insns (Intel disassembly) +#source: x86-64-amx-transpose.s + +.*: +file format .* + +Disassembly of section \.text: + +#... 
+[a-f0-9]+ <_intel>: +\s*[a-f0-9]+:\s*c4 e2 5a 6c f5\s+ttdpbf16ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e2 72 6c da\s+ttdpbf16ps tmm3,tmm2,tmm1 +\s*[a-f0-9]+:\s*c4 e2 5b 6c f5\s+ttdpfp16ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e2 73 6c da\s+ttdpfp16ps tmm3,tmm2,tmm1 +\s*[a-f0-9]+:\s*c4 e2 7a 5f f5\s+ttransposed tmm6,tmm5 +\s*[a-f0-9]+:\s*c4 e2 7a 5f da\s+ttransposed tmm3,tmm2 +\s*[a-f0-9]+:\s*c4 a2 78 6e b4 f5 00 00 00 10\s+t2rpntlvwz0 tmm6,\[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 c2 78 6e 14 21\s+t2rpntlvwz0 tmm2,\[r9\+riz\*1\] +\s*[a-f0-9]+:\s*c4 a2 78 6f b4 f5 00 00 00 10\s+t2rpntlvwz0t1 tmm6,\[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 c2 78 6f 14 21\s+t2rpntlvwz0t1 tmm2,\[r9\+riz\*1\] +\s*[a-f0-9]+:\s*c4 a2 79 6e b4 f5 00 00 00 10\s+t2rpntlvwz1 tmm6,\[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 c2 79 6e 14 21\s+t2rpntlvwz1 tmm2,\[r9\+riz\*1\] +\s*[a-f0-9]+:\s*c4 a2 79 6f b4 f5 00 00 00 10\s+t2rpntlvwz1t1 tmm6,\[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 c2 79 6f 14 21\s+t2rpntlvwz1t1 tmm2,\[r9\+riz\*1\] +\s*[a-f0-9]+:\s*c4 e2 58 6b f5\s+tconjtcmmimfp16ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e2 70 6b da\s+tconjtcmmimfp16ps tmm3,tmm2,tmm1 +\s*[a-f0-9]+:\s*c4 e2 79 6b f5\s+tconjtfp16 tmm6,tmm5 +\s*[a-f0-9]+:\s*c4 e2 79 6b da\s+tconjtfp16 tmm3,tmm2 +\s*[a-f0-9]+:\s*c4 e2 5b 6b f5\s+ttcmmimfp16ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e2 73 6b da\s+ttcmmimfp16ps tmm3,tmm2,tmm1 +\s*[a-f0-9]+:\s*c4 e2 5a 6b f5\s+ttcmmrlfp16ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e2 72 6b da\s+ttcmmrlfp16ps tmm3,tmm2,tmm1 +#pass diff --git a/gas/testsuite/gas/i386/x86-64-amx-transpose-inval.l b/gas/testsuite/gas/i386/x86-64-amx-transpose-inval.l new file mode 100644 index 00000000000..468b2717257 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-transpose-inval.l @@ -0,0 +1,11 @@ +.* Assembler messages: +.*:6: Error: all tmm registers must be distinct for `ttdpbf16ps' +.*:7: Error: all tmm registers must be distinct for `ttdpbf16ps' +.*:8: Error: all tmm registers must be 
distinct for `ttdpbf16ps' +.*:9: Error: all tmm registers must be distinct for `ttdpfp16ps' +.*:10: Error: all tmm registers must be distinct for `ttdpfp16ps' +.*:11: Error: all tmm registers must be distinct for `ttdpfp16ps' +.*:12: Error: `\(%rip\)' cannot be used here +.*:13: Error: `\(%rip\)' cannot be used here +.*:14: Error: `\(%rip\)' cannot be used here +.*:15: Error: `\(%rip\)' cannot be used here diff --git a/gas/testsuite/gas/i386/x86-64-amx-transpose-inval.s b/gas/testsuite/gas/i386/x86-64-amx-transpose-inval.s new file mode 100644 index 00000000000..9aa0cfbc6a6 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-transpose-inval.s @@ -0,0 +1,15 @@ +# Check Illegal AMX-TRANSPOSE instructions + + .allow_index_reg + .text +_start: + ttdpbf16ps %tmm1, %tmm1, %tmm2 + ttdpbf16ps %tmm1, %tmm2, %tmm1 + ttdpbf16ps %tmm2, %tmm1, %tmm1 + ttdpfp16ps %tmm1, %tmm1, %tmm2 + ttdpfp16ps %tmm1, %tmm2, %tmm1 + ttdpfp16ps %tmm2, %tmm1, %tmm1 + t2rpntlvwz0 (%rip), %tmm1 + t2rpntlvwz0t1 (%rip), %tmm1 + t2rpntlvwz1 (%rip), %tmm1 + t2rpntlvwz1t1 (%rip), %tmm1 diff --git a/gas/testsuite/gas/i386/x86-64-amx-transpose.d b/gas/testsuite/gas/i386/x86-64-amx-transpose.d new file mode 100644 index 00000000000..47f989136a4 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-transpose.d @@ -0,0 +1,31 @@ +#objdump: -dw +#name: x86_64 AMX-TRANSPOSE insns + +.*: +file format .* + +Disassembly of section \.text: + +0+ <_start>: +\s*[a-f0-9]+:\s*c4 e2 5a 6c f5\s+ttdpbf16ps %tmm4,%tmm5,%tmm6 +\s*[a-f0-9]+:\s*c4 e2 72 6c da\s+ttdpbf16ps %tmm1,%tmm2,%tmm3 +\s*[a-f0-9]+:\s*c4 e2 5b 6c f5\s+ttdpfp16ps %tmm4,%tmm5,%tmm6 +\s*[a-f0-9]+:\s*c4 e2 73 6c da\s+ttdpfp16ps %tmm1,%tmm2,%tmm3 +\s*[a-f0-9]+:\s*c4 e2 7a 5f f5\s+ttransposed %tmm5,%tmm6 +\s*[a-f0-9]+:\s*c4 e2 7a 5f da\s+ttransposed %tmm2,%tmm3 +\s*[a-f0-9]+:\s*c4 a2 78 6e b4 f5 00 00 00 10\s+t2rpntlvwz0 0x10000000\(%rbp,%r14,8\),%tmm6 +\s*[a-f0-9]+:\s*c4 c2 78 6e 14 21\s+t2rpntlvwz0 \(%r9,%riz,1\),%tmm2 +\s*[a-f0-9]+:\s*c4 a2 78 6f b4 f5 00 
00 00 10\s+t2rpntlvwz0t1 0x10000000\(%rbp,%r14,8\),%tmm6 +\s*[a-f0-9]+:\s*c4 c2 78 6f 14 21\s+t2rpntlvwz0t1 \(%r9,%riz,1\),%tmm2 +\s*[a-f0-9]+:\s*c4 a2 79 6e b4 f5 00 00 00 10\s+t2rpntlvwz1 0x10000000\(%rbp,%r14,8\),%tmm6 +\s*[a-f0-9]+:\s*c4 c2 79 6e 14 21\s+t2rpntlvwz1 \(%r9,%riz,1\),%tmm2 +\s*[a-f0-9]+:\s*c4 a2 79 6f b4 f5 00 00 00 10\s+t2rpntlvwz1t1 0x10000000\(%rbp,%r14,8\),%tmm6 +\s*[a-f0-9]+:\s*c4 c2 79 6f 14 21\s+t2rpntlvwz1t1 \(%r9,%riz,1\),%tmm2 +\s*[a-f0-9]+:\s*c4 e2 58 6b f5\s+tconjtcmmimfp16ps %tmm4,%tmm5,%tmm6 +\s*[a-f0-9]+:\s*c4 e2 70 6b da\s+tconjtcmmimfp16ps %tmm1,%tmm2,%tmm3 +\s*[a-f0-9]+:\s*c4 e2 79 6b f5\s+tconjtfp16 %tmm5,%tmm6 +\s*[a-f0-9]+:\s*c4 e2 79 6b da\s+tconjtfp16 %tmm2,%tmm3 +\s*[a-f0-9]+:\s*c4 e2 5b 6b f5\s+ttcmmimfp16ps %tmm4,%tmm5,%tmm6 +\s*[a-f0-9]+:\s*c4 e2 73 6b da\s+ttcmmimfp16ps %tmm1,%tmm2,%tmm3 +\s*[a-f0-9]+:\s*c4 e2 5a 6b f5\s+ttcmmrlfp16ps %tmm4,%tmm5,%tmm6 +\s*[a-f0-9]+:\s*c4 e2 72 6b da\s+ttcmmrlfp16ps %tmm1,%tmm2,%tmm3 +#pass diff --git a/gas/testsuite/gas/i386/x86-64-amx-transpose.s b/gas/testsuite/gas/i386/x86-64-amx-transpose.s new file mode 100644 index 00000000000..63f3ec65079 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-transpose.s @@ -0,0 +1,51 @@ +# Check 64bit AMX-TRANSPOSE instructions + + .text +_start: + ttdpbf16ps %tmm4, %tmm5, %tmm6 + ttdpbf16ps %tmm1, %tmm2, %tmm3 + ttdpfp16ps %tmm4, %tmm5, %tmm6 + ttdpfp16ps %tmm1, %tmm2, %tmm3 + ttransposed %tmm5, %tmm6 + ttransposed %tmm2, %tmm3 + t2rpntlvwz0 0x10000000(%rbp, %r14, 8), %tmm6 + t2rpntlvwz0 (%r9), %tmm3 + t2rpntlvwz0t1 0x10000000(%rbp, %r14, 8), %tmm6 + t2rpntlvwz0t1 (%r9), %tmm3 + t2rpntlvwz1 0x10000000(%rbp, %r14, 8), %tmm6 + t2rpntlvwz1 (%r9), %tmm3 + t2rpntlvwz1t1 0x10000000(%rbp, %r14, 8), %tmm6 + t2rpntlvwz1t1 (%r9), %tmm3 + tconjtcmmimfp16ps %tmm4, %tmm5, %tmm6 + tconjtcmmimfp16ps %tmm1, %tmm2, %tmm3 + tconjtfp16 %tmm5, %tmm6 + tconjtfp16 %tmm2, %tmm3 + ttcmmimfp16ps %tmm4, %tmm5, %tmm6 + ttcmmimfp16ps %tmm1, %tmm2, %tmm3 + ttcmmrlfp16ps 
%tmm4, %tmm5, %tmm6 + ttcmmrlfp16ps %tmm1, %tmm2, %tmm3 + +_intel: + .intel_syntax noprefix + ttdpbf16ps tmm6, tmm5, tmm4 + ttdpbf16ps tmm3, tmm2, tmm1 + ttdpfp16ps tmm6, tmm5, tmm4 + ttdpfp16ps tmm3, tmm2, tmm1 + ttransposed tmm6, tmm5 + ttransposed tmm3, tmm2 + t2rpntlvwz0 tmm6, [rbp+r14*8+0x10000000] + t2rpntlvwz0 tmm3, [r9] + t2rpntlvwz0t1 tmm6, [rbp+r14*8+0x10000000] + t2rpntlvwz0t1 tmm3, [r9] + t2rpntlvwz1 tmm6, [rbp+r14*8+0x10000000] + t2rpntlvwz1 tmm3, [r9] + t2rpntlvwz1t1 tmm6, [rbp+r14*8+0x10000000] + t2rpntlvwz1t1 tmm3, [r9] + tconjtcmmimfp16ps tmm6, tmm5, tmm4 + tconjtcmmimfp16ps tmm3, tmm2, tmm1 + tconjtfp16 tmm6, tmm5 + tconjtfp16 tmm3, tmm2 + ttcmmimfp16ps tmm6, tmm5, tmm4 + ttcmmimfp16ps tmm3, tmm2, tmm1 + ttcmmrlfp16ps tmm6, tmm5, tmm4 + ttcmmrlfp16ps tmm3, tmm2, tmm1 diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp index 97d1ea14240..1d54d7700e2 100644 --- a/gas/testsuite/gas/i386/x86-64.exp +++ b/gas/testsuite/gas/i386/x86-64.exp @@ -524,6 +524,9 @@ run_dump_test "x86-64-avx10_2-256-miscs-intel" run_dump_test "x86-64-msr_imm" run_dump_test "x86-64-msr_imm-intel" run_list_test "x86-64-msr_imm-inval" +run_dump_test "x86-64-amx-transpose" +run_dump_test "x86-64-amx-transpose-intel" +run_list_test "x86-64-amx-transpose-inval" run_dump_test "x86-64-clzero" run_dump_test "x86-64-mwaitx-bdver4" run_list_test "x86-64-mwaitx-reg" diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c index a842bcacc99..2095bb65196 100644 --- a/opcodes/i386-dis.c +++ b/opcodes/i386-dis.c @@ -963,6 +963,8 @@ enum MOD_0F38F8, MOD_VEX_0F3849_X86_64_L_0_W_0, + MOD_VEX_0F386E_X86_64, + MOD_VEX_0F386F_X86_64, MOD_EVEX_MAP4_60, MOD_EVEX_MAP4_61, @@ -1136,7 +1138,11 @@ enum PREFIX_VEX_0F3851_W_0, PREFIX_VEX_0F385C_X86_64_L_0_W_0, PREFIX_VEX_0F385E_X86_64_L_0_W_0, + PREFIX_VEX_0F385F_X86_64_L_0_W_0, + PREFIX_VEX_0F386B_X86_64_L_0_W_0, PREFIX_VEX_0F386C_X86_64_L_0_W_0, + PREFIX_VEX_0F386E_X86_64_M_0_L_0_W_0, + PREFIX_VEX_0F386F_X86_64_M_0_L_0_W_0, 
PREFIX_VEX_0F3872, PREFIX_VEX_0F38B0_W_0, PREFIX_VEX_0F38B1_W_0, @@ -1347,7 +1353,11 @@ enum X86_64_VEX_0F384B, X86_64_VEX_0F385C, X86_64_VEX_0F385E, + X86_64_VEX_0F385F, + X86_64_VEX_0F386B, X86_64_VEX_0F386C, + X86_64_VEX_0F386E, + X86_64_VEX_0F386F, X86_64_VEX_0F38Ex, X86_64_VEX_MAP7_F6_L_0_W_0_R_0, @@ -1431,7 +1441,11 @@ enum VEX_LEN_0F385A, VEX_LEN_0F385C_X86_64, VEX_LEN_0F385E_X86_64, + VEX_LEN_0F385F_X86_64, + VEX_LEN_0F386B_X86_64, VEX_LEN_0F386C_X86_64, + VEX_LEN_0F386E_X86_64_M_0, + VEX_LEN_0F386F_X86_64_M_0, VEX_LEN_0F38CB_P_3_W_0, VEX_LEN_0F38CC_P_3_W_0, VEX_LEN_0F38CD_P_3_W_0, @@ -1604,7 +1618,11 @@ enum VEX_W_0F385A_L_0, VEX_W_0F385C_X86_64_L_0, VEX_W_0F385E_X86_64_L_0, + VEX_W_0F385F_X86_64_L_0, + VEX_W_0F386B_X86_64_L_0, VEX_W_0F386C_X86_64_L_0, + VEX_W_0F386E_X86_64_M_0_L_0, + VEX_W_0F386F_X86_64_M_0_L_0, VEX_W_0F3872_P_1, VEX_W_0F3878, VEX_W_0F3879, @@ -4105,11 +4123,40 @@ static const struct dis386 prefix_table[][4] = { { "tdpbssd", {TMM, Rtmm, VexTmm }, 0 }, }, + /* PREFIX_VEX_0F385F_X86_64_L_0_W_0 */ + { + { Bad_Opcode }, + { "ttransposed", { TMM, Rtmm }, 0 }, + }, + + /* PREFIX_VEX_0F386B_X86_64_L_0_W_0 */ + { + { "tconjtcmmimfp16ps", { TMM, Rtmm, VexTmm }, 0 }, + { "ttcmmrlfp16ps", { TMM, Rtmm, VexTmm }, 0 }, + { "tconjtfp16", { TMM, Rtmm }, 0 }, + { "ttcmmimfp16ps", { TMM, Rtmm, VexTmm }, 0 }, + }, + /* PREFIX_VEX_0F386C_X86_64_L_0_W_0 */ { - { "tcmmrlfp16ps", { TMM, Rtmm, VexTmm }, 0 }, + { "tcmmrlfp16ps", { TMM, Rtmm, VexTmm }, 0 }, + { "ttdpbf16ps", { TMM, Rtmm, VexTmm }, 0 }, + { "tcmmimfp16ps", { TMM, Rtmm, VexTmm }, 0 }, + { "ttdpfp16ps", { TMM, Rtmm, VexTmm }, 0 }, + }, + + /* PREFIX_VEX_0F386E_X86_64_M_0_L_0_W_0 */ + { + { "t2rpntlvwz0", { TMM, MVexSIBMEM }, 0 }, { Bad_Opcode }, - { "tcmmimfp16ps", { TMM, Rtmm, VexTmm }, 0 }, + { "t2rpntlvwz1", { TMM, MVexSIBMEM }, 0 }, + }, + + /* PREFIX_VEX_0F386F_X86_64_M_0_L_0_W_0 */ + { + { "t2rpntlvwz0t1", { TMM, MVexSIBMEM }, 0 }, + { Bad_Opcode }, + { "t2rpntlvwz1t1", { TMM, MVexSIBMEM }, 0 
}, }, /* PREFIX_VEX_0F3872 */ @@ -4581,12 +4628,36 @@ static const struct dis386 x86_64_table[][2] = { { VEX_LEN_TABLE (VEX_LEN_0F385E_X86_64) }, }, + /* X86_64_VEX_0F385F */ + { + { Bad_Opcode }, + { VEX_LEN_TABLE (VEX_LEN_0F385F_X86_64) }, + }, + + /* X86_64_VEX_0F386B */ + { + { Bad_Opcode }, + { VEX_LEN_TABLE (VEX_LEN_0F386B_X86_64) }, + }, + /* X86_64_VEX_0F386C */ { { Bad_Opcode }, { VEX_LEN_TABLE (VEX_LEN_0F386C_X86_64) }, }, + /* X86_64_VEX_0F386E */ + { + { Bad_Opcode }, + { MOD_TABLE (MOD_VEX_0F386E_X86_64) }, + }, + + /* X86_64_VEX_0F386F */ + { + { Bad_Opcode }, + { MOD_TABLE (MOD_VEX_0F386F_X86_64) }, + }, + /* X86_64_VEX_0F38Ex */ { { Bad_Opcode }, @@ -6471,7 +6542,7 @@ static const struct dis386 vex_table[][256] = { { X86_64_TABLE (X86_64_VEX_0F385C) }, { Bad_Opcode }, { X86_64_TABLE (X86_64_VEX_0F385E) }, - { Bad_Opcode }, + { X86_64_TABLE (X86_64_VEX_0F385F) }, /* 60 */ { Bad_Opcode }, { Bad_Opcode }, @@ -6485,11 +6556,11 @@ static const struct dis386 vex_table[][256] = { { Bad_Opcode }, { Bad_Opcode }, { Bad_Opcode }, - { Bad_Opcode }, + { X86_64_TABLE (X86_64_VEX_0F386B) }, { X86_64_TABLE (X86_64_VEX_0F386C) }, { Bad_Opcode }, - { Bad_Opcode }, - { Bad_Opcode }, + { X86_64_TABLE (X86_64_VEX_0F386E) }, + { X86_64_TABLE (X86_64_VEX_0F386F) }, /* 70 */ { Bad_Opcode }, { Bad_Opcode }, @@ -7152,11 +7223,31 @@ static const struct dis386 vex_len_table[][2] = { { VEX_W_TABLE (VEX_W_0F385E_X86_64_L_0) }, }, + /* VEX_LEN_0F385F_X86_64 */ + { + { VEX_W_TABLE (VEX_W_0F385F_X86_64_L_0) }, + }, + + /* VEX_LEN_0F386B_X86_64 */ + { + { VEX_W_TABLE (VEX_W_0F386B_X86_64_L_0) }, + }, + /* VEX_LEN_0F386C_X86_64 */ { { VEX_W_TABLE (VEX_W_0F386C_X86_64_L_0) }, }, + /* VEX_LEN_0F386E_X86_64_M_0 */ + { + { VEX_W_TABLE (VEX_W_0F386E_X86_64_M_0_L_0) }, + }, + + /* VEX_LEN_0F386F_X86_64_M_0 */ + { + { VEX_W_TABLE (VEX_W_0F386F_X86_64_M_0_L_0) }, + }, + /* VEX_LEN_0F38CB_P_3_W_0 */ { { Bad_Opcode }, @@ -7836,10 +7927,26 @@ static const struct dis386 vex_w_table[][2] = { /* 
VEX_W_0F385E_X86_64_L_0 */ { PREFIX_TABLE (PREFIX_VEX_0F385E_X86_64_L_0_W_0) }, }, + { + /* VEX_W_0F385F_X86_64_L_0 */ + { PREFIX_TABLE (PREFIX_VEX_0F385F_X86_64_L_0_W_0) }, + }, + { + /* VEX_W_0F386B_X86_64_L_0 */ + { PREFIX_TABLE (PREFIX_VEX_0F386B_X86_64_L_0_W_0) }, + }, { /* VEX_W_0F386C_X86_64_L_0 */ { PREFIX_TABLE (PREFIX_VEX_0F386C_X86_64_L_0_W_0) }, }, + { + /* VEX_W_0F386E_X86_64_M_0_L_0 */ + { PREFIX_TABLE (PREFIX_VEX_0F386E_X86_64_M_0_L_0_W_0) }, + }, + { + /* VEX_W_0F386F_X86_64_M_0_L_0 */ + { PREFIX_TABLE (PREFIX_VEX_0F386F_X86_64_M_0_L_0_W_0) }, + }, { /* VEX_W_0F3872_P_1 */ { "%XVvcvtneps2bf16%XY", { XMM, EXx }, 0 }, @@ -8334,6 +8441,14 @@ static const struct dis386 mod_table[][2] = { { PREFIX_TABLE (PREFIX_VEX_0F3849_X86_64_L_0_W_0_M_0) }, { PREFIX_TABLE (PREFIX_VEX_0F3849_X86_64_L_0_W_0_M_1) }, }, + { + /* MOD_VEX_0F386E_X86_64 */ + { VEX_LEN_TABLE (VEX_LEN_0F386E_X86_64_M_0) }, + }, + { + /* MOD_VEX_0F386F_X86_64 */ + { VEX_LEN_TABLE (VEX_LEN_0F386F_X86_64_M_0) }, + }, #include "i386-dis-evex-mod.h" }; diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c index 9fbf3f723a2..be05b1be817 100644 --- a/opcodes/i386-gen.c +++ b/opcodes/i386-gen.c @@ -263,6 +263,8 @@ static const dependency isa_dependencies[] = "AMX_TILE" }, { "AMX_COMPLEX", "AMX_TILE" }, + { "AMX_TRANSPOSE", + "AMX_TILE" }, { "KL", "SSE2" }, { "WIDEKL", @@ -429,6 +431,7 @@ static bitfield cpu_flags[] = BITFIELD (AMX_BF16), BITFIELD (AMX_FP16), BITFIELD (AMX_COMPLEX), + BITFIELD (AMX_TRANSPOSE), BITFIELD (AMX_TILE), BITFIELD (MOVDIRI), BITFIELD (MOVDIR64B), @@ -499,6 +502,7 @@ static bitfield opcode_modifiers[] = BITFIELD (NoEgpr), BITFIELD (NF), BITFIELD (Rex2), + BITFIELD (TMMPairOperand), }; #define CLASS(n) #n, n diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h index 48e3613d023..fd11f9f0cd8 100644 --- a/opcodes/i386-opc.h +++ b/opcodes/i386-opc.h @@ -327,6 +327,8 @@ enum i386_cpu CpuAVX512VL, /* Intel APX_F Instructions support required. 
*/ CpuAPX_F, + /* Intel AMX-TRANSPOSE Instructions support required. */ + CpuAMX_TRANSPOSE, /* Not supported in the 64bit mode */ CpuNo64, @@ -363,6 +365,7 @@ enum i386_cpu cpuavx512f:1, \ cpuavx512vl:1, \ cpuapx_f:1, \ + cpuamx_transpose:1, \ /* NOTE: This field needs to remain last. */ \ cpuno64:1 @@ -772,6 +775,9 @@ enum /* Instrucion requires REX2 prefix. */ Rex2, + /* Check whether or which tmm register is pair register. */ + TMMPairOperand, + /* The last bitfield in i386_opcode_modifier. */ Opcode_Modifier_Num }; @@ -820,6 +826,7 @@ typedef struct i386_opcode_modifier unsigned int noegpr:1; unsigned int nf:1; unsigned int rex2:1; + unsigned int tmmpairoperand:3; } i386_opcode_modifier; /* Operand classes. */ diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl index 2698bbb0b04..d8f2a180ba7 100644 --- a/opcodes/i386-opc.tbl +++ b/opcodes/i386-opc.tbl @@ -3183,14 +3183,27 @@ xresldtrk, 0xf20f01e9, TSXLDTRK, NoSuf, {} // TSXLDTRK instructions end. +#define TMMPairOperand0 TMMPairOperand=1 +#define TMMPairOperand1 TMMPairOperand=2 +#define TMMPairOperand2 TMMPairOperand=4 + // AMX instructions. 
ldtilecfg, 0x49/0, APX_F(AMX_TILE), Modrm|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex } sttilecfg, 0x6649/0, APX_F(AMX_TILE), Modrm|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex } +t2rpntlvwz0, 0x6e, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM } +t2rpntlvwz0t1, 0x6f, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM } +t2rpntlvwz1, 0x666e, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM } +t2rpntlvwz1t1, 0x666f, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM } + tcmmimfp16ps, 0x666c, AMX_COMPLEX, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } tcmmrlfp16ps, 0x6c, AMX_COMPLEX, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } +tconjtcmmimfp16ps, 0x6b, AMX_COMPLEX&AMX_TRANSPOSE, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } + +tconjtfp16, 0x666b, AMX_COMPLEX&AMX_TRANSPOSE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM, RegTMM } + tdpbf16ps, 0xf35c, AMX_BF16, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } tdpfp16ps, 0xf25c, AMX_FP16, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } tdpbssd, 0xf25e, AMX_INT8, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } @@ -3206,6 +3219,14 @@ tilerelease, 0x49c0, AMX_TILE, Vex128|Space0F38|VexW0|NoSuf, {} tilezero, 0xf249, AMX_TILE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM } +ttcmmimfp16ps, 0xf26b, AMX_COMPLEX&AMX_TRANSPOSE, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } +ttcmmrlfp16ps, 0xf36b, AMX_COMPLEX&AMX_TRANSPOSE, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } + +ttdpbf16ps, 0xf36c, AMX_BF16&AMX_TRANSPOSE, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } +ttdpfp16ps, 
0xf26c, AMX_FP16&AMX_TRANSPOSE, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } + +ttransposed, 0xf35f, AMX_TRANSPOSE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM, RegTMM } + // AMX instructions end. // KEYLOCKER instructions.

From patchwork Wed Nov 13 08:44:31 2024
X-Patchwork-Submitter: Haochen Jiang
X-Patchwork-Id: 100949
From: Haochen Jiang
To: binutils@sourceware.org
Cc: jbeulich@suse.com, hjl.tools@intel.com
Subject: [PATCH 2/6] Support Intel AMX-AVX512
Date: Wed, 13 Nov 2024 16:44:31 +0800
Message-Id: <20241113084435.1784546-3-haochen.jiang@intel.com>
In-Reply-To: <20241113084435.1784546-1-haochen.jiang@intel.com>
References: <20241113084435.1784546-1-haochen.jiang@intel.com>
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: binutils@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Binutils mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: binutils-bounces~patchwork=sourceware.org@sourceware.org This patch adds support for AMX-AVX512. In the disassembler, we need to manually switch the names for operand 3 to the 32-bit register names (r32) when it is a register; otherwise it would be printed as zmm. gas/ChangeLog: * NEWS: Support Intel AMX-AVX512. * config/tc-i386.c: Add amx_avx512. * doc/c-i386.texi: Document .amx_avx512. * testsuite/gas/i386/i386.exp: Run AMX-AVX512 tests. * testsuite/gas/i386/x86-64.exp: Ditto. * testsuite/gas/i386/amx-avx512-inval.l: New test. * testsuite/gas/i386/amx-avx512-inval.s: Ditto. * testsuite/gas/i386/x86-64-amx-avx512-intel.d: Ditto. * testsuite/gas/i386/x86-64-amx-avx512.d: Ditto. * testsuite/gas/i386/x86-64-amx-avx512.s: Ditto. opcodes/ChangeLog: * i386-dis-evex-len.h: Add EVEX_LEN_0F384A_X86_64_W_0, EVEX_LEN_0F386D_X86_64_W_0, EVEX_LEN_0F3A07_X86_64_W_0, EVEX_LEN_0F3A77_X86_64_W_0. * i386-dis-evex-prefix.h: Add PREFIX_EVEX_0F384A_W_0_L_2, PREFIX_EVEX_0F386D_W_0_L_2, PREFIX_EVEX_0F3A07_W_0_L_2, PREFIX_EVEX_0F3A77_W_0_L_2. * i386-dis-evex-w.h: Add EVEX_W_0F384A_X86_64, EVEX_W_0F386D_X86_64, EVEX_W_0F3A07_X86_64, EVEX_W_0F3A77_X86_64. * i386-dis-evex-x86-64.h: Add X86_64_EVEX_0F384A, X86_64_EVEX_0F386D, X86_64_EVEX_0F3A07, X86_64_EVEX_0F3A77. * i386-dis-evex.h: Ditto. * i386-dis.c (EVEX_LEN_0F384A_X86_64_W_0): New. (EVEX_LEN_0F386D_X86_64_W_0): Ditto. (EVEX_LEN_0F3A07_X86_64_W_0): Ditto. (EVEX_LEN_0F3A77_X86_64_W_0): Ditto. (MOD_EVEX_0F384A_X86_64_W_0): Ditto. (MOD_EVEX_0F386D_X86_64_W_0): Ditto. (MOD_EVEX_0F3A07_X86_64_W_0): Ditto. (MOD_EVEX_0F3A77_X86_64_W_0): Ditto. (PREFIX_EVEX_0F384A_W_0_L_2): Ditto. (PREFIX_EVEX_0F386D_W_0_L_2): Ditto. (PREFIX_EVEX_0F3A07_W_0_L_2): Ditto. (PREFIX_EVEX_0F3A77_W_0_L_2): Ditto. (EVEX_W_0F384A_X86_64): Ditto.
(EVEX_W_0F386D_X86_64): Ditto. (EVEX_W_0F3A07_X86_64): Ditto. (EVEX_W_0F3A77_X86_64): Ditto. (X86_64_EVEX_0F384A): Ditto. (X86_64_EVEX_0F386D): Ditto. (X86_64_EVEX_0F3A07): Ditto. (X86_64_EVEX_0F3A77): Ditto. (OP_VEX): Add handler for dq_mode under EVEX512. * i386-gen.c (cpu_flag_init): Add CPU_AMX_AVX512_FLAGS and CPU_ANY_AMX_AVX512_FLAGS. * i386-init.h: Regenerated. * i386-mnem.h: Ditto. * i386-opc.h (CpuAMX_AVX512): New. (i386_cpu_flags): Add cpuamx_avx512. * i386-opc.tbl: Add AMX-AVX512 instructions. * i386-tbl.h: Regenerated. --- gas/NEWS | 2 + gas/config/tc-i386.c | 1 + gas/doc/c-i386.texi | 4 +- gas/testsuite/gas/i386/amx-avx512-inval.l | 7 + gas/testsuite/gas/i386/amx-avx512-inval.s | 11 + gas/testsuite/gas/i386/i386.exp | 1 + .../gas/i386/x86-64-amx-avx512-intel.d | 35 + gas/testsuite/gas/i386/x86-64-amx-avx512.d | 34 + gas/testsuite/gas/i386/x86-64-amx-avx512.s | 55 + gas/testsuite/gas/i386/x86-64.exp | 2 + opcodes/i386-dis-evex-len.h | 28 + opcodes/i386-dis-evex-prefix.h | 27 + opcodes/i386-dis-evex-w.h | 16 + opcodes/i386-dis-evex-x86-64.h | 20 + opcodes/i386-dis-evex.h | 8 +- opcodes/i386-dis.c | 22 +- opcodes/i386-gen.c | 3 + opcodes/i386-init.h | 766 ++--- opcodes/i386-mnem.h | 2554 +++++++++-------- opcodes/i386-opc.h | 3 + opcodes/i386-opc.tbl | 15 + opcodes/i386-tbl.h | 416 ++- 22 files changed, 2244 insertions(+), 1786 deletions(-) create mode 100644 gas/testsuite/gas/i386/amx-avx512-inval.l create mode 100644 gas/testsuite/gas/i386/amx-avx512-inval.s create mode 100644 gas/testsuite/gas/i386/x86-64-amx-avx512-intel.d create mode 100644 gas/testsuite/gas/i386/x86-64-amx-avx512.d create mode 100644 gas/testsuite/gas/i386/x86-64-amx-avx512.s diff --git a/gas/NEWS b/gas/NEWS index 2a8f79b6360..9575dcdaaa1 100644 --- a/gas/NEWS +++ b/gas/NEWS @@ -1,5 +1,7 @@ -*- text -*- +* Add support for Intel AMX-AVX512 instructions. + * Add support for Intel AMX-TRANSPOSE instructions. * Add support for Intel MSR_IMM instructions. 
diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index c28d98109d3..9e54aae65fa 100644 --- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -1183,6 +1183,7 @@ static const arch_entry cpu_arch[] = SUBARCH (amx_fp16, AMX_FP16, ANY_AMX_FP16, false), SUBARCH (amx_complex, AMX_COMPLEX, ANY_AMX_COMPLEX, false), SUBARCH (amx_transpose, AMX_TRANSPOSE, ANY_AMX_TRANSPOSE, false), + SUBARCH (amx_avx512, AMX_AVX512, ANY_AMX_AVX512, false), SUBARCH (amx_tile, AMX_TILE, ANY_AMX_TILE, false), SUBARCH (movdiri, MOVDIRI, MOVDIRI, false), SUBARCH (movdir64b, MOVDIR64B, MOVDIR64B, false), diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi index 84bb875791d..dd2e422e323 100644 --- a/gas/doc/c-i386.texi +++ b/gas/doc/c-i386.texi @@ -229,6 +229,7 @@ accept various extension mnemonics. For example, @code{amx_fp16}, @code{amx_complex}, @code{amx_transpose}, +@code{amx_avx512}, @code{amx_tile}, @code{vmx}, @code{vmfunc}, @@ -1701,7 +1702,8 @@ supported on the CPU specified. The choices for @var{cpu_type} are: @item @samp{.shstk} @tab @samp{.gfni} @tab @samp{.vaes} @tab @samp{.vpclmulqdq} @item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk} @item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16} -@item @samp{.amx_complex} @tab @samp{.amx_transpose} @tab @samp{.amx_tile} +@item @samp{.amx_complex} @tab @samp{.amx_transpose} @tab @samp{.amx_avx512} +@item @samp{.amx_tile} @item @samp{.kl} @tab @samp{.widekl} @tab @samp{.uintr} @tab @samp{.hreset} @item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5} @item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme} diff --git a/gas/testsuite/gas/i386/amx-avx512-inval.l b/gas/testsuite/gas/i386/amx-avx512-inval.l new file mode 100644 index 00000000000..0cfe3a7bf2f --- /dev/null +++ b/gas/testsuite/gas/i386/amx-avx512-inval.l @@ -0,0 +1,7 @@ +.* Assembler messages: +.*:6: Error: `tcvtrowd2ps' is only supported in 64-bit mode +.*:7: Error: `tcvtrowps2pbf16h' is only 
supported in 64-bit mode +.*:8: Error: `tcvtrowps2pbf16l' is only supported in 64-bit mode +.*:9: Error: `tcvtrowps2phh' is only supported in 64-bit mode +.*:10: Error: `tcvtrowps2phl' is only supported in 64-bit mode +.*:11: Error: `tilemovrow' is only supported in 64-bit mode diff --git a/gas/testsuite/gas/i386/amx-avx512-inval.s b/gas/testsuite/gas/i386/amx-avx512-inval.s new file mode 100644 index 00000000000..2e7a6af2e60 --- /dev/null +++ b/gas/testsuite/gas/i386/amx-avx512-inval.s @@ -0,0 +1,11 @@ +# Check Illegal AMX-AVX512 instructions + + .allow_index_reg + .text +_start: + tcvtrowd2ps %edx, %tmm5, %zmm30 + tcvtrowps2pbf16h %edx, %tmm5, %zmm30 + tcvtrowps2pbf16l %edx, %tmm5, %zmm30 + tcvtrowps2phh %edx, %tmm5, %zmm30 + tcvtrowps2phl %edx, %tmm5, %zmm30 + tilemovrow %edx, %tmm5, %zmm30 diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp index 71abb0fc66c..acc1e2b9a63 100644 --- a/gas/testsuite/gas/i386/i386.exp +++ b/gas/testsuite/gas/i386/i386.exp @@ -547,6 +547,7 @@ if [gas_32_check] then { run_dump_test "avx10_2-256-miscs-intel" run_list_test "msr_imm-inval" run_list_test "amx-transpose-inval" + run_list_test "amx-avx512-inval" run_list_test "sg" run_dump_test "clzero" run_dump_test "invlpgb" diff --git a/gas/testsuite/gas/i386/x86-64-amx-avx512-intel.d b/gas/testsuite/gas/i386/x86-64-amx-avx512-intel.d new file mode 100644 index 00000000000..06ef2293bb9 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-avx512-intel.d @@ -0,0 +1,35 @@ +#objdump: -dw -Mintel +#name: x86_64 AMX-AVX512 insns (Intel disassembly) +#source: x86-64-amx-avx512.s + +.*: +file format .* + +Disassembly of section \.text: + +#... 
+[a-f0-9]+ <_intel>: +\s*[a-f0-9]+:\s*62 62 6e 48 4a f5\s+tcvtrowd2ps zmm30,tmm5,edx +\s*[a-f0-9]+:\s*62 62 6e 48 4a f2\s+tcvtrowd2ps zmm30,tmm2,edx +\s*[a-f0-9]+:\s*62 63 7e 48 07 f5 7b\s+tcvtrowd2ps zmm30,tmm5,0x7b +\s*[a-f0-9]+:\s*62 63 7e 48 07 f2 7b\s+tcvtrowd2ps zmm30,tmm2,0x7b +\s*[a-f0-9]+:\s*62 62 6f 48 6d f5\s+tcvtrowps2pbf16h zmm30,tmm5,edx +\s*[a-f0-9]+:\s*62 62 6f 48 6d f2\s+tcvtrowps2pbf16h zmm30,tmm2,edx +\s*[a-f0-9]+:\s*62 63 7f 48 07 f5 7b\s+tcvtrowps2pbf16h zmm30,tmm5,0x7b +\s*[a-f0-9]+:\s*62 63 7f 48 07 f2 7b\s+tcvtrowps2pbf16h zmm30,tmm2,0x7b +\s*[a-f0-9]+:\s*62 62 6e 48 6d f5\s+tcvtrowps2pbf16l zmm30,tmm5,edx +\s*[a-f0-9]+:\s*62 62 6e 48 6d f2\s+tcvtrowps2pbf16l zmm30,tmm2,edx +\s*[a-f0-9]+:\s*62 63 7e 48 77 f5 7b\s+tcvtrowps2pbf16l zmm30,tmm5,0x7b +\s*[a-f0-9]+:\s*62 63 7e 48 77 f2 7b\s+tcvtrowps2pbf16l zmm30,tmm2,0x7b +\s*[a-f0-9]+:\s*62 62 6c 48 6d f5\s+tcvtrowps2phh zmm30,tmm5,edx +\s*[a-f0-9]+:\s*62 62 6c 48 6d f2\s+tcvtrowps2phh zmm30,tmm2,edx +\s*[a-f0-9]+:\s*62 63 7c 48 07 f5 7b\s+tcvtrowps2phh zmm30,tmm5,0x7b +\s*[a-f0-9]+:\s*62 63 7c 48 07 f2 7b\s+tcvtrowps2phh zmm30,tmm2,0x7b +\s*[a-f0-9]+:\s*62 62 6d 48 6d f5\s+tcvtrowps2phl zmm30,tmm5,edx +\s*[a-f0-9]+:\s*62 62 6d 48 6d f2\s+tcvtrowps2phl zmm30,tmm2,edx +\s*[a-f0-9]+:\s*62 63 7f 48 77 f5 7b\s+tcvtrowps2phl zmm30,tmm5,0x7b +\s*[a-f0-9]+:\s*62 63 7f 48 77 f2 7b\s+tcvtrowps2phl zmm30,tmm2,0x7b +\s*[a-f0-9]+:\s*62 62 6d 48 4a f5\s+tilemovrow zmm30,tmm5,edx +\s*[a-f0-9]+:\s*62 62 6d 48 4a f2\s+tilemovrow zmm30,tmm2,edx +\s*[a-f0-9]+:\s*62 63 7d 48 07 f5 7b\s+tilemovrow zmm30,tmm5,0x7b +\s*[a-f0-9]+:\s*62 63 7d 48 07 f2 7b\s+tilemovrow zmm30,tmm2,0x7b +#pass diff --git a/gas/testsuite/gas/i386/x86-64-amx-avx512.d b/gas/testsuite/gas/i386/x86-64-amx-avx512.d new file mode 100644 index 00000000000..410588d494e --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-avx512.d @@ -0,0 +1,34 @@ +#objdump: -dw +#name: x86_64 AMX-AVX512 insns +#source: x86-64-amx-avx512.s + +.*: +file format .* + 
+Disassembly of section \.text: + +0+ <_start>: +\s*[a-f0-9]+:\s*62 62 6e 48 4a f5\s+tcvtrowd2ps %edx,%tmm5,%zmm30 +\s*[a-f0-9]+:\s*62 62 6e 48 4a f2\s+tcvtrowd2ps %edx,%tmm2,%zmm30 +\s*[a-f0-9]+:\s*62 63 7e 48 07 f5 7b\s+tcvtrowd2ps \$0x7b,%tmm5,%zmm30 +\s*[a-f0-9]+:\s*62 63 7e 48 07 f2 7b\s+tcvtrowd2ps \$0x7b,%tmm2,%zmm30 +\s*[a-f0-9]+:\s*62 62 6f 48 6d f5\s+tcvtrowps2pbf16h %edx,%tmm5,%zmm30 +\s*[a-f0-9]+:\s*62 62 6f 48 6d f2\s+tcvtrowps2pbf16h %edx,%tmm2,%zmm30 +\s*[a-f0-9]+:\s*62 63 7f 48 07 f5 7b\s+tcvtrowps2pbf16h \$0x7b,%tmm5,%zmm30 +\s*[a-f0-9]+:\s*62 63 7f 48 07 f2 7b\s+tcvtrowps2pbf16h \$0x7b,%tmm2,%zmm30 +\s*[a-f0-9]+:\s*62 62 6e 48 6d f5\s+tcvtrowps2pbf16l %edx,%tmm5,%zmm30 +\s*[a-f0-9]+:\s*62 62 6e 48 6d f2\s+tcvtrowps2pbf16l %edx,%tmm2,%zmm30 +\s*[a-f0-9]+:\s*62 63 7e 48 77 f5 7b\s+tcvtrowps2pbf16l \$0x7b,%tmm5,%zmm30 +\s*[a-f0-9]+:\s*62 63 7e 48 77 f2 7b\s+tcvtrowps2pbf16l \$0x7b,%tmm2,%zmm30 +\s*[a-f0-9]+:\s*62 62 6c 48 6d f5\s+tcvtrowps2phh %edx,%tmm5,%zmm30 +\s*[a-f0-9]+:\s*62 62 6c 48 6d f2\s+tcvtrowps2phh %edx,%tmm2,%zmm30 +\s*[a-f0-9]+:\s*62 63 7c 48 07 f5 7b\s+tcvtrowps2phh \$0x7b,%tmm5,%zmm30 +\s*[a-f0-9]+:\s*62 63 7c 48 07 f2 7b\s+tcvtrowps2phh \$0x7b,%tmm2,%zmm30 +\s*[a-f0-9]+:\s*62 62 6d 48 6d f5\s+tcvtrowps2phl %edx,%tmm5,%zmm30 +\s*[a-f0-9]+:\s*62 62 6d 48 6d f2\s+tcvtrowps2phl %edx,%tmm2,%zmm30 +\s*[a-f0-9]+:\s*62 63 7f 48 77 f5 7b\s+tcvtrowps2phl \$0x7b,%tmm5,%zmm30 +\s*[a-f0-9]+:\s*62 63 7f 48 77 f2 7b\s+tcvtrowps2phl \$0x7b,%tmm2,%zmm30 +\s*[a-f0-9]+:\s*62 62 6d 48 4a f5\s+tilemovrow %edx,%tmm5,%zmm30 +\s*[a-f0-9]+:\s*62 62 6d 48 4a f2\s+tilemovrow %edx,%tmm2,%zmm30 +\s*[a-f0-9]+:\s*62 63 7d 48 07 f5 7b\s+tilemovrow \$0x7b,%tmm5,%zmm30 +\s*[a-f0-9]+:\s*62 63 7d 48 07 f2 7b\s+tilemovrow \$0x7b,%tmm2,%zmm30 +#pass diff --git a/gas/testsuite/gas/i386/x86-64-amx-avx512.s b/gas/testsuite/gas/i386/x86-64-amx-avx512.s new file mode 100644 index 00000000000..0faecfde820 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-avx512.s @@ -0,0 
+1,55 @@ +# Check 64bit AMX-AVX512 instructions + + .text +_start: + tcvtrowd2ps %edx, %tmm5, %zmm30 + tcvtrowd2ps %edx, %tmm2, %zmm30 + tcvtrowd2ps $123, %tmm5, %zmm30 + tcvtrowd2ps $123, %tmm2, %zmm30 + tcvtrowps2pbf16h %edx, %tmm5, %zmm30 + tcvtrowps2pbf16h %edx, %tmm2, %zmm30 + tcvtrowps2pbf16h $123, %tmm5, %zmm30 + tcvtrowps2pbf16h $123, %tmm2, %zmm30 + tcvtrowps2pbf16l %edx, %tmm5, %zmm30 + tcvtrowps2pbf16l %edx, %tmm2, %zmm30 + tcvtrowps2pbf16l $123, %tmm5, %zmm30 + tcvtrowps2pbf16l $123, %tmm2, %zmm30 + tcvtrowps2phh %edx, %tmm5, %zmm30 + tcvtrowps2phh %edx, %tmm2, %zmm30 + tcvtrowps2phh $123, %tmm5, %zmm30 + tcvtrowps2phh $123, %tmm2, %zmm30 + tcvtrowps2phl %edx, %tmm5, %zmm30 + tcvtrowps2phl %edx, %tmm2, %zmm30 + tcvtrowps2phl $123, %tmm5, %zmm30 + tcvtrowps2phl $123, %tmm2, %zmm30 + tilemovrow %edx, %tmm5, %zmm30 + tilemovrow %edx, %tmm2, %zmm30 + tilemovrow $123, %tmm5, %zmm30 + tilemovrow $123, %tmm2, %zmm30 + +_intel: + .intel_syntax noprefix + tcvtrowd2ps zmm30, tmm5, edx + tcvtrowd2ps zmm30, tmm2, edx + tcvtrowd2ps zmm30, tmm5, 123 + tcvtrowd2ps zmm30, tmm2, 123 + tcvtrowps2pbf16h zmm30, tmm5, edx + tcvtrowps2pbf16h zmm30, tmm2, edx + tcvtrowps2pbf16h zmm30, tmm5, 123 + tcvtrowps2pbf16h zmm30, tmm2, 123 + tcvtrowps2pbf16l zmm30, tmm5, edx + tcvtrowps2pbf16l zmm30, tmm2, edx + tcvtrowps2pbf16l zmm30, tmm5, 123 + tcvtrowps2pbf16l zmm30, tmm2, 123 + tcvtrowps2phh zmm30, tmm5, edx + tcvtrowps2phh zmm30, tmm2, edx + tcvtrowps2phh zmm30, tmm5, 123 + tcvtrowps2phh zmm30, tmm2, 123 + tcvtrowps2phl zmm30, tmm5, edx + tcvtrowps2phl zmm30, tmm2, edx + tcvtrowps2phl zmm30, tmm5, 123 + tcvtrowps2phl zmm30, tmm2, 123 + tilemovrow zmm30, tmm5, edx + tilemovrow zmm30, tmm2, edx + tilemovrow zmm30, tmm5, 123 + tilemovrow zmm30, tmm2, 123 diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp index 1d54d7700e2..131e598e02a 100644 --- a/gas/testsuite/gas/i386/x86-64.exp +++ b/gas/testsuite/gas/i386/x86-64.exp @@ -527,6 +527,8 @@ 
run_list_test "x86-64-msr_imm-inval" run_dump_test "x86-64-amx-transpose" run_dump_test "x86-64-amx-transpose-intel" run_list_test "x86-64-amx-transpose-inval" +run_dump_test "x86-64-amx-avx512" +run_dump_test "x86-64-amx-avx512-intel" run_dump_test "x86-64-clzero" run_dump_test "x86-64-mwaitx-bdver4" run_list_test "x86-64-mwaitx-reg" diff --git a/opcodes/i386-dis-evex-len.h b/opcodes/i386-dis-evex-len.h index 24cc7b2e027..276749e1d54 100644 --- a/opcodes/i386-dis-evex-len.h +++ b/opcodes/i386-dis-evex-len.h @@ -44,6 +44,13 @@ static const struct dis386 evex_len_table[][3] = { { "vperm%DQ", { XM, Vex, EXx }, PREFIX_DATA }, }, + /* EVEX_LEN_0F384A_X86_64_W_0 */ + { + { Bad_Opcode }, + { Bad_Opcode }, + { PREFIX_TABLE (PREFIX_EVEX_0F384A_X86_64_W_0_L_2) }, + }, + /* EVEX_LEN_0F385A */ { { Bad_Opcode }, @@ -58,6 +65,13 @@ static const struct dis386 evex_len_table[][3] = { { VEX_W_TABLE (EVEX_W_0F385B_L_2) }, }, + /* EVEX_LEN_0F386D_X86_64_W_0_M_1 */ + { + { Bad_Opcode }, + { Bad_Opcode }, + { PREFIX_TABLE (PREFIX_EVEX_0F386D_X86_64_W_0_L_2) }, + }, + /* EVEX_LEN_0F38C6 */ { { Bad_Opcode }, @@ -86,6 +100,13 @@ static const struct dis386 evex_len_table[][3] = { { VEX_W_TABLE (VEX_W_0F3A01_L_1) }, }, + /* EVEX_LEN_0F3A07_X86_64_W_0 */ + { + { Bad_Opcode }, + { Bad_Opcode }, + { PREFIX_TABLE (PREFIX_EVEX_0F3A07_X86_64_W_0_L_2) }, + }, + /* EVEX_LEN_0F3A18 */ { { Bad_Opcode }, @@ -156,6 +177,13 @@ static const struct dis386 evex_len_table[][3] = { { VEX_W_TABLE (EVEX_W_0F3A43_L_n) }, }, + /* EVEX_LEN_0F3A77_X86_64_W_0 */ + { + { Bad_Opcode }, + { Bad_Opcode }, + { PREFIX_TABLE (PREFIX_EVEX_0F3A77_X86_64_W_0_L_2) }, + }, + /* EVEX_LEN_MAP5_6E */ { { PREFIX_TABLE (PREFIX_EVEX_MAP5_6E_L_0) }, diff --git a/opcodes/i386-dis-evex-prefix.h b/opcodes/i386-dis-evex-prefix.h index 55d6d806ccb..a559b0b7c62 100644 --- a/opcodes/i386-dis-evex-prefix.h +++ b/opcodes/i386-dis-evex-prefix.h @@ -243,6 +243,12 @@ { VEX_W_TABLE (EVEX_W_0F383A_P_1) }, { "%XEvpminuw", { XM, Vex, EXx }, 0 }, }, 
+ /* PREFIX_EVEX_0F384A_W_0_L_2 */ + { + { Bad_Opcode }, + { "tcvtrowd2ps", { XM, Rtmm, VexGd }, 0 }, + { "tilemovrow", { XM, Rtmm, VexGd }, 0 }, + }, /* PREFIX_EVEX_0F3852 */ { { "vdpphp%XS", { XM, Vex, EXx }, 0 }, @@ -264,6 +270,13 @@ { Bad_Opcode }, { "vp2intersectY%DQ", { MaskG, Vex, EXx, EXxEVexS }, 0 }, }, + /* PREFIX_EVEX_0F386D_W_0_L_2 */ + { + { "tcvtrowps2phh", { XM, Rtmm, VexGd }, 0 }, + { "tcvtrowps2pbf16l", { XM, Rtmm, VexGd }, 0 }, + { "tcvtrowps2phl", { XM, Rtmm, VexGd }, 0 }, + { "tcvtrowps2pbf16h", { XM, Rtmm, VexGd }, 0 }, + }, /* PREFIX_EVEX_0F3872 */ { { Bad_Opcode }, @@ -306,6 +319,13 @@ { "%XEvfmsub213s%XW", { XMScalar, VexScalar, EXdq, EXxEVexR }, 0 }, { "v4fnmadds%XS", { XMScalar, VexScalar, Mxmm }, 0 }, }, + /* PREFIX_EVEX_0F3A07_W_0_L_2 */ + { + { "tcvtrowps2phh", { XM, Rtmm, Ib }, 0 }, + { "tcvtrowd2ps", { XM, Rtmm, Ib }, 0 }, + { "tilemovrow", { XM, Rtmm, Ib }, 0 }, + { "tcvtrowps2pbf16h", { XM, Rtmm, Ib }, 0 }, + }, /* PREFIX_EVEX_0F3A08 */ { { "vrndscalep%XH", { XM, EXxh, EXxEVexS, Ib }, 0 }, @@ -377,6 +397,13 @@ { Bad_Opcode }, { "vfpclasss%XW", { MaskG, EXdq, Ib }, 0 }, }, + /* PREFIX_EVEX_0F3A77_W_0_L_2 */ + { + { Bad_Opcode }, + { "tcvtrowps2pbf16l", { XM, Rtmm, Ib }, 0 }, + { Bad_Opcode }, + { "tcvtrowps2phl", { XM, Rtmm, Ib }, 0 }, + }, /* PREFIX_EVEX_0F3AC2 */ { { "vcmpp%XH", { MaskG, Vex, EXxh, EXxEVexS, CMP }, 0 }, diff --git a/opcodes/i386-dis-evex-w.h b/opcodes/i386-dis-evex-w.h index 36b8150cd2c..70f65dab96e 100644 --- a/opcodes/i386-dis-evex-w.h +++ b/opcodes/i386-dis-evex-w.h @@ -336,6 +336,10 @@ { { "vpbroadcastmw2dY", { XM, MaskR }, 0 }, }, + /* EVEX_W_0F384A_X86_64 */ + { + { EVEX_LEN_TABLE (EVEX_LEN_0F384A_X86_64_W_0) }, + }, /* EVEX_W_0F3859 */ { { "vbroadcasti32x2", { XM, EXq }, PREFIX_DATA }, @@ -351,6 +355,10 @@ { "vbroadcasti32x8", { XM, Mymm }, PREFIX_DATA }, { "vbroadcasti64x4", { XM, Mymm }, PREFIX_DATA }, }, + /* EVEX_W_0F386D_X86_64 */ + { + { EVEX_LEN_TABLE (EVEX_LEN_0F386D_X86_64_W_0) }, + }, /* 
EVEX_W_0F3870 */ { { Bad_Opcode }, @@ -374,6 +382,10 @@ { Bad_Opcode }, { "vpmultishiftqb", { XM, Vex, EXx }, PREFIX_DATA }, }, + /* EVEX_W_0F3A07_X86_64 */ + { + { EVEX_LEN_TABLE (EVEX_LEN_0F3A07_X86_64_W_0) }, + }, /* EVEX_W_0F3A18_L_n */ { { "vinsertf32x4", { XM, Vex, EXxmm, Ib }, PREFIX_DATA }, @@ -442,6 +454,10 @@ { Bad_Opcode }, { "vpshrdw", { XM, Vex, EXx, Ib }, 0 }, }, + /* EVEX_W_0F3A77_X86_64 */ + { + { EVEX_LEN_TABLE (EVEX_LEN_0F3A77_X86_64_W_0) }, + }, /* EVEX_W_MAP4_8F_R_0 */ { { "pop2", { { PUSH2_POP2_Fixup, q_mode}, Eq }, NO_PREFIX }, diff --git a/opcodes/i386-dis-evex-x86-64.h b/opcodes/i386-dis-evex-x86-64.h index 4e52607d306..21bf3bf5e5d 100644 --- a/opcodes/i386-dis-evex-x86-64.h +++ b/opcodes/i386-dis-evex-x86-64.h @@ -1,3 +1,23 @@ + /* X86_64_EVEX_0F384A */ + { + { Bad_Opcode }, + { VEX_W_TABLE (EVEX_W_0F384A_X86_64) }, + }, + /* X86_64_EVEX_0F386D */ + { + { Bad_Opcode }, + { VEX_W_TABLE (EVEX_W_0F386D_X86_64) }, + }, + /* X86_64_EVEX_0F3A07 */ + { + { Bad_Opcode }, + { VEX_W_TABLE (EVEX_W_0F3A07_X86_64) }, + }, + /* X86_64_EVEX_0F3A77 */ + { + { Bad_Opcode }, + { VEX_W_TABLE (EVEX_W_0F3A77_X86_64) }, + }, /* X86_64_EVEX_MAP5_6C_W_1_P_1 */ { { Bad_Opcode }, diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h index d42b5af7b53..130b1da0272 100644 --- a/opcodes/i386-dis-evex.h +++ b/opcodes/i386-dis-evex.h @@ -376,7 +376,7 @@ static const struct dis386 evex_table[][256] = { /* 48 */ { Bad_Opcode }, { X86_64_EVEX_MEM_W_TABLE (VEX_W_0F3849_X86_64_L_0) }, - { Bad_Opcode }, + { X86_64_TABLE (X86_64_EVEX_0F384A) }, { X86_64_EVEX_MEM_W_TABLE (VEX_W_0F384B_X86_64_L_0) }, { "vrcp14p%XW", { XM, EXx }, PREFIX_DATA }, { "vrcp14s%XW", { XMScalar, VexScalar, EXdq }, PREFIX_DATA }, @@ -415,7 +415,7 @@ static const struct dis386 evex_table[][256] = { { Bad_Opcode }, { Bad_Opcode }, { Bad_Opcode }, - { Bad_Opcode }, + { X86_64_TABLE (X86_64_EVEX_0F386D) }, { Bad_Opcode }, { Bad_Opcode }, /* 70 */ @@ -591,7 +591,7 @@ static const struct dis386 
evex_table[][256] = { { VEX_W_TABLE (VEX_W_0F3A04) }, { "%XEvpermilp%XD", { XM, EXx, Ib }, PREFIX_DATA }, { Bad_Opcode }, - { Bad_Opcode }, + { X86_64_TABLE (X86_64_EVEX_0F3A07) }, /* 08 */ { PREFIX_TABLE (PREFIX_EVEX_0F3A08) }, { "vrndscalep%XD", { XM, EXx, EXxEVexS, Ib }, PREFIX_DATA }, @@ -717,7 +717,7 @@ static const struct dis386 evex_table[][256] = { { Bad_Opcode }, { Bad_Opcode }, { Bad_Opcode }, - { Bad_Opcode }, + { X86_64_TABLE (X86_64_EVEX_0F3A77) }, /* 78 */ { Bad_Opcode }, { Bad_Opcode }, diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c index 2095bb65196..8f651f7a06f 100644 --- a/opcodes/i386-dis.c +++ b/opcodes/i386-dis.c @@ -592,6 +592,7 @@ fetch_error (const instr_info *ins) #define VexGatherD { OP_VEX, vex_vsib_d_w_dq_mode } #define VexGatherQ { OP_VEX, vex_vsib_q_w_dq_mode } #define VexGdq { OP_VEX, dq_mode } +#define VexGd { OP_VEX, d_mode } #define VexGb { OP_VEX, b_mode } #define VexGv { OP_VEX, v_mode } #define VexTmm { OP_VEX, tmm_mode } @@ -1200,9 +1201,11 @@ enum PREFIX_EVEX_0F3838, PREFIX_EVEX_0F3839, PREFIX_EVEX_0F383A, + PREFIX_EVEX_0F384A_X86_64_W_0_L_2, PREFIX_EVEX_0F3852, PREFIX_EVEX_0F3853, PREFIX_EVEX_0F3868, + PREFIX_EVEX_0F386D_X86_64_W_0_L_2, PREFIX_EVEX_0F3872, PREFIX_EVEX_0F3874, PREFIX_EVEX_0F389A, @@ -1210,6 +1213,7 @@ enum PREFIX_EVEX_0F38AA, PREFIX_EVEX_0F38AB, + PREFIX_EVEX_0F3A07_X86_64_W_0_L_2, PREFIX_EVEX_0F3A08, PREFIX_EVEX_0F3A0A, PREFIX_EVEX_0F3A26, @@ -1221,6 +1225,7 @@ enum PREFIX_EVEX_0F3A57, PREFIX_EVEX_0F3A66, PREFIX_EVEX_0F3A67, + PREFIX_EVEX_0F3A77_X86_64_W_0_L_2, PREFIX_EVEX_0F3AC2, PREFIX_EVEX_MAP4_4x, @@ -1362,7 +1367,12 @@ enum X86_64_VEX_MAP7_F6_L_0_W_0_R_0, X86_64_VEX_MAP7_F8_L_0_W_0_R_0, - + + X86_64_EVEX_0F384A, + X86_64_EVEX_0F386D, + X86_64_EVEX_0F3A07, + X86_64_EVEX_0F3A77, + X86_64_EVEX_MAP5_6C_W_1_P_1, X86_64_EVEX_MAP5_6C_W_1_P_3, X86_64_EVEX_MAP5_6D_W_1_P_1, @@ -1555,12 +1565,15 @@ enum EVEX_LEN_0F381A, EVEX_LEN_0F381B, EVEX_LEN_0F3836, + EVEX_LEN_0F384A_X86_64_W_0, EVEX_LEN_0F385A, 
EVEX_LEN_0F385B, + EVEX_LEN_0F386D_X86_64_W_0, EVEX_LEN_0F38C6, EVEX_LEN_0F38C7, EVEX_LEN_0F3A00, EVEX_LEN_0F3A01, + EVEX_LEN_0F3A07_X86_64_W_0, EVEX_LEN_0F3A18, EVEX_LEN_0F3A19, EVEX_LEN_0F3A1A, @@ -1571,6 +1584,7 @@ enum EVEX_LEN_0F3A3A, EVEX_LEN_0F3A3B, EVEX_LEN_0F3A43, + EVEX_LEN_0F3A77_X86_64_W_0, EVEX_LEN_MAP5_6E, EVEX_LEN_MAP5_7E, @@ -1779,15 +1793,18 @@ enum EVEX_W_0F3835_P_2, EVEX_W_0F3837, EVEX_W_0F383A_P_1, + EVEX_W_0F384A_X86_64, EVEX_W_0F3859, EVEX_W_0F385A_L_n, EVEX_W_0F385B_L_2, + EVEX_W_0F386D_X86_64, EVEX_W_0F3870, EVEX_W_0F3872_P_2, EVEX_W_0F387A, EVEX_W_0F387B, EVEX_W_0F3883, + EVEX_W_0F3A07_X86_64, EVEX_W_0F3A18_L_n, EVEX_W_0F3A19_L_n, EVEX_W_0F3A1A_L_2, @@ -1802,6 +1819,7 @@ enum EVEX_W_0F3A43_L_n, EVEX_W_0F3A70, EVEX_W_0F3A72, + EVEX_W_0F3A77_X86_64, EVEX_W_MAP4_8F_R_0, EVEX_W_MAP4_F8_P1_M_1, @@ -13931,6 +13949,8 @@ OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED) case 512: names = att_names_zmm; ins->evex_used |= EVEX_len_used; + if (bytemode == d_mode) + names = att_names32; break; default: abort (); diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c index be05b1be817..168dc565a60 100644 --- a/opcodes/i386-gen.c +++ b/opcodes/i386-gen.c @@ -265,6 +265,8 @@ static const dependency isa_dependencies[] = "AMX_TILE" }, { "AMX_TRANSPOSE", "AMX_TILE" }, + { "AMX_AVX512", + "AMX_TILE|AVX10_2" }, { "KL", "SSE2" }, { "WIDEKL", @@ -432,6 +434,7 @@ static bitfield cpu_flags[] = BITFIELD (AMX_FP16), BITFIELD (AMX_COMPLEX), BITFIELD (AMX_TRANSPOSE), + BITFIELD (AMX_AVX512), BITFIELD (AMX_TILE), BITFIELD (MOVDIRI), BITFIELD (MOVDIR64B), diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h index fd11f9f0cd8..91972954966 100644 --- a/opcodes/i386-opc.h +++ b/opcodes/i386-opc.h @@ -252,6 +252,8 @@ enum i386_cpu CpuAMX_FP16, /* AMX-COMPLEX instructions required. */ CpuAMX_COMPLEX, + /* Intel AMX-AVX512 Instructions support required. 
*/ + CpuAMX_AVX512, /* AMX-TILE instructions required */ CpuAMX_TILE, /* GFNI instructions required */ @@ -500,6 +502,7 @@ typedef union i386_cpu_flags unsigned int cpuamx_bf16:1; unsigned int cpuamx_fp16:1; unsigned int cpuamx_complex:1; + unsigned int cpuamx_avx512:1; unsigned int cpuamx_tile:1; unsigned int cpugfni:1; unsigned int cpuvaes:1; diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl index d8f2a180ba7..d17765aa0af 100644 --- a/opcodes/i386-opc.tbl +++ b/opcodes/i386-opc.tbl @@ -3204,6 +3204,19 @@ tconjtcmmimfp16ps, 0x6b, AMX_COMPLEX&AMX_TRANSPOSE, Modrm|Vex128|Space0F38|Src2V tconjtfp16, 0x666b, AMX_COMPLEX&AMX_TRANSPOSE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM, RegTMM } +tcvtrowd2ps, 0xf34a, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM } +tcvtrowd2ps, 0xf307, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM } + +tcvtrowps2pbf16h, 0xf26d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM } +tcvtrowps2pbf16h, 0xf207, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM } +tcvtrowps2pbf16l, 0xf36d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM } +tcvtrowps2pbf16l, 0xf377, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM } + +tcvtrowps2phh, 0x6d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM } +tcvtrowps2phh, 0x07, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM } +tcvtrowps2phl, 0x666d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM } +tcvtrowps2phl, 0xf277, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM } + tdpbf16ps, 0xf35c, AMX_BF16, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } tdpfp16ps, 0xf25c, AMX_FP16, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } tdpbssd, 0xf25e, AMX_INT8, 
Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } @@ -3213,6 +3226,8 @@ tdpbsud, 0xf35e, AMX_INT8, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM tileloadd, 0xf24b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM } tileloaddt1, 0x664b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM } +tilemovrow, 0x664a, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM } +tilemovrow, 0x6607, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM } tilestored, 0xf34b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { RegTMM, Unspecified|BaseIndex } tilerelease, 0x49c0, AMX_TILE, Vex128|Space0F38|VexW0|NoSuf, {} From patchwork Wed Nov 13 08:44:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haochen Jiang X-Patchwork-Id: 100950 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 30A733858424 for ; Wed, 13 Nov 2024 08:45:31 +0000 (GMT) X-Original-To: binutils@sourceware.org Delivered-To: binutils@sourceware.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.17]) by sourceware.org (Postfix) with ESMTPS id 467F03858D29 for ; Wed, 13 Nov 2024 08:44:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 467F03858D29 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 467F03858D29 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.17 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1731487491; cv=none; 
b=fIi58RJyQQHbzt75i4xaajGfWC99k+Nhl4UFpCgMnlkjv7Htj1SqTLapoRVwSCc2Cz2vkYG7HOSExjOIWO2mjRCX9PTiO5fgkzDqIs4O/w7vluzZaghpRz1HySGUNSHqBnephSLqWakkWrg7RqS0eKeDfN98vIOC3peSBKJWMDI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1731487491; c=relaxed/simple; bh=/UlemIV5RFqZo4ktl6wLCZw4JLXLdHFu1UVRhF6FARQ=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=CDmwQG3QIERB0wAlMdtguPwEFHhYVrMpELf1KMoIuceL63ZL46O80L0UTfKCFYWeYi2JgqS0H7MBHgq5qEX4uCR3yLluQk70UYftDQzxvDXR6VI+9Jb7ZzMUbKbzASEpADgMGiynwJQNyH4eTBuTmSfPiCWtuu9sTxUN6PMDzBM= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1731487486; x=1763023486; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=/UlemIV5RFqZo4ktl6wLCZw4JLXLdHFu1UVRhF6FARQ=; b=CF+NQoCSCoeS/dGUQ4NWm9Z1wdWKNJ5JrXj8fFXEgCSsnFwvky+eNmSM 7auboRzKvnsFHwHGfBjp+dtCGW2oxnEg2vsiL7CTVUePpPtJrhE3Md679 6+GePRUDxUORfgrthic+zMmlHOUys73rXeGRZIqU7ptRjTHplxeJbTa4Z vwXXCsuCGJPIIc/1aKa9hXt77aZkFgrGsWLI4d7ZRhgD3zCmfcuRG4eTn GdjILqWC+cV7ns8vqn00Cb1+TTFJyQkb0FwMDR6k1k0ch1Cv2R53vVbNp j9PawmIIcs0FvIunLU0p3Cjfivot0ACRV3kPmDufz2dfDX3Dvgc3ov/KE Q==; X-CSE-ConnectionGUID: rAWZpPAXSBG+q89UwOH+ng== X-CSE-MsgGUID: XZZooCz1Rwiv92rGn73G8g== X-IronPort-AV: E=McAfee;i="6700,10204,11254"; a="31458910" X-IronPort-AV: E=Sophos;i="6.12,150,1728975600"; d="scan'208";a="31458910" Received: from fmviesa007.fm.intel.com ([10.60.135.147]) by orvoesa109.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Nov 2024 00:44:44 -0800 X-CSE-ConnectionGUID: VSPbY0/aQBylhwxGvs3yQA== X-CSE-MsgGUID: dNqVxq/PQAeZmOB6H1yFsg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,150,1728975600"; d="scan'208";a="87545463" Received: from shliclel4217.sh.intel.com ([10.239.240.127]) by fmviesa007.fm.intel.com with ESMTP; 13 Nov 2024 00:44:43 -0800 From: Haochen Jiang To: binutils@sourceware.org Cc: jbeulich@suse.com, 
hjl.tools@intel.com Subject: [PATCH 3/6] Support Intel AMX-TF32 Date: Wed, 13 Nov 2024 16:44:32 +0800 Message-Id: <20241113084435.1784546-4-haochen.jiang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20241113084435.1784546-1-haochen.jiang@intel.com> References: <20241113084435.1784546-1-haochen.jiang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_NUMSUBJECT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: binutils@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Binutils mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: binutils-bounces~patchwork=sourceware.org@sourceware.org This patch adds support for AMX-TF32. It is a simple ISA compared to the previous ones, so no special handling is needed. gas/ChangeLog: * NEWS: Support Intel AMX-TF32. * config/tc-i386.c: Add amx_tf32. * doc/c-i386.texi: Document .amx_tf32. * testsuite/gas/i386/i386.exp: Run AMX-TF32 tests. * testsuite/gas/i386/x86-64.exp: Ditto. * testsuite/gas/i386/amx-tf32-inval.l: New test. * testsuite/gas/i386/amx-tf32-inval.s: Ditto. * testsuite/gas/i386/x86-64-amx-tf32-intel.d: Ditto. * testsuite/gas/i386/x86-64-amx-tf32-inval.l: Ditto. * testsuite/gas/i386/x86-64-amx-tf32-inval.s: Ditto. * testsuite/gas/i386/x86-64-amx-tf32.d: Ditto. * testsuite/gas/i386/x86-64-amx-tf32.s: Ditto. opcodes/ChangeLog: * i386-dis.c (PREFIX_VEX_0F3848_X86_64_W_0_L_0): New. (X86_64_VEX_0F3848): Ditto. (VEX_LEN_0F3848_X86_64_W_0): Ditto. (VEX_W_0F3848_X86_64): Ditto. (prefix_table): Add PREFIX_VEX_0F3848_X86_64_W_0_L_0. (x86_64_table): Add X86_64_VEX_0F3848. (vex_len_table): Add VEX_LEN_0F3848_X86_64_W_0. (vex_w_table): Add VEX_W_0F3848_X86_64.
	* i386-gen.c (isa_dependencies): Add AMX_TF32.
	(cpu_flags): Ditto.
	* i386-init.h: Regenerated.
	* i386-mnem.h: Ditto.
	* i386-opc.h (CpuAMX_TF32): New.
	(i386_cpu_flags): Add cpuamx_tf32.
	* i386-opc.tbl: Add AMX-TF32 instructions.
	* i386-tbl.h: Regenerated.
---
 gas/NEWS                                 |    2 +
 gas/config/tc-i386.c                     |    1 +
 gas/doc/c-i386.texi                      |    3 +-
 gas/testsuite/gas/i386/amx-tf32-inval.l  |    3 +
 gas/testsuite/gas/i386/amx-tf32-inval.s  |    7 +
 gas/testsuite/gas/i386/i386.exp          |    1 +
 .../gas/i386/x86-64-amx-tf32-intel.d     |   15 +
 .../gas/i386/x86-64-amx-tf32-inval.l     |    7 +
 .../gas/i386/x86-64-amx-tf32-inval.s     |   11 +
 gas/testsuite/gas/i386/x86-64-amx-tf32.d |   13 +
 gas/testsuite/gas/i386/x86-64-amx-tf32.s |   15 +
 gas/testsuite/gas/i386/x86-64.exp        |    3 +
 opcodes/i386-dis.c                       |   28 +-
 opcodes/i386-gen.c                       |    3 +
 opcodes/i386-init.h                      |  736 +++++-----
 opcodes/i386-mnem.h                      | 1250 +++++++++--------
 opcodes/i386-opc.h                       |    3 +
 opcodes/i386-opc.tbl                     |    4 +
 opcodes/i386-tbl.h                       |  236 ++--
 19 files changed, 1252 insertions(+), 1089 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/amx-tf32-inval.l
 create mode 100644 gas/testsuite/gas/i386/amx-tf32-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-tf32-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-tf32-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-tf32-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-tf32.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-tf32.s

diff --git a/gas/NEWS b/gas/NEWS
index 9575dcdaaa1..56143b9b27e 100644
--- a/gas/NEWS
+++ b/gas/NEWS
@@ -1,5 +1,7 @@
 -*- text -*-
 
+* Add support for Intel AMX-TF32 instructions.
+
 * Add support for Intel AMX-AVX512 instructions.
 
 * Add support for Intel AMX-TRANSPOSE instructions.
diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index 9e54aae65fa..57c4285cc68 100644 --- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -1184,6 +1184,7 @@ static const arch_entry cpu_arch[] = SUBARCH (amx_complex, AMX_COMPLEX, ANY_AMX_COMPLEX, false), SUBARCH (amx_transpose, AMX_TRANSPOSE, ANY_AMX_TRANSPOSE, false), SUBARCH (amx_avx512, AMX_AVX512, ANY_AMX_AVX512, false), + SUBARCH (amx_tf32, AMX_TF32, ANY_AMX_TF32, false), SUBARCH (amx_tile, AMX_TILE, ANY_AMX_TILE, false), SUBARCH (movdiri, MOVDIRI, MOVDIRI, false), SUBARCH (movdir64b, MOVDIR64B, MOVDIR64B, false), diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi index dd2e422e323..bfadb9317e3 100644 --- a/gas/doc/c-i386.texi +++ b/gas/doc/c-i386.texi @@ -230,6 +230,7 @@ accept various extension mnemonics. For example, @code{amx_complex}, @code{amx_transpose}, @code{amx_avx512}, +@code{amx_tf32}, @code{amx_tile}, @code{vmx}, @code{vmfunc}, @@ -1703,7 +1704,7 @@ supported on the CPU specified. The choices for @var{cpu_type} are: @item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk} @item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16} @item @samp{.amx_complex} @tab @samp{.amx_transpose} @tab @samp{.amx_avx512} -@item @samp{.amx_tile} +@item @samp{.amx_tf32} @tab @samp{.amx_tile} @item @samp{.kl} @tab @samp{.widekl} @tab @samp{.uintr} @tab @samp{.hreset} @item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5} @item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme} diff --git a/gas/testsuite/gas/i386/amx-tf32-inval.l b/gas/testsuite/gas/i386/amx-tf32-inval.l new file mode 100644 index 00000000000..a13a3f6d35b --- /dev/null +++ b/gas/testsuite/gas/i386/amx-tf32-inval.l @@ -0,0 +1,3 @@ +.* Assembler messages: +.*:6: Error: `tmmultf32ps' is only supported in 64-bit mode +.*:7: Error: `ttmmultf32ps' is only supported in 64-bit mode diff --git a/gas/testsuite/gas/i386/amx-tf32-inval.s b/gas/testsuite/gas/i386/amx-tf32-inval.s new 
file mode 100644 index 00000000000..fd7fb025420 --- /dev/null +++ b/gas/testsuite/gas/i386/amx-tf32-inval.s @@ -0,0 +1,7 @@ +# Check Illegal AMX-TF32 instructions + + .allow_index_reg + .text +_start: + tmmultf32ps %tmm1, %tmm2, %tmm3 + ttmmultf32ps %tmm1, %tmm2, %tmm3 diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp index acc1e2b9a63..45e8adf7723 100644 --- a/gas/testsuite/gas/i386/i386.exp +++ b/gas/testsuite/gas/i386/i386.exp @@ -548,6 +548,7 @@ if [gas_32_check] then { run_list_test "msr_imm-inval" run_list_test "amx-transpose-inval" run_list_test "amx-avx512-inval" + run_list_test "amx-tf32-inval" run_list_test "sg" run_dump_test "clzero" run_dump_test "invlpgb" diff --git a/gas/testsuite/gas/i386/x86-64-amx-tf32-intel.d b/gas/testsuite/gas/i386/x86-64-amx-tf32-intel.d new file mode 100644 index 00000000000..cc9a1d34061 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-tf32-intel.d @@ -0,0 +1,15 @@ +#objdump: -dw -Mintel +#name: x86_64 AMX-TF32 insns (Intel disassembly) +#source: x86-64-amx-tf32.s + +.*: +file format .* + +Disassembly of section \.text: + +#... 
+[a-f0-9]+ <_intel>: +\s*[a-f0-9]+:\s*c4 e2 59 48 f5\s+tmmultf32ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e2 71 48 da\s+tmmultf32ps tmm3,tmm2,tmm1 +\s*[a-f0-9]+:\s*c4 e2 58 48 f5\s+ttmmultf32ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e2 70 48 da\s+ttmmultf32ps tmm3,tmm2,tmm1 +#pass diff --git a/gas/testsuite/gas/i386/x86-64-amx-tf32-inval.l b/gas/testsuite/gas/i386/x86-64-amx-tf32-inval.l new file mode 100644 index 00000000000..069513331b0 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-tf32-inval.l @@ -0,0 +1,7 @@ +.* Assembler messages: +.*:6: Error: all tmm registers must be distinct for `tmmultf32ps' +.*:7: Error: all tmm registers must be distinct for `tmmultf32ps' +.*:8: Error: all tmm registers must be distinct for `tmmultf32ps' +.*:9: Error: all tmm registers must be distinct for `ttmmultf32ps' +.*:10: Error: all tmm registers must be distinct for `ttmmultf32ps' +.*:11: Error: all tmm registers must be distinct for `ttmmultf32ps' diff --git a/gas/testsuite/gas/i386/x86-64-amx-tf32-inval.s b/gas/testsuite/gas/i386/x86-64-amx-tf32-inval.s new file mode 100644 index 00000000000..21a36dc9a82 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-tf32-inval.s @@ -0,0 +1,11 @@ +# Check Illegal 64bit AMX-TF32 instructions + + .allow_index_reg + .text +_start: + tmmultf32ps %tmm1, %tmm1, %tmm2 + tmmultf32ps %tmm1, %tmm2, %tmm1 + tmmultf32ps %tmm2, %tmm1, %tmm1 + ttmmultf32ps %tmm1, %tmm1, %tmm2 + ttmmultf32ps %tmm1, %tmm2, %tmm1 + ttmmultf32ps %tmm2, %tmm1, %tmm1 diff --git a/gas/testsuite/gas/i386/x86-64-amx-tf32.d b/gas/testsuite/gas/i386/x86-64-amx-tf32.d new file mode 100644 index 00000000000..4fa91cbc040 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-tf32.d @@ -0,0 +1,13 @@ +#objdump: -dw +#name: x86_64 AMX-TF32 insns + +.*: +file format .* + +Disassembly of section \.text: + +0+ <_start>: +\s*[a-f0-9]+:\s*c4 e2 59 48 f5\s+tmmultf32ps %tmm4,%tmm5,%tmm6 +\s*[a-f0-9]+:\s*c4 e2 71 48 da\s+tmmultf32ps %tmm1,%tmm2,%tmm3 +\s*[a-f0-9]+:\s*c4 e2 58 48 
f5\s+ttmmultf32ps %tmm4,%tmm5,%tmm6 +\s*[a-f0-9]+:\s*c4 e2 70 48 da\s+ttmmultf32ps %tmm1,%tmm2,%tmm3 +#pass diff --git a/gas/testsuite/gas/i386/x86-64-amx-tf32.s b/gas/testsuite/gas/i386/x86-64-amx-tf32.s new file mode 100644 index 00000000000..9c1433ed49b --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-tf32.s @@ -0,0 +1,15 @@ +# Check 64bit AMX-TF32 instructions + + .text +_start: + tmmultf32ps %tmm4, %tmm5, %tmm6 + tmmultf32ps %tmm1, %tmm2, %tmm3 + ttmmultf32ps %tmm4, %tmm5, %tmm6 + ttmmultf32ps %tmm1, %tmm2, %tmm3 + +_intel: + .intel_syntax noprefix + tmmultf32ps tmm6, tmm5, tmm4 + tmmultf32ps tmm3, tmm2, tmm1 + ttmmultf32ps tmm6, tmm5, tmm4 + ttmmultf32ps tmm3, tmm2, tmm1 diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp index 131e598e02a..9cb79eb0a4c 100644 --- a/gas/testsuite/gas/i386/x86-64.exp +++ b/gas/testsuite/gas/i386/x86-64.exp @@ -529,6 +529,9 @@ run_dump_test "x86-64-amx-transpose-intel" run_list_test "x86-64-amx-transpose-inval" run_dump_test "x86-64-amx-avx512" run_dump_test "x86-64-amx-avx512-intel" +run_dump_test "x86-64-amx-tf32" +run_dump_test "x86-64-amx-tf32-intel" +run_list_test "x86-64-amx-tf32-inval" run_dump_test "x86-64-clzero" run_dump_test "x86-64-mwaitx-bdver4" run_list_test "x86-64-mwaitx-reg" diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c index 8f651f7a06f..57f8246bf76 100644 --- a/opcodes/i386-dis.c +++ b/opcodes/i386-dis.c @@ -1132,6 +1132,7 @@ enum PREFIX_VEX_0F98_L_0_W_1, PREFIX_VEX_0F99_L_0_W_0, PREFIX_VEX_0F99_L_0_W_1, + PREFIX_VEX_0F3848_X86_64_L_0_W_0, PREFIX_VEX_0F3849_X86_64_L_0_W_0_M_0, PREFIX_VEX_0F3849_X86_64_L_0_W_0_M_1, PREFIX_VEX_0F384B_X86_64_L_0_W_0, @@ -1354,6 +1355,7 @@ enum X86_64_0F38F8_M_1, X86_64_0FC7_REG_6_MOD_3_PREFIX_1, + X86_64_VEX_0F3848, X86_64_VEX_0F3849, X86_64_VEX_0F384B, X86_64_VEX_0F385C, @@ -1446,6 +1448,7 @@ enum VEX_LEN_0F381A, VEX_LEN_0F3836, VEX_LEN_0F3841, + VEX_LEN_0F3848_X86_64, VEX_LEN_0F3849_X86_64, VEX_LEN_0F384B_X86_64, VEX_LEN_0F385A, @@ 
-1621,6 +1624,7 @@ enum VEX_W_0F382F, VEX_W_0F3836, VEX_W_0F3846, + VEX_W_0F3848_X86_64_L_0, VEX_W_0F3849_X86_64_L_0, VEX_W_0F384B_X86_64_L_0, VEX_W_0F3850, @@ -4087,6 +4091,13 @@ static const struct dis386 prefix_table[][4] = { { "ktestd", { MaskG, MaskR }, 0 }, }, + /* PREFIX_VEX_0F3848_X86_64_L_0_W_0 */ + { + { "ttmmultf32ps", { TMM, Rtmm, VexTmm }, 0 }, + { Bad_Opcode }, + { "tmmultf32ps", { TMM, Rtmm, VexTmm }, 0 }, + }, + /* PREFIX_VEX_0F3849_X86_64_L_0_W_0_M_0 */ { { "ldtilecfg", { M }, 0 }, @@ -4622,6 +4633,12 @@ static const struct dis386 x86_64_table[][2] = { { "senduipi", { Eq }, 0 }, }, + /* X86_64_VEX_0F3848 */ + { + { Bad_Opcode }, + { VEX_LEN_TABLE (VEX_LEN_0F3848_X86_64) }, + }, + /* X86_64_VEX_0F3849 */ { { Bad_Opcode }, @@ -6535,7 +6552,7 @@ static const struct dis386 vex_table[][256] = { { VEX_W_TABLE (VEX_W_0F3846) }, { "vpsllv%DQ", { XM, Vex, EXx }, PREFIX_DATA }, /* 48 */ - { Bad_Opcode }, + { X86_64_TABLE (X86_64_VEX_0F3848) }, { X86_64_TABLE (X86_64_VEX_0F3849) }, { Bad_Opcode }, { X86_64_TABLE (X86_64_VEX_0F384B) }, @@ -7215,6 +7232,11 @@ static const struct dis386 vex_len_table[][2] = { { "vphminposuw", { XM, EXx }, PREFIX_DATA }, }, + /* VEX_LEN_0F3848_X86_64 */ + { + { VEX_W_TABLE (VEX_W_0F3848_X86_64_L_0) }, + }, + /* VEX_LEN_0F3849_X86_64 */ { { VEX_W_TABLE (VEX_W_0F3849_X86_64_L_0) }, @@ -7901,6 +7923,10 @@ static const struct dis386 vex_w_table[][2] = { /* VEX_W_0F3846 */ { "vpsravd", { XM, Vex, EXx }, PREFIX_DATA }, }, + { + /* VEX_W_0F3848_X86_64_L_0 */ + { PREFIX_TABLE (PREFIX_VEX_0F3848_X86_64_L_0_W_0) }, + }, { /* VEX_W_0F3849_X86_64_L_0 */ { MOD_TABLE (MOD_VEX_0F3849_X86_64_L_0_W_0) }, diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c index 168dc565a60..90a6be46950 100644 --- a/opcodes/i386-gen.c +++ b/opcodes/i386-gen.c @@ -267,6 +267,8 @@ static const dependency isa_dependencies[] = "AMX_TILE" }, { "AMX_AVX512", "AMX_TILE|AVX10_2" }, + { "AMX_TF32", + "AMX_TILE" }, { "KL", "SSE2" }, { "WIDEKL", @@ -435,6 +437,7 @@ static 
bitfield cpu_flags[] = BITFIELD (AMX_COMPLEX), BITFIELD (AMX_TRANSPOSE), BITFIELD (AMX_AVX512), + BITFIELD (AMX_TF32), BITFIELD (AMX_TILE), BITFIELD (MOVDIRI), BITFIELD (MOVDIR64B), diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h index 91972954966..8d7879c8eb4 100644 --- a/opcodes/i386-opc.h +++ b/opcodes/i386-opc.h @@ -254,6 +254,8 @@ enum i386_cpu CpuAMX_COMPLEX, /* Intel AMX-AVX512 Instructions support required. */ CpuAMX_AVX512, + /* Intel AMX-TF32 Instructions support required. */ + CpuAMX_TF32, /* AMX-TILE instructions required */ CpuAMX_TILE, /* GFNI instructions required */ @@ -503,6 +505,7 @@ typedef union i386_cpu_flags unsigned int cpuamx_fp16:1; unsigned int cpuamx_complex:1; unsigned int cpuamx_avx512:1; + unsigned int cpuamx_tf32:1; unsigned int cpuamx_tile:1; unsigned int cpugfni:1; unsigned int cpuvaes:1; diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl index d17765aa0af..2a195c8bbb3 100644 --- a/opcodes/i386-opc.tbl +++ b/opcodes/i386-opc.tbl @@ -3234,12 +3234,16 @@ tilerelease, 0x49c0, AMX_TILE, Vex128|Space0F38|VexW0|NoSuf, {} tilezero, 0xf249, AMX_TILE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM } +tmmultf32ps, 0x6648, AMX_TF32, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } + ttcmmimfp16ps, 0xf26b, AMX_COMPLEX&AMX_TRANSPOSE, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } ttcmmrlfp16ps, 0xf36b, AMX_COMPLEX&AMX_TRANSPOSE, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } ttdpbf16ps, 0xf36c, AMX_BF16&AMX_TRANSPOSE, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } ttdpfp16ps, 0xf26c, AMX_FP16&AMX_TRANSPOSE, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } +ttmmultf32ps, 0x48, AMX_TF32&AMX_TRANSPOSE, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } + ttransposed, 0xf35f, AMX_TRANSPOSE, Modrm|Vex128|Space0F38|VexW0|NoSuf, { RegTMM, RegTMM } // AMX instructions end. 
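
For reference, the byte sequences expected by x86-64-amx-tf32.d above can be reproduced by hand from the two i386-opc.tbl entries (opcode 0x48 in the 0F38 space, VEX.128, W0, second source in VEX.vvvv, with a 0x66 prefix for tmmultf32ps and no prefix for ttmmultf32ps). The following is only a minimal sketch, not code from binutils: the function names are hypothetical, only register operands tmm0-tmm7 are handled, and the "registers must be distinct" rule from the -inval tests is modelled as a plain error return.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a binutils function): emit a register-only,
   three-operand AMX insn encoded as VEX.128.<pp>.0F38.W0 <opcode> /r.
   pp is the VEX SIMD-prefix field: 0 = none, 1 = 0x66.
   reg goes to ModRM.reg, rm to ModRM.rm, vvvv to VEX.vvvv.
   Returns the number of bytes written (5) or -1 on invalid operands.  */
static int
encode_amx_vex_0f38 (uint8_t out[5], unsigned pp, uint8_t opcode,
                     unsigned reg, unsigned rm, unsigned vvvv)
{
  if (pp > 3 || reg > 7 || rm > 7 || vvvv > 7)
    return -1;
  /* Mirrors the "all tmm registers must be distinct" diagnostic.  */
  if (reg == rm || reg == vvvv || rm == vvvv)
    return -1;

  out[0] = 0xc4;                 /* three-byte VEX escape */
  out[1] = 0xe0 | 0x02;          /* inverted R/X/B all set, map = 0F38 */
  out[2] = (uint8_t) (((~vvvv & 0xf) << 3) | pp);  /* W=0, ~vvvv, L=0, pp */
  out[3] = opcode;
  out[4] = (uint8_t) (0xc0 | (reg << 3) | rm);     /* mod=3 (register form) */
  return 5;
}

/* Operands are given in Intel order: dst, src1, src2.  */
static int
encode_tmmultf32ps (uint8_t out[5], unsigned dst, unsigned src1, unsigned src2)
{
  return encode_amx_vex_0f38 (out, 1, 0x48, dst, src1, src2);
}

static int
encode_ttmmultf32ps (uint8_t out[5], unsigned dst, unsigned src1, unsigned src2)
{
  return encode_amx_vex_0f38 (out, 0, 0x48, dst, src1, src2);
}
```

Calling encode_tmmultf32ps (buf, 6, 5, 4) corresponds to "tmmultf32ps tmm6, tmm5, tmm4" in Intel syntax (AT&T: %tmm4, %tmm5, %tmm6) and should yield the c4 e2 59 48 f5 sequence listed in the dump test.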
From patchwork Wed Nov 13 08:44:33 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Haochen Jiang
X-Patchwork-Id: 100952
From: Haochen Jiang
To: binutils@sourceware.org
Cc: jbeulich@suse.com, hjl.tools@intel.com, "Hu, Lin1"
Subject: [PATCH 4/6] Rename Opcode Space Name Evexmap5 to xVexmap5
Date: Wed, 13 Nov 2024 16:44:33 +0800
Message-Id: <20241113084435.1784546-5-haochen.jiang@intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20241113084435.1784546-1-haochen.jiang@intel.com>
References: <20241113084435.1784546-1-haochen.jiang@intel.com>

From: "Hu, Lin1"

Nit: Currently this patch is based on all AVX10.2 instructions. If the
AVX10.2 insts are checked in after this patch, I will remove those insts.

gas/ChangeLog:

	* config/tc-i386.c (pte): Change evexmap5 to xvexmap5.
	(i386_assemble): Extend the opcode space scope of the SSE check.

opcodes/ChangeLog:

	* i386-dis.c (vex_table): Add VEX_MAP5.
	(get_valid_dis386): Ditto.
	* i386-gen.c (process_i386_opcode_modifier): Change SPACE(EVEXMAP5)
	to SPACE(xVEXMAP5).
	* i386-opc.h (SPACE_EVEXMAP5): Change to SPACE_xVEXMAP5.
	* i386-opc.tbl: Change EVexMap5 to xVexMap5.
	* i386-tbl.h: Regenerated.
---
 gas/config/tc-i386.c |   5 +-
 opcodes/i386-dis.c   | 297 ++++++++++++++++++++++++++++++++++++-
 opcodes/i386-gen.c   |   2 +-
 opcodes/i386-opc.h   |   4 +-
 opcodes/i386-opc.tbl | 196 ++++++++++++------------
 opcodes/i386-tbl.h   | 344 +++++++++++++++++++++----------------------
 6 files changed, 572 insertions(+), 276 deletions(-)

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index 57c4285cc68..574d7946dad 100644 --- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -3650,7 +3650,7 @@ pte (insn_template *t) { static const unsigned char opc_pfx[] = { 0, 0x66, 0xf3, 0xf2 }; static const char *const opc_spc[] = { - NULL, "0f", "0f38", "0f3a", NULL, "evexmap5", "evexmap6", NULL, + NULL, "0f", "0f38", "0f3a", NULL, "xvexmap5", "evexmap6", NULL, "XOP08", "XOP09", "XOP0A", }; unsigned int j; @@ -4218,6 +4218,7 @@ build_vex_prefix (const insn_template *t) case SPACE_0F: case SPACE_0F38: case SPACE_0F3A: + case SPACE_xVEXMAP5: case SPACE_MAP7: i.vex.bytes[0] = 0xc4; break; @@ -7188,7 +7189,7 @@ i386_assemble (char *line) /* The opcode space check isn't strictly needed; it's there only to bypass the logic below when easily possible.
*/ && t->opcode_space >= SPACE_0F - && t->opcode_space <= SPACE_0F3A + && t->opcode_space <= SPACE_xVEXMAP5 && !is_cpu (&i.tm, CpuSSE4a) && !is_any_vex_encoding (t)) { diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c index 57f8246bf76..815e02a5d59 100644 --- a/opcodes/i386-dis.c +++ b/opcodes/i386-dis.c @@ -1399,7 +1399,8 @@ enum VEX_0F = 0, VEX_0F38, VEX_0F3A, - VEX_MAP7, + VEX_MAP5, + VEX_MAP7 }; enum @@ -7050,6 +7051,297 @@ static const struct dis386 vex_table[][256] = { { Bad_Opcode }, { Bad_Opcode }, }, + /* VEX_MAP5 */ + { + /* 00 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 08 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 10 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 18 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 20 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 28 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 30 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 38 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 40 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 48 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, 
+ { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 50 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 58 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 60 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 68 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 70 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 78 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 80 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 88 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 90 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* 98 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* a0 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* a8 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* b0 */ + { Bad_Opcode }, + { Bad_Opcode }, + 
{ Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* b8 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* c0 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* c8 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* d0 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* d8 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* e0 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* e8 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* f0 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + /* f8 */ + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + { Bad_Opcode }, + }, }; #include "i386-dis-evex.h" @@ -9157,6 +9449,9 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins) case 0x3: vex_table_index = VEX_0F3A; break; + case 0x5: + vex_table_index = VEX_MAP5; + break; case 0x7: vex_table_index = VEX_MAP7; break; diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c index 90a6be46950..6d2cdac6e4e 100644 --- a/opcodes/i386-gen.c +++ b/opcodes/i386-gen.c @@ -1153,7 +1153,7 @@ process_i386_opcode_modifier (FILE *table, char *mod, unsigned int space, 
SPACE(0F38), SPACE(0F3A), SPACE(EVEXMAP4), - SPACE(EVEXMAP5), + SPACE(xVEXMAP5), SPACE(EVEXMAP6), SPACE(MAP7), SPACE(XOP08), diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h index 8d7879c8eb4..89a6396686a 100644 --- a/opcodes/i386-opc.h +++ b/opcodes/i386-opc.h @@ -1009,7 +1009,7 @@ typedef struct insn_template 2: 0F38 opcode prefix / space. 3: 0F3A opcode prefix / space. 4: EVEXMAP4 opcode prefix / space. - 5: EVEXMAP5 opcode prefix / space. + 5: xVEXMAP5 opcode prefix / space. 6: EVEXMAP6 opcode prefix / space. 7: MAP7 opcode prefix / space. 8: XOP 08 opcode space. @@ -1021,7 +1021,7 @@ typedef struct insn_template #define SPACE_0F38 2 #define SPACE_0F3A 3 #define SPACE_EVEXMAP4 4 -#define SPACE_EVEXMAP5 5 +#define SPACE_xVEXMAP5 5 #define SPACE_EVEXMAP6 6 #define SPACE_MAP7 7 #define SPACE_XOP08 8 diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl index 2a195c8bbb3..b24f255307e 100644 --- a/opcodes/i386-opc.tbl +++ b/opcodes/i386-opc.tbl @@ -118,7 +118,7 @@ #define SpaceXOP0A OpcodeSpace=SPACE_XOP0A #define EVexMap4 OpcodeSpace=SPACE_EVEXMAP4|EVex128 -#define EVexMap5 OpcodeSpace=SPACE_EVEXMAP5 +#define xVexMap5 OpcodeSpace=SPACE_xVEXMAP5 #define EVexMap6 OpcodeSpace=SPACE_EVEXMAP6 #define VexW0 VexW=VEXW0 @@ -1933,7 +1933,7 @@ vcvtps2ph, 0x661d, F16C, Modrm|Vex=2|Space0F3A|VexW=1|NoSuf, { Imm8, RegYMM, Uns + h:AVX512_FP16:AVX512_FP16:AVX512_FP16::f3::xVexMap5:EVexMap6:0::EVexLIG:VexW0:Word:Disp8MemShift=1> vp, 0x66 | 0x, , Modrm||Masking||Src1VVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } vs, 0x66 | 1 | 0x, , Modrm||Masking||Src1VVVV||Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM||Unspecified|BaseIndex, RegXMM, RegXMM } @@ -3309,85 +3309,85 @@ vcmpph, 0xc2, AVX512_FP16, Modrm|Masking|Space0F3A|Src1VVVV|VexW0|Broadcast|Disp vcmpsh, 0xf3c2/0x, AVX512_FP16, Modrm|EVexLIG|Masking|Space0F3A|Src1VVVV|VexW0|Disp8MemShift=1|NoSuf|ImmExt|SAE, 
{ RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegMask } vcmpsh, 0xf3c2, AVX512_FP16, Modrm|EVexLIG|Masking|Space0F3A|Src1VVVV|VexW0|Disp8MemShift=1|NoSuf|SAE, { Imm8, RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegMask } -vcvtdq2ph, 0x5b, AVX512_FP16&, Modrm||Masking|EVexMap5|VexW0|Broadcast|NoSuf|, { |Dword, } -vcvtudq2ph, 0xf27a, AVX512_FP16&, Modrm||Masking|EVexMap5|VexW0|Broadcast|NoSuf|, { |Dword, } +vcvtdq2ph, 0x5b, AVX512_FP16&, Modrm||Masking|xVexMap5|VexW0|Broadcast|NoSuf|, { |Dword, } +vcvtudq2ph, 0xf27a, AVX512_FP16&, Modrm||Masking|xVexMap5|VexW0|Broadcast|NoSuf|, { |Dword, } -vcvtqq2ph, 0x5b, AVX512_FP16&, Modrm||Masking|EVexMap5|VexW1|Broadcast|NoSuf||, { |Qword, RegXMM } -vcvtuqq2ph, 0xf27a, AVX512_FP16&, Modrm||Masking|EVexMap5|VexW1|Broadcast|NoSuf||, { |Qword, RegXMM } +vcvtqq2ph, 0x5b, AVX512_FP16&, Modrm||Masking|xVexMap5|VexW1|Broadcast|NoSuf||, { |Qword, RegXMM } +vcvtuqq2ph, 0xf27a, AVX512_FP16&, Modrm||Masking|xVexMap5|VexW1|Broadcast|NoSuf||, { |Qword, RegXMM } -vcvtpd2ph, 0x665a, AVX512_FP16&, Modrm||Masking|EVexMap5|VexW1|Broadcast|NoSuf||, { |Qword, RegXMM } +vcvtpd2ph, 0x665a, AVX512_FP16&, Modrm||Masking|xVexMap5|VexW1|Broadcast|NoSuf||, { |Qword, RegXMM } -vcvtps2phx, 0x661d, AVX512_FP16&, Modrm||Masking|EVexMap5|VexW0|Broadcast|NoSuf|, { |Dword, } +vcvtps2phx, 0x661d, AVX512_FP16&, Modrm||Masking|xVexMap5|VexW0|Broadcast|NoSuf|, { |Dword, } -vcvtw2ph, 0xf37d, AVX512_FP16, Modrm|Masking|EVexMap5|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvtuw2ph, 0xf27d, AVX512_FP16, Modrm|Masking|EVexMap5|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvtw2ph, 0xf37d, AVX512_FP16, Modrm|Masking|xVexMap5|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } 
+vcvtuw2ph, 0xf27d, AVX512_FP16, Modrm|Masking|xVexMap5|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvtph2dq, 0x665b, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegXMM } -vcvtph2dq, 0x665b, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|StaticRounding|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegYMM } -vcvtph2dq, 0x665b, AVX512_FP16, Modrm|EVex512|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=5|NoSuf|StaticRounding|SAE, { RegYMM|Word|Unspecified|BaseIndex, RegZMM } +vcvtph2dq, 0x665b, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegXMM } +vcvtph2dq, 0x665b, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|StaticRounding|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegYMM } +vcvtph2dq, 0x665b, AVX512_FP16, Modrm|EVex512|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=5|NoSuf|StaticRounding|SAE, { RegYMM|Word|Unspecified|BaseIndex, RegZMM } -vcvtph2udq, 0x79, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegXMM } -vcvtph2udq, 0x79, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|StaticRounding|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegYMM } -vcvtph2udq, 0x79, AVX512_FP16, Modrm|EVex512|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=5|NoSuf|StaticRounding|SAE, { RegYMM|Word|Unspecified|BaseIndex, RegZMM } +vcvtph2udq, 0x79, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegXMM } +vcvtph2udq, 0x79, AVX512_FP16&AVX512VL, 
Modrm|EVex256|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|StaticRounding|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegYMM } +vcvtph2udq, 0x79, AVX512_FP16, Modrm|EVex512|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=5|NoSuf|StaticRounding|SAE, { RegYMM|Word|Unspecified|BaseIndex, RegZMM } -vcvtph2qq, 0x667b, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=2|NoSuf, { RegXMM|Word|Dword|Unspecified|BaseIndex, RegXMM } -vcvtph2qq, 0x667b, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf|StaticRounding|SAE, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegYMM } -vcvtph2qq, 0x667b, AVX512_FP16, Modrm|EVex512|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|StaticRounding|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegZMM } +vcvtph2qq, 0x667b, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=2|NoSuf, { RegXMM|Word|Dword|Unspecified|BaseIndex, RegXMM } +vcvtph2qq, 0x667b, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf|StaticRounding|SAE, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegYMM } +vcvtph2qq, 0x667b, AVX512_FP16, Modrm|EVex512|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|StaticRounding|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegZMM } -vcvtph2uqq, 0x6679, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=2|NoSuf, { RegXMM|Word|Dword|Unspecified|BaseIndex, RegXMM } -vcvtph2uqq, 0x6679, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf|StaticRounding|SAE, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegYMM } -vcvtph2uqq, 0x6679, AVX512_FP16, Modrm|EVex512|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|StaticRounding|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegZMM } +vcvtph2uqq, 0x6679, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=2|NoSuf, { 
RegXMM|Word|Dword|Unspecified|BaseIndex, RegXMM } +vcvtph2uqq, 0x6679, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf|StaticRounding|SAE, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegYMM } +vcvtph2uqq, 0x6679, AVX512_FP16, Modrm|EVex512|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|StaticRounding|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegZMM } -vcvtph2pd, 0x5a, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=2|NoSuf, { RegXMM|Word|Dword|Unspecified|BaseIndex, RegXMM } -vcvtph2pd, 0x5a, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf|SAE, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegYMM } -vcvtph2pd, 0x5a, AVX512_FP16, Modrm|EVex512|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegZMM } +vcvtph2pd, 0x5a, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=2|NoSuf, { RegXMM|Word|Dword|Unspecified|BaseIndex, RegXMM } +vcvtph2pd, 0x5a, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf|SAE, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegYMM } +vcvtph2pd, 0x5a, AVX512_FP16, Modrm|EVex512|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegZMM } -vcvtph2w, 0x667d, AVX512_FP16, Modrm|Masking|EVexMap5|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvtph2uw, 0x7d, AVX512_FP16, Modrm|Masking|EVexMap5|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvtph2w, 0x667d, AVX512_FP16, Modrm|Masking|xVexMap5|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvtph2uw, 0x7d, AVX512_FP16, 
Modrm|Masking|xVexMap5|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvtsd2sh, 0xf25a, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap5|Src1VVVV|VexW1|Disp8MemShift=3|NoSuf|StaticRounding|SAE, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM, RegXMM } -vcvtss2sh, 0x1d, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap5|Src1VVVV|VexW0|Disp8MemShift=2|NoSuf|StaticRounding|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM, RegXMM } +vcvtsd2sh, 0xf25a, AVX512_FP16, Modrm|EVexLIG|Masking|xVexMap5|Src1VVVV|VexW1|Disp8MemShift=3|NoSuf|StaticRounding|SAE, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM, RegXMM } +vcvtss2sh, 0x1d, AVX512_FP16, Modrm|EVexLIG|Masking|xVexMap5|Src1VVVV|VexW0|Disp8MemShift=2|NoSuf|StaticRounding|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM, RegXMM } -vcvtsi2sh, 0xf32a, AVX512_FP16, Modrm|EVexLIG|EVexMap5|Src1VVVV|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|ATTSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM } -vcvtsi2sh, 0xf32a, AVX512_FP16, Modrm|EVexLIG|EVexMap5|Src1VVVV|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|IntelSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM } +vcvtsi2sh, 0xf32a, AVX512_FP16, Modrm|EVexLIG|xVexMap5|Src1VVVV|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|ATTSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM } +vcvtsi2sh, 0xf32a, AVX512_FP16, Modrm|EVexLIG|xVexMap5|Src1VVVV|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|IntelSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM } -vcvtusi2sh, 0xf37b, AVX512_FP16, Modrm|EVexLIG|EVexMap5|Src1VVVV|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|ATTSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM } -vcvtusi2sh, 0xf37b, AVX512_FP16, Modrm|EVexLIG|EVexMap5|Src1VVVV|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|IntelSyntax, { 
Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM } +vcvtusi2sh, 0xf37b, AVX512_FP16, Modrm|EVexLIG|xVexMap5|Src1VVVV|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|ATTSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM } +vcvtusi2sh, 0xf37b, AVX512_FP16, Modrm|EVexLIG|xVexMap5|Src1VVVV|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|IntelSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM } -vcvtsh2sd, 0xf35a, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap5|Src1VVVV|VexW0|Disp8MemShift=1|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM } +vcvtsh2sd, 0xf35a, AVX512_FP16, Modrm|EVexLIG|Masking|xVexMap5|Src1VVVV|VexW0|Disp8MemShift=1|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM } vcvtsh2ss, 0x13, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap6|Src1VVVV|VexW0|Disp8MemShift=1|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM } -vcvtsh2si, 0xf32d, AVX512_FP16, Modrm|EVexLIG|EVexMap5|Disp8MemShift=1|NoSuf|StaticRounding|SAE, { RegXMM|Word|Unspecified|BaseIndex, Reg32|Reg64 } +vcvtsh2si, 0xf32d, AVX512_FP16, Modrm|EVexLIG|xVexMap5|Disp8MemShift=1|NoSuf|StaticRounding|SAE, { RegXMM|Word|Unspecified|BaseIndex, Reg32|Reg64 } -vcvttph2dq, 0xf35b, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegXMM } -vcvttph2dq, 0xf35b, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegYMM } -vcvttph2dq, 0xf35b, AVX512_FP16, Modrm|EVex512|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=5|NoSuf|SAE, { RegYMM|Word|Unspecified|BaseIndex, RegZMM } +vcvttph2dq, 0xf35b, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegXMM } +vcvttph2dq, 0xf35b, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|SAE, { 
RegXMM|Word|Unspecified|BaseIndex, RegYMM } +vcvttph2dq, 0xf35b, AVX512_FP16, Modrm|EVex512|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=5|NoSuf|SAE, { RegYMM|Word|Unspecified|BaseIndex, RegZMM } -vcvttph2udq, 0x78, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegXMM } -vcvttph2udq, 0x78, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegYMM } -vcvttph2udq, 0x78, AVX512_FP16, Modrm|EVex512|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=5|NoSuf|SAE, { RegYMM|Word|Unspecified|BaseIndex, RegZMM } +vcvttph2udq, 0x78, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegXMM } +vcvttph2udq, 0x78, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegYMM } +vcvttph2udq, 0x78, AVX512_FP16, Modrm|EVex512|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=5|NoSuf|SAE, { RegYMM|Word|Unspecified|BaseIndex, RegZMM } -vcvttph2qq, 0x667a, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=2|NoSuf, { RegXMM|Word|Dword|Unspecified|BaseIndex, RegXMM } -vcvttph2qq, 0x667a, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf|SAE, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegYMM } -vcvttph2qq, 0x667a, AVX512_FP16, Modrm|EVex512|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegZMM } +vcvttph2qq, 0x667a, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=2|NoSuf, { RegXMM|Word|Dword|Unspecified|BaseIndex, RegXMM } +vcvttph2qq, 0x667a, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf|SAE, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegYMM } +vcvttph2qq, 0x667a, 
AVX512_FP16, Modrm|EVex512|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegZMM } -vcvttph2uqq, 0x6678, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=2|NoSuf, { RegXMM|Word|Dword|Unspecified|BaseIndex, RegXMM } -vcvttph2uqq, 0x6678, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf|SAE, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegYMM } -vcvttph2uqq, 0x6678, AVX512_FP16, Modrm|EVex512|Masking|EVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegZMM } +vcvttph2uqq, 0x6678, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=2|NoSuf, { RegXMM|Word|Dword|Unspecified|BaseIndex, RegXMM } +vcvttph2uqq, 0x6678, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=3|NoSuf|SAE, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegYMM } +vcvttph2uqq, 0x6678, AVX512_FP16, Modrm|EVex512|Masking|xVexMap5|VexW0|Broadcast|Disp8MemShift=4|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegZMM } vcvtph2psx, 0x6613, AVX512_FP16&AVX512VL, Modrm|EVex128|Masking|EVexMap6|VexW0|Broadcast|Disp8MemShift=3|NoSuf, { RegXMM|Word|Qword|Unspecified|BaseIndex, RegXMM } vcvtph2psx, 0x6613, AVX512_FP16&AVX512VL, Modrm|EVex256|Masking|EVexMap6|VexW0|Broadcast|Disp8MemShift=4|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegYMM } vcvtph2psx, 0x6613, AVX512_FP16, Modrm|EVex512|Masking|EVexMap6|VexW0|Broadcast|Disp8MemShift=5|NoSuf|SAE, { RegYMM|Word|Unspecified|BaseIndex, RegZMM } -vcvttph2w, 0x667c, AVX512_FP16, Modrm|Masking|EVexMap5|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvttph2uw, 0x7c, AVX512_FP16, Modrm|Masking|EVexMap5|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvttph2w, 0x667c, AVX512_FP16, 
Modrm|Masking|xVexMap5|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvttph2uw, 0x7c, AVX512_FP16, Modrm|Masking|xVexMap5|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvttsh2si, 0xf32c, AVX512_FP16, Modrm|EVexLIG|EVexMap5|Disp8MemShift=1|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, Reg32|Reg64 } +vcvttsh2si, 0xf32c, AVX512_FP16, Modrm|EVexLIG|xVexMap5|Disp8MemShift=1|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, Reg32|Reg64 } vfpclassph, 0x66, AVX512_FP16&, Modrm||Masking|Space0F3A|VexW0|Broadcast|NoSuf|, { Imm8|Imm8S, |Word, RegMask } -vmovw, 0x666e, AVX512_FP16, D|Modrm|EVex128|VexWIG|EVexMap5|Disp8MemShift=1|NoSuf, { Word|Unspecified|BaseIndex, RegXMM } -vmovw, 0x667e, AVX512_FP16, D|RegMem|EVex128|VexWIG|EVexMap5|NoSuf, { RegXMM, Reg32 } +vmovw, 0x666e, AVX512_FP16, D|Modrm|EVex128|VexWIG|xVexMap5|Disp8MemShift=1|NoSuf, { Word|Unspecified|BaseIndex, RegXMM } +vmovw, 0x667e, AVX512_FP16, D|RegMem|EVex128|VexWIG|xVexMap5|NoSuf, { RegXMM, Reg32 } vrcpph, 0x664c, AVX512_FP16, Modrm|Masking|EVexMap6|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } @@ -3484,9 +3484,9 @@ vcvt2ps2phx, 0x6667, AVX10_2, Modrm|Space0F38|Src1VVVV|VexW0|Masking|Broadcast|D + bf8s:74:xVexMap5, + + hf8:18:xVexMap5, + + hf8s:1b:xVexMap5> vcvtbiasph2, 0x, AVX10_2, Modrm||Src1VVVV|VexW0|Masking|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM } vcvtbiasph2, 0x, AVX10_2, Modrm||EVex512|Src1VVVV|VexW0|Masking|Broadcast|Disp8MemShift=6|NoSuf, { RegZMM|Word|Unspecified|BaseIndex, RegZMM, RegYMM } @@ -3495,11 +3495,11 @@ vcvtneph2, 0xf3, AVX10_2, Modrm|||VexW0 -vcvthf82ph, 0xf21e, AVX10_2, Modrm|EVexMap5|EVex128|VexW0|Masking|Disp8MemShift=3|NoSuf, { 
RegXMM|Qword|Unspecified|BaseIndex, RegXMM } -vcvthf82ph, 0xf21e, AVX10_2, Modrm|EVexMap5|EVex256|VexW0|Masking|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM } -vcvthf82ph, 0xf21e, AVX10_2, Modrm|EVexMap5|EVex512|VexW0|Masking|Disp8MemShift=5|NoSuf, { RegYMM|Unspecified|BaseIndex, RegZMM } +vcvthf82ph, 0xf21e, AVX10_2, Modrm|xVexMap5|EVex128|VexW0|Masking|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM } +vcvthf82ph, 0xf21e, AVX10_2, Modrm|xVexMap5|EVex256|VexW0|Masking|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM } +vcvthf82ph, 0xf21e, AVX10_2, Modrm|xVexMap5|EVex512|VexW0|Masking|Disp8MemShift=5|NoSuf, { RegYMM|Unspecified|BaseIndex, RegZMM } -vpbf16, 0x66, AVX10_2, Modrm|EVexMap5|Src1VVVV|VexW0|Masking|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } +vpbf16, 0x66, AVX10_2, Modrm|xVexMap5|Src1VVVV|VexW0|Masking|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } vnepbf16, 0x | 0x, AVX10_2, Modrm|EVexMap6|Src1VVVV|VexW0|Masking|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } @@ -3515,44 +3515,44 @@ vreducenepbf16, 0xf256, AVX10_2, Modrm|Space0F3A|VexW0|Masking|Broadcast|Disp8Sh vrndscalenepbf16, 0xf208, AVX10_2, Modrm|Space0F3A|VexW0|Masking|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } vrsqrtpbf16, 0x4e, AVX10_2, Modrm|EVexMap6|VexW0|Masking|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } vscalefnepbf16, 0x2c, AVX10_2, Modrm|EVexMap6|Src1VVVV|VexW0|Masking|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, 
RegXMM|RegYMM|RegZMM } -vsqrtnepbf16, 0x6651, AVX10_2, Modrm|EVexMap5|VexW0|Masking|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } - -vcomsbf16, 0x662f, AVX10_2, Modrm|EVexMap5|EVexLIG|VexW0|Disp8MemShift=1|NoSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM } - -vcvtnebf162ibs, 0xf269, AVX10_2, Modrm|EVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvtnebf162iubs, 0xf26b, AVX10_2, Modrm|EVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvttnebf162ibs, 0xf268, AVX10_2, Modrm|EVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvttnebf162iubs, 0xf26a, AVX10_2, Modrm|EVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvtph2ibs, 0x69, AVX10_2, Modrm|EVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvtph2iubs, 0x6b, AVX10_2, Modrm|EVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvttph2ibs, 0x68, AVX10_2, Modrm|EVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvttph2iubs, 0x6a, AVX10_2, Modrm|EVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvtps2ibs, 0x6669, AVX10_2, Modrm|EVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|StaticRounding|SAE, { 
RegXMM|RegYMM|RegZMM|DWord|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvtps2iubs, 0x666b, AVX10_2, Modrm|EVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|DWord|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvttps2ibs, 0x6668, AVX10_2, Modrm|EVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|DWord|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvttps2iubs, 0x666a, AVX10_2, Modrm|EVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|DWord|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } - -vcvttpd2dqs, 0x6d, AVX10_2, Modrm||EVexMap5|Masking|VexW1|Broadcast|CheckOperandSize|NoSuf|, { |Qword, } -vcvttpd2qqs, 0x666d, AVX10_2, Modrm|EVexMap5|Masking|VexW1|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvttpd2udqs, 0x6c, AVX10_2, Modrm||EVexMap5|Masking|VexW1|Broadcast|CheckOperandSize|NoSuf|, { |Qword, } -vcvttpd2uqqs, 0x666c, AVX10_2, Modrm|EVexMap5|Masking|VexW1|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvttps2dqs, 0x6d, AVX10_2, Modrm|EVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvttps2qqs, 0x666d, AVX10_2, Modrm|EVex128|EVexMap5|Masking|VexW0|Disp8MemShift=3|Broadcast|NoSuf, { RegXMM|Dword|Qword|Unspecified|BaseIndex, RegXMM } -vcvttps2qqs, 0x666d, AVX10_2, Modrm|EVex256|EVexMap5|Masking|VexW0|Disp8MemShift=4|Broadcast|NoSuf|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM } -vcvttps2qqs, 0x666d, AVX10_2, Modrm|EVex512|EVexMap5|Masking|VexW0|Disp8MemShift=5|Broadcast|NoSuf|SAE, { RegYMM|Dword|Unspecified|BaseIndex, RegZMM } -vcvttps2udqs, 0x6c, AVX10_2, Modrm|EVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|SAE, { 
RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vcvttps2uqqs, 0x666c, AVX10_2, Modrm|EVex128|EVexMap5|Masking|VexW0|Disp8MemShift=3|Broadcast|NoSuf, { RegXMM|Dword|Qword|Unspecified|BaseIndex, RegXMM } -vcvttps2uqqs, 0x666c, AVX10_2, Modrm|EVex256|EVexMap5|Masking|VexW0|Disp8MemShift=4|Broadcast|NoSuf|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM } -vcvttps2uqqs, 0x666c, AVX10_2, Modrm|EVex512|EVexMap5|Masking|VexW0|Disp8MemShift=5|Broadcast|NoSuf|SAE, { RegYMM|Dword|Unspecified|BaseIndex, RegZMM } - -vcvttsd2sis, 0xf26d, AVX10_2, Modrm|EVexLIG|EVexMap5|VexW0|Disp8MemShift|NoSuf|SAE, { RegXMM|Qword|Unspecified|BaseIndex, Reg32 } -vcvttsd2sis, 0xf26d, AVX10_2&x64, Modrm|EVexLIG|EVexMap5|VexW1|Disp8MemShift|NoSuf|SAE, { RegXMM|Qword|Unspecified|BaseIndex, Reg64 } -vcvttsd2usis, 0xf26c, AVX10_2, Modrm|EVexLIG|EVexMap5|VexW0|Disp8MemShift|NoSuf|SAE, { RegXMM|Qword|Unspecified|BaseIndex, Reg32 } -vcvttsd2usis, 0xf26c, AVX10_2&x64, Modrm|EVexLIG|EVexMap5|VexW1|Disp8MemShift|NoSuf|SAE, { RegXMM|Qword|Unspecified|BaseIndex, Reg64 } -vcvttss2sis, 0xf36d, AVX10_2, Modrm|EVexLIG|EVexMap5|VexW0|Disp8MemShift|NoSuf|SAE, { RegXMM|Dword|Unspecified|BaseIndex, Reg32 } -vcvttss2sis, 0xf36d, AVX10_2&x64, Modrm|EVexLIG|EVexMap5|VexW1|Disp8MemShift|NoSuf|SAE, { RegXMM|Dword|Unspecified|BaseIndex, Reg64 } -vcvttss2usis, 0xf36c, AVX10_2, Modrm|EVexLIG|EVexMap5|VexW0|Disp8MemShift|NoSuf|SAE, { RegXMM|Dword|Unspecified|BaseIndex, Reg32 } -vcvttss2usis, 0xf36c, AVX10_2&x64, Modrm|EVexLIG|EVexMap5|VexW1|Disp8MemShift|NoSuf|SAE, { RegXMM|Dword|Unspecified|BaseIndex, Reg64 } +vsqrtnepbf16, 0x6651, AVX10_2, Modrm|xVexMap5|VexW0|Masking|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } + +vcomsbf16, 0x662f, AVX10_2, Modrm|xVexMap5|EVexLIG|VexW0|Disp8MemShift=1|NoSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM } + +vcvtnebf162ibs, 0xf269, AVX10_2, 
Modrm|xVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvtnebf162iubs, 0xf26b, AVX10_2, Modrm|xVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvttnebf162ibs, 0xf268, AVX10_2, Modrm|xVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvttnebf162iubs, 0xf26a, AVX10_2, Modrm|xVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvtph2ibs, 0x69, AVX10_2, Modrm|xVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvtph2iubs, 0x6b, AVX10_2, Modrm|xVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvttph2ibs, 0x68, AVX10_2, Modrm|xVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvttph2iubs, 0x6a, AVX10_2, Modrm|xVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvtps2ibs, 0x6669, AVX10_2, Modrm|xVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|DWord|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvtps2iubs, 0x666b, AVX10_2, Modrm|xVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|DWord|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvttps2ibs, 0x6668, AVX10_2, Modrm|xVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|SAE, { 
RegXMM|RegYMM|RegZMM|DWord|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvttps2iubs, 0x666a, AVX10_2, Modrm|xVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|DWord|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } + +vcvttpd2dqs, 0x6d, AVX10_2, Modrm||xVexMap5|Masking|VexW1|Broadcast|CheckOperandSize|NoSuf|, { |Qword, } +vcvttpd2qqs, 0x666d, AVX10_2, Modrm|xVexMap5|Masking|VexW1|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvttpd2udqs, 0x6c, AVX10_2, Modrm||xVexMap5|Masking|VexW1|Broadcast|CheckOperandSize|NoSuf|, { |Qword, } +vcvttpd2uqqs, 0x666c, AVX10_2, Modrm|xVexMap5|Masking|VexW1|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvttps2dqs, 0x6d, AVX10_2, Modrm|xVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvttps2qqs, 0x666d, AVX10_2, Modrm|EVex128|xVexMap5|Masking|VexW0|Disp8MemShift=3|Broadcast|NoSuf, { RegXMM|Dword|Qword|Unspecified|BaseIndex, RegXMM } +vcvttps2qqs, 0x666d, AVX10_2, Modrm|EVex256|xVexMap5|Masking|VexW0|Disp8MemShift=4|Broadcast|NoSuf|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM } +vcvttps2qqs, 0x666d, AVX10_2, Modrm|EVex512|xVexMap5|Masking|VexW0|Disp8MemShift=5|Broadcast|NoSuf|SAE, { RegYMM|Dword|Unspecified|BaseIndex, RegZMM } +vcvttps2udqs, 0x6c, AVX10_2, Modrm|xVexMap5|Masking|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vcvttps2uqqs, 0x666c, AVX10_2, Modrm|EVex128|xVexMap5|Masking|VexW0|Disp8MemShift=3|Broadcast|NoSuf, { RegXMM|Dword|Qword|Unspecified|BaseIndex, RegXMM } +vcvttps2uqqs, 0x666c, AVX10_2, Modrm|EVex256|xVexMap5|Masking|VexW0|Disp8MemShift=4|Broadcast|NoSuf|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM } +vcvttps2uqqs, 
0x666c, AVX10_2, Modrm|EVex512|xVexMap5|Masking|VexW0|Disp8MemShift=5|Broadcast|NoSuf|SAE, { RegYMM|Dword|Unspecified|BaseIndex, RegZMM } + +vcvttsd2sis, 0xf26d, AVX10_2, Modrm|EVexLIG|xVexMap5|VexW0|Disp8MemShift|NoSuf|SAE, { RegXMM|Qword|Unspecified|BaseIndex, Reg32 } +vcvttsd2sis, 0xf26d, AVX10_2&x64, Modrm|EVexLIG|xVexMap5|VexW1|Disp8MemShift|NoSuf|SAE, { RegXMM|Qword|Unspecified|BaseIndex, Reg64 } +vcvttsd2usis, 0xf26c, AVX10_2, Modrm|EVexLIG|xVexMap5|VexW0|Disp8MemShift|NoSuf|SAE, { RegXMM|Qword|Unspecified|BaseIndex, Reg32 } +vcvttsd2usis, 0xf26c, AVX10_2&x64, Modrm|EVexLIG|xVexMap5|VexW1|Disp8MemShift|NoSuf|SAE, { RegXMM|Qword|Unspecified|BaseIndex, Reg64 } +vcvttss2sis, 0xf36d, AVX10_2, Modrm|EVexLIG|xVexMap5|VexW0|Disp8MemShift|NoSuf|SAE, { RegXMM|Dword|Unspecified|BaseIndex, Reg32 } +vcvttss2sis, 0xf36d, AVX10_2&x64, Modrm|EVexLIG|xVexMap5|VexW1|Disp8MemShift|NoSuf|SAE, { RegXMM|Dword|Unspecified|BaseIndex, Reg64 } +vcvttss2usis, 0xf36c, AVX10_2, Modrm|EVexLIG|xVexMap5|VexW0|Disp8MemShift|NoSuf|SAE, { RegXMM|Dword|Unspecified|BaseIndex, Reg32 } +vcvttss2usis, 0xf36c, AVX10_2&x64, Modrm|EVexLIG|xVexMap5|VexW1|Disp8MemShift|NoSuf|SAE, { RegXMM|Dword|Unspecified|BaseIndex, Reg64 } vminmaxnepbf16, 0xf252, AVX10_2, Modrm|Masking|Space0F3A|Src1VVVV|VexW0|Disp8ShiftVL|Broadcast|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } vminmaxp, 0x52, AVX10_2, Modrm|Masking|Space0F3A||Broadcast|Src1VVVV|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { Imm8, RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } @@ -3560,7 +3560,7 @@ vminmaxs, 0x53, AVX10_2, Modrm|EVexLIG|Masking|Space0F3A|Src1VVVV| vmovd, 0xf37e, AVX10_2, Load|Modrm|EVex128|VexW0|Space0F|Disp8MemShift=2|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM } vmovd, 0x66d6, AVX10_2, Modrm|EVex128|VexW0|Space0F|Disp8MemShift=2|NoSuf, { RegXMM, Dword|Unspecified|BaseIndex|RegXMM } -vmovw, 0xf36e, AVX10_2, 
D|Modrm|EVex128|VexW0|EVexMap5|Disp8MemShift=1|NoSuf, { Word|Unspecified|BaseIndex|RegXMM, RegXMM } +vmovw, 0xf36e, AVX10_2, D|Modrm|EVex128|VexW0|xVexMap5|Disp8MemShift=1|NoSuf, { Word|Unspecified|BaseIndex|RegXMM, RegXMM } vcomxs, 0x2f, AVX10_2, Modrm|EVexLIG||||NoSuf|SAE, { RegXMM||Unspecified|BaseIndex, RegXMM } vucomxs, 0x2e, AVX10_2, Modrm|EVexLIG||||NoSuf|SAE, { RegXMM||Unspecified|BaseIndex, RegXMM } From patchwork Wed Nov 13 08:44:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haochen Jiang X-Patchwork-Id: 100954 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1C20F3858D38 for ; Wed, 13 Nov 2024 08:48:13 +0000 (GMT) X-Original-To: binutils@sourceware.org Delivered-To: binutils@sourceware.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.17]) by sourceware.org (Postfix) with ESMTPS id F3F483858408 for ; Wed, 13 Nov 2024 08:44:51 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org F3F483858408 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org F3F483858408 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.17 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1731487503; cv=none; b=Kj9IT6RS1pa6Pnbh5qGTCc1k7kb1pk5uU2sh7pOTtr/1yvaeF8PlKpEvtEZwh8w3sS5ve5Yo14+x2BDSAIPpR2pN/jFCxU6WfPcmBlAYlWWuD31wRGxYC3sUsQ4RP7uGzQVxu8pf5yxOS7ccM0cp92toMdg/Xm1wEQ7K238X3xo= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1731487503; c=relaxed/simple; bh=SRauyrA8k/Xg7y0dP51sMGRgmXXiHUE/2UEie9f4kDU=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; 
b=rNfxYjFuUsdQm9EfCRa86qijHrodHEi7FZA/o2jfdqmCSO4aGuX9+YmVrHemkfqf/3ocUYRa3QWWyYlqxoOoSMbx5nLhoo8PxOX7Wc4wYsrbA7g41ACtD5RJLMF8b2BXln0gf2x0Mcst2c3oGZa4tUaOayY9nb6mF32yIRTA2Oo= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1731487492; x=1763023492; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SRauyrA8k/Xg7y0dP51sMGRgmXXiHUE/2UEie9f4kDU=; b=O+R6PEFKq0fi5cMtZ5hzQMdo4oP7sPjhyEvv41nMIrLZjRUxcbhoxB07 BkACho/RvM2AL334lk9ahAycHYIql7aq2vVYAd7OmnzSli0f/wzmyh3yp P33znRSLRIBqcpkmwu0Sl9Btkx5Qf9okjZFSZJ70FgFHoLCTrdaevVtqE 6NA5Ss/ZTUA1VKdcFKLB99hWmjWGLxy/05Wkl0XIIHLVJNomfEyFuNqpY uUEyehMyD0b9SRpkM+ZFZDNwNSimzuUjZeMduyADCtU04ebIp6dRZFl2E LYI+6eej0frzSzjZpfwiJn6aTcpyT8NAn+QEwhC8O17Bl/52WeRBvJgi1 Q==; X-CSE-ConnectionGUID: cole0/VLR2qaR/BBu+Xsgg== X-CSE-MsgGUID: FwozgViqTUuWUiOt6DCSpQ== X-IronPort-AV: E=McAfee;i="6700,10204,11254"; a="31458922" X-IronPort-AV: E=Sophos;i="6.12,150,1728975600"; d="scan'208";a="31458922" Received: from fmviesa007.fm.intel.com ([10.60.135.147]) by orvoesa109.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Nov 2024 00:44:48 -0800 X-CSE-ConnectionGUID: MlC6l53ZRvy9d7bwCoKOqA== X-CSE-MsgGUID: KsDBfgbaRCerCOxoycaoIg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,150,1728975600"; d="scan'208";a="87545471" Received: from shliclel4217.sh.intel.com ([10.239.240.127]) by fmviesa007.fm.intel.com with ESMTP; 13 Nov 2024 00:44:46 -0800 From: Haochen Jiang To: binutils@sourceware.org Cc: jbeulich@suse.com, hjl.tools@intel.com, Liwei Xu Subject: [PATCH 5/6] Support Intel AMX-FP8 Date: Wed, 13 Nov 2024 16:44:34 +0800 Message-Id: <20241113084435.1784546-6-haochen.jiang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20241113084435.1784546-1-haochen.jiang@intel.com> References: <20241113084435.1784546-1-haochen.jiang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.6 
required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_NUMSUBJECT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: binutils@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Binutils mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: binutils-bounces~patchwork=sourceware.org@sourceware.org

From: Liwei Xu

This patch adds support for the Intel AMX-FP8 feature.  No special
handling is needed.

gas/ChangeLog:

	* NEWS: Support Intel AMX-FP8.
	* config/tc-i386.c: Add amx_fp8.
	* doc/c-i386.texi: Document .amx_fp8.
	* testsuite/gas/i386/i386.exp: Run AMX-FP8 tests.
	* testsuite/gas/i386/x86-64.exp: Run AMX-FP8 tests.
	* testsuite/gas/i386/amx-fp8-inval.l: New test.
	* testsuite/gas/i386/amx-fp8-inval.s: Ditto.
	* testsuite/gas/i386/x86-64-amx-fp8-intel.d: Ditto.
	* testsuite/gas/i386/x86-64-amx-fp8-inval.l: Ditto.
	* testsuite/gas/i386/x86-64-amx-fp8-inval.s: Ditto.
	* testsuite/gas/i386/x86-64-amx-fp8.d: Ditto.
	* testsuite/gas/i386/x86-64-amx-fp8.s: Ditto.

opcodes/ChangeLog:

	* i386-dis.c (PREFIX_VEX_MAP5_FD_X86_64_L_0_W_0): New.
	(X86_64_VEX_MAP5_FD): Ditto.
	(VEX_LEN_MAP5_FD_X86_64): Ditto.
	(VEX_W_MAP5_FD_X86_64_L_0): Ditto.
	(prefix_table): Add PREFIX_VEX_MAP5_FD_X86_64_L_0_W_0.
	(x86_64_table): Add X86_64_VEX_MAP5_FD.
	(vex_len_table): Add VEX_LEN_MAP5_FD_X86_64.
	(vex_w_table): Add VEX_W_MAP5_FD_X86_64_L_0.
	* i386-gen.c: Add CPU_AMX_FP8_FLAGS and CPU_ANY_AMX_FP8_FLAGS.
	* i386-init.h: Regenerated.
	* i386-mnem.h: Ditto.
	* i386-opc.h: Add cpuamx_fp8.
	* i386-opc.tbl: Add AMX_FP8 instructions.
	* i386-tbl.h: Regenerated.
---
 gas/NEWS                                      |    2 +
 gas/config/tc-i386.c                          |    1 +
 gas/doc/c-i386.texi                           |    3 +-
 gas/testsuite/gas/i386/amx-fp8-inval.l        |    9 +
 gas/testsuite/gas/i386/amx-fp8-inval.s        |   13 +
 gas/testsuite/gas/i386/i386.exp               |    1 +
 gas/testsuite/gas/i386/x86-64-amx-fp8-intel.d |   19 +
 gas/testsuite/gas/i386/x86-64-amx-fp8-inval.l |    9 +
 gas/testsuite/gas/i386/x86-64-amx-fp8-inval.s |   13 +
 gas/testsuite/gas/i386/x86-64-amx-fp8.d       |   17 +
 gas/testsuite/gas/i386/x86-64-amx-fp8.s       |   23 +
 gas/testsuite/gas/i386/x86-64.exp             |    3 +
 opcodes/i386-dis.c                            |   29 +-
 opcodes/i386-gen.c                            |    3 +
 opcodes/i386-init.h                           |  688 +++++-----
 opcodes/i386-mnem.h                           | 1184 +++++++++--------
 opcodes/i386-opc.h                            |    3 +
 opcodes/i386-opc.tbl                          |    4 +
 opcodes/i386-tbl.h                            |  272 ++--
 19 files changed, 1260 insertions(+), 1036 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/amx-fp8-inval.l
 create mode 100644 gas/testsuite/gas/i386/amx-fp8-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-fp8-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-fp8-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-fp8-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-fp8.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-amx-fp8.s

diff --git a/gas/NEWS b/gas/NEWS
index 56143b9b27e..ba63043002e 100644
--- a/gas/NEWS
+++ b/gas/NEWS
@@ -1,5 +1,7 @@
 -*- text -*-
 
+* Add support for Intel AMX-FP8 instructions.
+
 * Add support for Intel AMX-TF32 instructions.
 
 * Add support for Intel AMX-AVX512 instructions.
diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index 574d7946dad..504d1e099f4 100644 --- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -1185,6 +1185,7 @@ static const arch_entry cpu_arch[] = SUBARCH (amx_transpose, AMX_TRANSPOSE, ANY_AMX_TRANSPOSE, false), SUBARCH (amx_avx512, AMX_AVX512, ANY_AMX_AVX512, false), SUBARCH (amx_tf32, AMX_TF32, ANY_AMX_TF32, false), + SUBARCH (amx_fp8, AMX_FP8, ANY_AMX_FP8, false), SUBARCH (amx_tile, AMX_TILE, ANY_AMX_TILE, false), SUBARCH (movdiri, MOVDIRI, MOVDIRI, false), SUBARCH (movdir64b, MOVDIR64B, MOVDIR64B, false), diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi index bfadb9317e3..bd2f585e1e3 100644 --- a/gas/doc/c-i386.texi +++ b/gas/doc/c-i386.texi @@ -231,6 +231,7 @@ accept various extension mnemonics. For example, @code{amx_transpose}, @code{amx_avx512}, @code{amx_tf32}, +@code{amx_fp8} @code{amx_tile}, @code{vmx}, @code{vmfunc}, @@ -1704,7 +1705,7 @@ supported on the CPU specified. The choices for @var{cpu_type} are: @item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk} @item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16} @item @samp{.amx_complex} @tab @samp{.amx_transpose} @tab @samp{.amx_avx512} -@item @samp{.amx_tf32} @tab @samp{.amx_tile} +@item @samp{.amx_tf32} @tab @samp {.amx_fp8} @tab @samp{.amx_tile} @item @samp{.kl} @tab @samp{.widekl} @tab @samp{.uintr} @tab @samp{.hreset} @item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5} @item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme} diff --git a/gas/testsuite/gas/i386/amx-fp8-inval.l b/gas/testsuite/gas/i386/amx-fp8-inval.l new file mode 100644 index 00000000000..b74ff0c73e0 --- /dev/null +++ b/gas/testsuite/gas/i386/amx-fp8-inval.l @@ -0,0 +1,9 @@ +.* Assembler messages: +.*:6: Error: `tdpbf8ps' is only supported in 64-bit mode +.*:7: Error: `tdpbf8ps' is only supported in 64-bit mode +.*:8: Error: `tdpbhf8ps' is only supported in 64-bit mode +.*:9: Error: 
`tdpbhf8ps' is only supported in 64-bit mode +.*:10: Error: `tdphbf8ps' is only supported in 64-bit mode +.*:11: Error: `tdphbf8ps' is only supported in 64-bit mode +.*:12: Error: `tdphf8ps' is only supported in 64-bit mode +.*:13: Error: `tdphf8ps' is only supported in 64-bit mode diff --git a/gas/testsuite/gas/i386/amx-fp8-inval.s b/gas/testsuite/gas/i386/amx-fp8-inval.s new file mode 100644 index 00000000000..371c28db49d --- /dev/null +++ b/gas/testsuite/gas/i386/amx-fp8-inval.s @@ -0,0 +1,13 @@ +# Check Illegal AMX-FP8 instructions + + .allow_index_reg + .text +_start: + tdpbf8ps %tmm4, %tmm5, %tmm6 + tdpbf8ps %tmm1, %tmm2, %tmm3 + tdpbhf8ps %tmm4, %tmm5, %tmm6 + tdpbhf8ps %tmm1, %tmm2, %tmm3 + tdphbf8ps %tmm4, %tmm5, %tmm6 + tdphbf8ps %tmm1, %tmm2, %tmm3 + tdphf8ps %tmm4, %tmm5, %tmm6 + tdphf8ps %tmm1, %tmm2, %tmm3 diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp index 45e8adf7723..df0b7752ab6 100644 --- a/gas/testsuite/gas/i386/i386.exp +++ b/gas/testsuite/gas/i386/i386.exp @@ -549,6 +549,7 @@ if [gas_32_check] then { run_list_test "amx-transpose-inval" run_list_test "amx-avx512-inval" run_list_test "amx-tf32-inval" + run_list_test "amx-fp8-inval" run_list_test "sg" run_dump_test "clzero" run_dump_test "invlpgb" diff --git a/gas/testsuite/gas/i386/x86-64-amx-fp8-intel.d b/gas/testsuite/gas/i386/x86-64-amx-fp8-intel.d new file mode 100644 index 00000000000..8af297b1f92 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-fp8-intel.d @@ -0,0 +1,19 @@ +#objdump: -dw -Mintel +#name: x86_64 AMX-FP8 insns (Intel disassembly) +#source: x86-64-amx-fp8.s + +.*: +file format .* + +Disassembly of section \.text: + +#... 
+[a-f0-9]+ <_intel>: +\s*[a-f0-9]+:\s*c4 e5 58 fd f5\s+tdpbf8ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e5 70 fd da\s+tdpbf8ps tmm3,tmm2,tmm1 +\s*[a-f0-9]+:\s*c4 e5 5b fd f5\s+tdpbhf8ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e5 73 fd da\s+tdpbhf8ps tmm3,tmm2,tmm1 +\s*[a-f0-9]+:\s*c4 e5 5a fd f5\s+tdphbf8ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e5 72 fd da\s+tdphbf8ps tmm3,tmm2,tmm1 +\s*[a-f0-9]+:\s*c4 e5 59 fd f5\s+tdphf8ps tmm6,tmm5,tmm4 +\s*[a-f0-9]+:\s*c4 e5 71 fd da\s+tdphf8ps tmm3,tmm2,tmm1 +#pass diff --git a/gas/testsuite/gas/i386/x86-64-amx-fp8-inval.l b/gas/testsuite/gas/i386/x86-64-amx-fp8-inval.l new file mode 100644 index 00000000000..726452418a0 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-fp8-inval.l @@ -0,0 +1,9 @@ +.* Assembler messages: +.*:6: Error: all tmm registers must be distinct for `tdpbf8ps' +.*:7: Error: all tmm registers must be distinct for `tdpbf8ps' +.*:8: Error: all tmm registers must be distinct for `tdpbhf8ps' +.*:9: Error: all tmm registers must be distinct for `tdpbhf8ps' +.*:10: Error: all tmm registers must be distinct for `tdphbf8ps' +.*:11: Error: all tmm registers must be distinct for `tdphbf8ps' +.*:12: Error: all tmm registers must be distinct for `tdphf8ps' +.*:13: Error: all tmm registers must be distinct for `tdphf8ps' diff --git a/gas/testsuite/gas/i386/x86-64-amx-fp8-inval.s b/gas/testsuite/gas/i386/x86-64-amx-fp8-inval.s new file mode 100644 index 00000000000..f3d7cafe73c --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-fp8-inval.s @@ -0,0 +1,13 @@ +# Check Illegal AMX-FP8 instructions + + .allow_index_reg + .text +_start: + tdpbf8ps %tmm1, %tmm1, %tmm2 + tdpbf8ps %tmm1, %tmm2, %tmm2 + tdpbhf8ps %tmm1, %tmm1, %tmm2 + tdpbhf8ps %tmm1, %tmm2, %tmm2 + tdphbf8ps %tmm1, %tmm1, %tmm2 + tdphbf8ps %tmm1, %tmm2, %tmm2 + tdphf8ps %tmm1, %tmm1, %tmm2 + tdphf8ps %tmm1, %tmm2, %tmm2 diff --git a/gas/testsuite/gas/i386/x86-64-amx-fp8.d b/gas/testsuite/gas/i386/x86-64-amx-fp8.d new file mode 100644 index 00000000000..fd81d0c52ff 
--- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-fp8.d @@ -0,0 +1,17 @@ +#objdump: -dw +#name: x86_64 AMX-FP8 insns + +.*: +file format .* + +Disassembly of section \.text: + +0+ <_start>: +\s*[a-f0-9]+:\s*c4 e5 58 fd f5\s+tdpbf8ps %tmm4,%tmm5,%tmm6 +\s*[a-f0-9]+:\s*c4 e5 70 fd da\s+tdpbf8ps %tmm1,%tmm2,%tmm3 +\s*[a-f0-9]+:\s*c4 e5 5b fd f5\s+tdpbhf8ps %tmm4,%tmm5,%tmm6 +\s*[a-f0-9]+:\s*c4 e5 73 fd da\s+tdpbhf8ps %tmm1,%tmm2,%tmm3 +\s*[a-f0-9]+:\s*c4 e5 5a fd f5\s+tdphbf8ps %tmm4,%tmm5,%tmm6 +\s*[a-f0-9]+:\s*c4 e5 72 fd da\s+tdphbf8ps %tmm1,%tmm2,%tmm3 +\s*[a-f0-9]+:\s*c4 e5 59 fd f5\s+tdphf8ps %tmm4,%tmm5,%tmm6 +\s*[a-f0-9]+:\s*c4 e5 71 fd da\s+tdphf8ps %tmm1,%tmm2,%tmm3 +#pass diff --git a/gas/testsuite/gas/i386/x86-64-amx-fp8.s b/gas/testsuite/gas/i386/x86-64-amx-fp8.s new file mode 100644 index 00000000000..b8357b41ecb --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-fp8.s @@ -0,0 +1,23 @@ +# Check 64bit AMX-FP8 instructions + + .text +_start: + tdpbf8ps %tmm4, %tmm5, %tmm6 + tdpbf8ps %tmm1, %tmm2, %tmm3 + tdpbhf8ps %tmm4, %tmm5, %tmm6 + tdpbhf8ps %tmm1, %tmm2, %tmm3 + tdphbf8ps %tmm4, %tmm5, %tmm6 + tdphbf8ps %tmm1, %tmm2, %tmm3 + tdphf8ps %tmm4, %tmm5, %tmm6 + tdphf8ps %tmm1, %tmm2, %tmm3 + +_intel: + .intel_syntax noprefix + tdpbf8ps tmm6, tmm5, tmm4 + tdpbf8ps tmm3, tmm2, tmm1 + tdpbhf8ps tmm6, tmm5, tmm4 + tdpbhf8ps tmm3, tmm2, tmm1 + tdphbf8ps tmm6, tmm5, tmm4 + tdphbf8ps tmm3, tmm2, tmm1 + tdphf8ps tmm6, tmm5, tmm4 + tdphf8ps tmm3, tmm2, tmm1 diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp index 9cb79eb0a4c..7d3f7ebe2b1 100644 --- a/gas/testsuite/gas/i386/x86-64.exp +++ b/gas/testsuite/gas/i386/x86-64.exp @@ -532,6 +532,9 @@ run_dump_test "x86-64-amx-avx512-intel" run_dump_test "x86-64-amx-tf32" run_dump_test "x86-64-amx-tf32-intel" run_list_test "x86-64-amx-tf32-inval" +run_dump_test "x86-64-amx-fp8" +run_dump_test "x86-64-amx-fp8-intel" +run_list_test "x86-64-amx-fp8-inval" run_dump_test "x86-64-clzero" 
run_dump_test "x86-64-mwaitx-bdver4" run_list_test "x86-64-mwaitx-reg" diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c index 815e02a5d59..0fe2a5b48ec 100644 --- a/opcodes/i386-dis.c +++ b/opcodes/i386-dis.c @@ -1160,6 +1160,7 @@ enum PREFIX_VEX_0F38F6_L_0, PREFIX_VEX_0F38F7_L_0, PREFIX_VEX_0F3AF0_L_0, + PREFIX_VEX_MAP5_FD_X86_64_L_0_W_0, PREFIX_VEX_MAP7_F6_L_0_W_0_R_0_X86_64, PREFIX_VEX_MAP7_F8_L_0_W_0_R_0_X86_64, @@ -1367,6 +1368,7 @@ enum X86_64_VEX_0F386F, X86_64_VEX_0F38Ex, + X86_64_VEX_MAP5_FD, X86_64_VEX_MAP7_F6_L_0_W_0_R_0, X86_64_VEX_MAP7_F8_L_0_W_0_R_0, @@ -1498,6 +1500,7 @@ enum VEX_LEN_0F3ADE_W_0, VEX_LEN_0F3ADF, VEX_LEN_0F3AF0, + VEX_LEN_MAP5_FD_X86_64, VEX_LEN_MAP7_F6, VEX_LEN_MAP7_F8, VEX_LEN_XOP_08_85, @@ -1674,6 +1677,7 @@ enum VEX_W_0F3ACE, VEX_W_0F3ACF, VEX_W_0F3ADE, + VEX_W_MAP5_FD_X86_64_L_0, VEX_W_MAP7_F6_L_0, VEX_W_MAP7_F8_L_0, @@ -4298,6 +4302,14 @@ static const struct dis386 prefix_table[][4] = { { "%XErorxS", { Gdq, Edq, Ib }, 0 }, }, + /* PREFIX_VEX_MAP5_FD_X86_64_L_0_W_0 */ + { + { "tdpbf8ps", { TMM, Rtmm, VexTmm }, 0 }, + { "tdphbf8ps", { TMM, Rtmm, VexTmm }, 0 }, + { "tdphf8ps", { TMM, Rtmm, VexTmm }, 0 }, + { "tdpbhf8ps", { TMM, Rtmm, VexTmm }, 0 }, + }, + /* PREFIX_VEX_MAP7_F6_L_0_W_0_R_0_X86_64 */ { { Bad_Opcode }, @@ -4700,6 +4712,12 @@ static const struct dis386 x86_64_table[][2] = { { "%XEcmp%CCxadd", { Mdq, Gdq, VexGdq }, PREFIX_DATA }, }, + /* X86_64_VEX_MAP5_FD */ + { + { Bad_Opcode }, + { VEX_LEN_TABLE (VEX_LEN_MAP5_FD_X86_64) }, + }, + /* X86_64_VEX_MAP7_F6_L_0_W_0_R_0 */ { { Bad_Opcode }, @@ -7338,7 +7356,7 @@ static const struct dis386 vex_table[][256] = { { Bad_Opcode }, { Bad_Opcode }, { Bad_Opcode }, - { Bad_Opcode }, + { X86_64_TABLE (X86_64_VEX_MAP5_FD) }, { Bad_Opcode }, { Bad_Opcode }, }, @@ -7781,6 +7799,11 @@ static const struct dis386 vex_len_table[][2] = { { PREFIX_TABLE (PREFIX_VEX_0F3AF0_L_0) }, }, + /* VEX_LEN_MAP5_FD_X86_64 */ + { + { VEX_W_TABLE (VEX_W_MAP5_FD_X86_64_L_0) }, + }, + /* VEX_LEN_MAP7_F6 */ 
{ { VEX_W_TABLE (VEX_W_MAP7_F6_L_0) }, @@ -8417,6 +8440,10 @@ static const struct dis386 vex_w_table[][2] = { /* VEX_W_0F3ADE */ { VEX_LEN_TABLE (VEX_LEN_0F3ADE_W_0) }, }, + { + /* VEX_W_MAP5_FD_X86_64 */ + { PREFIX_TABLE (PREFIX_VEX_MAP5_FD_X86_64_L_0_W_0) }, + }, { /* VEX_W_MAP7_F6_L_0 */ { REG_TABLE (REG_VEX_MAP7_F6_L_0_W_0) }, diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c index 6d2cdac6e4e..965eca2e640 100644 --- a/opcodes/i386-gen.c +++ b/opcodes/i386-gen.c @@ -269,6 +269,8 @@ static const dependency isa_dependencies[] = "AMX_TILE|AVX10_2" }, { "AMX_TF32", "AMX_TILE" }, + { "AMX_FP8", + "AMX_TILE" }, { "KL", "SSE2" }, { "WIDEKL", @@ -438,6 +440,7 @@ static bitfield cpu_flags[] = BITFIELD (AMX_TRANSPOSE), BITFIELD (AMX_AVX512), BITFIELD (AMX_TF32), + BITFIELD (AMX_FP8), BITFIELD (AMX_TILE), BITFIELD (MOVDIRI), BITFIELD (MOVDIR64B), diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h index 89a6396686a..a16ec56d355 100644 --- a/opcodes/i386-opc.h +++ b/opcodes/i386-opc.h @@ -256,6 +256,8 @@ enum i386_cpu CpuAMX_AVX512, /* Intel AMX-TF32 Instructions support required. 
*/ CpuAMX_TF32, + /* AMX-FP8 instructions required */ + CpuAMX_FP8, /* AMX-TILE instructions required */ CpuAMX_TILE, /* GFNI instructions required */ @@ -506,6 +508,7 @@ typedef union i386_cpu_flags unsigned int cpuamx_complex:1; unsigned int cpuamx_avx512:1; unsigned int cpuamx_tf32:1; + unsigned int cpuamx_fp8:1; unsigned int cpuamx_tile:1; unsigned int cpugfni:1; unsigned int cpuvaes:1; diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl index b24f255307e..22c5653c313 100644 --- a/opcodes/i386-opc.tbl +++ b/opcodes/i386-opc.tbl @@ -3217,12 +3217,16 @@ tcvtrowps2phh, 0x07, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, Re tcvtrowps2phl, 0x666d, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM } tcvtrowps2phl, 0xf277, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM } +tdpbf8ps, 0xfd, AMX_FP8, Modrm|Vex128|xVexMap5|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } +tdpbhf8ps, 0xf2fd, AMX_FP8, Modrm|Vex128|xVexMap5|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } tdpbf16ps, 0xf35c, AMX_BF16, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } tdpfp16ps, 0xf25c, AMX_FP16, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } tdpbssd, 0xf25e, AMX_INT8, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } tdpbuud, 0x5e, AMX_INT8, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } tdpbusd, 0x665e, AMX_INT8, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } tdpbsud, 0xf35e, AMX_INT8, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } +tdphbf8ps, 0xf3fd, AMX_FP8, Modrm|Vex128|xVexMap5|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } +tdphf8ps, 0x66fd, AMX_FP8, Modrm|Vex128|xVexMap5|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM } tileloadd, 0xf24b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM } tileloaddt1, 0x664b, APX_F(AMX_TILE), 
Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM } From patchwork Wed Nov 13 08:44:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Haochen Jiang X-Patchwork-Id: 100953 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id BEEF73858C60 for ; Wed, 13 Nov 2024 08:47:22 +0000 (GMT) X-Original-To: binutils@sourceware.org Delivered-To: binutils@sourceware.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.17]) by sourceware.org (Postfix) with ESMTPS id E892E3858C62 for ; Wed, 13 Nov 2024 08:44:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E892E3858C62 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org E892E3858C62 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.17 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1731487503; cv=none; b=IAf+24rI+X4ZFzIEgFIPRaP1Je6VOJWsG+Ps4B8RH2bwZfi/s3VmLAWsJhooU0ltKWL0bC2zPcSR99AsA5ngtccUlwidtD9AdQd6Pl5zU+zNfNdFbs6LkDbWHfohogZgHk8n1dDaZ5jxidZFdSumfqwgEHiHTAUGCSevYiLJGEQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1731487503; c=relaxed/simple; bh=woRofr51RC/B9URvAO/AY1HEB5x80Qm52HCDzqdYZ0M=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=LWsdRbsgmyGXMduCXTVFuSSh9Y6347gLFcyfNTWGwX+dgGslMVVbG7SMcbNvOfH208Q63vykuYAE7KWwYU5EVRxFNkAG1XWDU7kKxeDIjdAi0e85jRKB/Lq6ko0XvbfAHLo8NZthYY+RILmvgdMXgmGXg5hzbzN+aedF+c55X1M= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1731487493; x=1763023493; 
h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=woRofr51RC/B9URvAO/AY1HEB5x80Qm52HCDzqdYZ0M=; b=QVPCCccep5jbVF68nR6qTRAjNnJLvJ6Xi40hRMv010eN02RxG6crhXCN 3hCXFrtCsx+1y+xqufzg2IGyuamuEIuftMvqxrhvysmeJ+bZvJPTAZICK vWekIFjF6WbNrCigPmojLvkQiuKex+MwaDu5GO+6NwXt93+921ck3OSfp E0jkdWmCfanO8TnE7ijFkQ3VI3Xas/zFAsNdwZINu/A1xnJru+P21ZGa+ 3e/1tPL9af5Q7YPkJ53Scq+QLq2QWKNXIWPzWqG/JnzULzDzoceSpfMaG 30PtNIgQ8KKjiyVQrP+Ub8Kjrx3wOtZ+C0UIMcEzjId0KbjwCNFTAFY+D Q==; X-CSE-ConnectionGUID: 5qnmTmXjR0aOWvSnmQJ8xA== X-CSE-MsgGUID: xMsR33kwRyKIK2PMBcB2kA== X-IronPort-AV: E=McAfee;i="6700,10204,11254"; a="31458927" X-IronPort-AV: E=Sophos;i="6.12,150,1728975600"; d="scan'208";a="31458927" Received: from fmviesa007.fm.intel.com ([10.60.135.147]) by orvoesa109.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Nov 2024 00:44:50 -0800 X-CSE-ConnectionGUID: 7Zi2ZqrcQeKD+rDC4Lf65g== X-CSE-MsgGUID: S7ezyIyjSW+tC9YzilPNKg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,150,1728975600"; d="scan'208";a="87545478" Received: from shliclel4217.sh.intel.com ([10.239.240.127]) by fmviesa007.fm.intel.com with ESMTP; 13 Nov 2024 00:44:48 -0800 From: Haochen Jiang To: binutils@sourceware.org Cc: jbeulich@suse.com, hjl.tools@intel.com, "Hu, Lin1" Subject: [PATCH 6/6] Support Intel AMX-MOVRS Date: Wed, 13 Nov 2024 16:44:35 +0800 Message-Id: <20241113084435.1784546-7-haochen.jiang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20241113084435.1784546-1-haochen.jiang@intel.com> References: <20241113084435.1784546-1-haochen.jiang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: binutils@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list 
List-Id: Binutils mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: binutils-bounces~patchwork=sourceware.org@sourceware.org From: "Hu, Lin1" This patch adds support for the Intel AMX-MOVRS feature. No special handling is needed, since TMM pair operands are already handled by the AMX-TRANSPOSE support. gas/ChangeLog: * NEWS: Support Intel AMX-MOVRS. * config/tc-i386.c: Add amx_movrs. * doc/c-i386.texi: Document .amx_movrs. * testsuite/gas/i386/i386.exp: Run AMX-MOVRS tests. * testsuite/gas/i386/x86-64.exp: Ditto. * testsuite/gas/i386/amx-movrs-inval.l: New test. * testsuite/gas/i386/amx-movrs-inval.s: Ditto. * testsuite/gas/i386/x86-64-amx-movrs-intel.d: Ditto. * testsuite/gas/i386/x86-64-amx-movrs-inval.l: Ditto. * testsuite/gas/i386/x86-64-amx-movrs-inval.s: Ditto. * testsuite/gas/i386/x86-64-amx-movrs.d: Ditto. * testsuite/gas/i386/x86-64-amx-movrs.s: Ditto. opcodes/ChangeLog: * i386-dis.c (MOD_VEX_0F384A_X86_64): New. (MOD_VEX_MAP5_F8_X86_64): Ditto. (MOD_VEX_MAP5_F9_X86_64): Ditto. (PREFIX_VEX_0F384A_X86_64_M_0_L_0_W_0): Ditto. (PREFIX_VEX_MAP5_F8_X86_64_M_0_L_0_W_0): Ditto. (PREFIX_VEX_MAP5_F9_X86_64_M_0_L_0_W_0): Ditto. (X86_64_VEX_0F384A): Ditto. (X86_64_VEX_MAP5_F8): Ditto. (X86_64_VEX_MAP5_F9): Ditto. (VEX_LEN_0F384A_X86_64_M_0): Ditto. (VEX_LEN_MAP5_F8_X86_64_M_0): Ditto. (VEX_LEN_MAP5_F9_X86_64_M_0): Ditto. (VEX_W_0F384A_X86_64_M_0_L_0): Ditto. (VEX_W_MAP5_F8_X86_64_M_0_L_0): Ditto. (VEX_W_MAP5_F9_X86_64_M_0_L_0): Ditto. (prefix_table): Add PREFIX_VEX_0F384A_X86_64_M_0_L_0_W_0, PREFIX_VEX_MAP5_F8_X86_64_M_0_L_0_W_0 and PREFIX_VEX_MAP5_F9_X86_64_M_0_L_0_W_0. (x86_64_table): Add X86_64_VEX_0F384A, X86_64_VEX_MAP5_F8 and X86_64_VEX_MAP5_F9. (vex_table): Use X86_64_VEX_0F384A, X86_64_VEX_MAP5_F8 and X86_64_VEX_MAP5_F9. (vex_len_table): Add VEX_LEN_0F384A_X86_64_M_0, VEX_LEN_MAP5_F8_X86_64_M_0 and VEX_LEN_MAP5_F9_X86_64_M_0. (vex_w_table): Add VEX_W_0F384A_X86_64_M_0_L_0, VEX_W_MAP5_F8_X86_64_M_0_L_0 and VEX_W_MAP5_F9_X86_64_M_0_L_0. (mod_table): Add MOD_VEX_0F384A_X86_64, MOD_VEX_MAP5_F8_X86_64 and MOD_VEX_MAP5_F9_X86_64. * i386-gen.c (cpu_flag_init): Add CPU_AMX_MOVRS_FLAGS and CPU_ANY_AMX_MOVRS_FLAGS. * i386-init.h: Regenerated. * i386-mnem.h: Ditto. 
* i386-opc.h (CpuAMX_MOVRS): New. (i386_cpu_flags): Add cpuamx_movrs. * i386-opc.tbl: Add AMX-MOVRS instructions. * i386-tbl.h: Regenerated. --- gas/NEWS | 2 + gas/config/tc-i386.c | 1 + gas/doc/c-i386.texi | 4 +- gas/testsuite/gas/i386/amx-movrs-inval.l | 7 + gas/testsuite/gas/i386/amx-movrs-inval.s | 11 + gas/testsuite/gas/i386/i386.exp | 1 + .../gas/i386/x86-64-amx-movrs-intel.d | 23 + .../gas/i386/x86-64-amx-movrs-inval.l | 5 + .../gas/i386/x86-64-amx-movrs-inval.s | 9 + gas/testsuite/gas/i386/x86-64-amx-movrs.d | 21 + gas/testsuite/gas/i386/x86-64-amx-movrs.s | 31 + gas/testsuite/gas/i386/x86-64.exp | 3 + opcodes/i386-dis.c | 100 +- opcodes/i386-gen.c | 3 + opcodes/i386-init.h | 688 +-- opcodes/i386-mnem.h | 4362 +++++++++-------- opcodes/i386-opc.h | 3 + opcodes/i386-opc.tbl | 7 + opcodes/i386-tbl.h | 287 +- 19 files changed, 2942 insertions(+), 2626 deletions(-) create mode 100644 gas/testsuite/gas/i386/amx-movrs-inval.l create mode 100644 gas/testsuite/gas/i386/amx-movrs-inval.s create mode 100644 gas/testsuite/gas/i386/x86-64-amx-movrs-intel.d create mode 100644 gas/testsuite/gas/i386/x86-64-amx-movrs-inval.l create mode 100644 gas/testsuite/gas/i386/x86-64-amx-movrs-inval.s create mode 100644 gas/testsuite/gas/i386/x86-64-amx-movrs.d create mode 100644 gas/testsuite/gas/i386/x86-64-amx-movrs.s diff --git a/gas/NEWS b/gas/NEWS index ba63043002e..1cf6ca652b5 100644 --- a/gas/NEWS +++ b/gas/NEWS @@ -1,5 +1,7 @@ -*- text -*- +* Add support for Intel AMX-MOVRS instructions. + * Add support for Intel AMX-FP8 instructions. * Add support for Intel AMX-TF32 instructions. 
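Note for reviewers (not part of the patch): the memory-operand encodings expected by x86-64-amx-movrs.d, for example c4 a5 78 f8 b4 f5 00 00 00 10 for t2rpntlvwz0rs 0x10000000(%rbp,%r14,8),%tmm6, can be checked by hand by splitting out the ModRM and SIB bytes and applying the inverted VEX.X and VEX.B extension bits. The C sketch below does that for c4-prefixed instructions. It is a standalone illustration under stated assumptions: struct sib_operand, decode_vex3_sibmem and check_movrs_examples are hypothetical names, not binutils identifiers, and the "disp32 without base" form (mod == 0 with SIB.base == 5) is deliberately rejected.

```c
#include <assert.h>
#include <stdint.h>

/* Memory operand of a c4-prefixed instruction that uses a SIB byte.
   Illustrative names only; this is not how i386-dis.c stores them.  */
struct sib_operand
{
  unsigned reg;    /* ModRM.reg: the tile register number.  */
  unsigned base;   /* SIB.base, extended by the inverted VEX.B bit.  */
  int index;       /* SIB.index, extended by inverted VEX.X; -1 if none.  */
  unsigned scale;  /* 1, 2, 4 or 8.  */
  int32_t disp;    /* Displacement; 0 when mod == 0.  */
};

/* Decode C4 <byte1> <byte2> <opcode> <modrm> <sib> [disp8 | disp32].
   Return 0 on success, or -1 for forms this sketch does not cover.  */
int
decode_vex3_sibmem (const uint8_t *insn, struct sib_operand *op)
{
  unsigned mod, x, b, idx;

  if (insn[0] != 0xc4)
    return -1;
  mod = insn[4] >> 6;
  if (mod == 3 || (insn[4] & 7) != 4)   /* Register form or no SIB.  */
    return -1;
  if (mod == 0 && (insn[5] & 7) == 5)   /* disp32 without a base.  */
    return -1;

  x = ((insn[1] >> 6) & 1) ^ 1;         /* X and B are stored inverted.  */
  b = ((insn[1] >> 5) & 1) ^ 1;

  op->reg = (insn[4] >> 3) & 7;
  op->base = (insn[5] & 7) | (b << 3);
  idx = ((insn[5] >> 3) & 7) | (x << 3);
  op->index = idx == 4 ? -1 : (int) idx;  /* Index 4 means no index.  */
  op->scale = 1u << (insn[5] >> 6);

  if (mod == 1)
    op->disp = (int8_t) insn[6];
  else if (mod == 2)
    op->disp = (int32_t) ((uint32_t) insn[6]
                          | (uint32_t) insn[7] << 8
                          | (uint32_t) insn[8] << 16
                          | (uint32_t) insn[9] << 24);
  else
    op->disp = 0;
  return 0;
}

/* Check two expectations from x86-64-amx-movrs.d.  */
void
check_movrs_examples (void)
{
  static const uint8_t with_disp[10]
    = { 0xc4, 0xa5, 0x78, 0xf8, 0xb4, 0xf5, 0x00, 0x00, 0x00, 0x10 };
  static const uint8_t base_only[6]
    = { 0xc4, 0xc5, 0x78, 0xf8, 0x14, 0x21 };
  struct sib_operand op;

  /* t2rpntlvwz0rs 0x10000000(%rbp,%r14,8),%tmm6  */
  assert (decode_vex3_sibmem (with_disp, &op) == 0);
  assert (op.reg == 6 && op.base == 5 && op.index == 14);
  assert (op.scale == 8 && op.disp == 0x10000000);

  /* t2rpntlvwz0rs (%r9,%riz,1),%tmm2  */
  assert (decode_vex3_sibmem (base_only, &op) == 0);
  assert (op.reg == 2 && op.base == 9 && op.index == -1);
  assert (op.scale == 1 && op.disp == 0);
}
```

The second case also shows something visible in the test files themselves: the source operand is written as %tmm3 in x86-64-amx-movrs.s, while the expected disassembly prints %tmm2. This is presumably a consequence of the TMMPairOperand1 attribute on the t2rpntlvwz* rows; the tileloaddrs rows, which lack that attribute, keep %tmm3.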
diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index 504d1e099f4..588629751a2 100644 --- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -1186,6 +1186,7 @@ static const arch_entry cpu_arch[] = SUBARCH (amx_avx512, AMX_AVX512, ANY_AMX_AVX512, false), SUBARCH (amx_tf32, AMX_TF32, ANY_AMX_TF32, false), SUBARCH (amx_fp8, AMX_FP8, ANY_AMX_FP8, false), + SUBARCH (amx_movrs, AMX_MOVRS, ANY_AMX_MOVRS, false), SUBARCH (amx_tile, AMX_TILE, ANY_AMX_TILE, false), SUBARCH (movdiri, MOVDIRI, MOVDIRI, false), SUBARCH (movdir64b, MOVDIR64B, MOVDIR64B, false), diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi index bd2f585e1e3..caceb38025a 100644 --- a/gas/doc/c-i386.texi +++ b/gas/doc/c-i386.texi @@ -232,6 +232,7 @@ accept various extension mnemonics. For example, @code{amx_avx512}, @code{amx_tf32}, @code{amx_fp8} +@code{amx_movrs}, @code{amx_tile}, @code{vmx}, @code{vmfunc}, @@ -1705,7 +1706,8 @@ supported on the CPU specified. The choices for @var{cpu_type} are: @item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk} @item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_fp16} @item @samp{.amx_complex} @tab @samp{.amx_transpose} @tab @samp{.amx_avx512} -@item @samp{.amx_tf32} @tab @samp {.amx_fp8} @tab @samp{.amx_tile} +@item @samp{.amx_tf32} @tab @samp{.amx_fp8} @tab @samp{.amx_movrs} +@item @samp{.amx_tile} @item @samp{.kl} @tab @samp{.widekl} @tab @samp{.uintr} @tab @samp{.hreset} @item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5} @item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme} diff --git a/gas/testsuite/gas/i386/amx-movrs-inval.l b/gas/testsuite/gas/i386/amx-movrs-inval.l new file mode 100644 index 00000000000..6fa829849d5 --- /dev/null +++ b/gas/testsuite/gas/i386/amx-movrs-inval.l @@ -0,0 +1,7 @@ +.* Assembler messages: +.*:6: Error: `t2rpntlvwz0rs' is only supported in 64-bit mode +.*:7: Error: `t2rpntlvwz0rst1' is only supported in 64-bit mode +.*:8: Error: 
`t2rpntlvwz1rs' is only supported in 64-bit mode +.*:9: Error: `t2rpntlvwz1rst1' is only supported in 64-bit mode +.*:10: Error: `tileloaddrs' is only supported in 64-bit mode +.*:11: Error: `tileloaddrst1' is only supported in 64-bit mode diff --git a/gas/testsuite/gas/i386/amx-movrs-inval.s b/gas/testsuite/gas/i386/amx-movrs-inval.s new file mode 100644 index 00000000000..09778e790bd --- /dev/null +++ b/gas/testsuite/gas/i386/amx-movrs-inval.s @@ -0,0 +1,11 @@ +# Check Illegal 32bit AMX-MOVRS instructions + + .allow_index_reg + .text +_start: + t2rpntlvwz0rs 0x10000000(%esp, %esi, 8), %tmm6 + t2rpntlvwz0rst1 0x10000000(%esp, %esi, 8), %tmm6 + t2rpntlvwz1rs 0x10000000(%esp, %esi, 8), %tmm6 + t2rpntlvwz1rst1 0x10000000(%esp, %esi, 8), %tmm6 + tileloaddrs 0x10000000(%esp, %esi, 8), %tmm6 + tileloaddrst1 0x10000000(%esp, %esi, 8), %tmm6 diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp index df0b7752ab6..acc50eafd33 100644 --- a/gas/testsuite/gas/i386/i386.exp +++ b/gas/testsuite/gas/i386/i386.exp @@ -550,6 +550,7 @@ if [gas_32_check] then { run_list_test "amx-avx512-inval" run_list_test "amx-tf32-inval" run_list_test "amx-fp8-inval" + run_list_test "amx-movrs-inval" run_list_test "sg" run_dump_test "clzero" run_dump_test "invlpgb" diff --git a/gas/testsuite/gas/i386/x86-64-amx-movrs-intel.d b/gas/testsuite/gas/i386/x86-64-amx-movrs-intel.d new file mode 100644 index 00000000000..f4cd0bd0911 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-movrs-intel.d @@ -0,0 +1,23 @@ +#objdump: -dw -Mintel +#name: x86_64 AMX-MOVRS insns (Intel disassembly) +#source: x86-64-amx-movrs.s + +.*: +file format .* + +Disassembly of section \.text: + +#... 
+[a-f0-9]+ <_intel>: +\s*[a-f0-9]+:\s*c4 a5 78 f8 b4 f5 00 00 00 10\s+t2rpntlvwz0rs tmm6,\[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 c5 78 f8 14 21\s+t2rpntlvwz0rs tmm2,\[r9\+riz\*1\] +\s*[a-f0-9]+:\s*c4 a5 78 f9 b4 f5 00 00 00 10\s+t2rpntlvwz0rst1 tmm6,\[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 c5 78 f9 14 21\s+t2rpntlvwz0rst1 tmm2,\[r9\+riz\*1\] +\s*[a-f0-9]+:\s*c4 a5 79 f8 b4 f5 00 00 00 10\s+t2rpntlvwz1rs tmm6,\[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 c5 79 f8 14 21\s+t2rpntlvwz1rs tmm2,\[r9\+riz\*1\] +\s*[a-f0-9]+:\s*c4 a5 79 f9 b4 f5 00 00 00 10\s+t2rpntlvwz1rst1 tmm6,\[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 c5 79 f9 14 21\s+t2rpntlvwz1rst1 tmm2,\[r9\+riz\*1\] +\s*[a-f0-9]+:\s*c4 a2 7b 4a b4 f5 00 00 00 10\s+tileloaddrs tmm6,\[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 c2 7b 4a 1c 21\s+tileloaddrs tmm3,\[r9\+riz\*1\] +\s*[a-f0-9]+:\s*c4 a2 79 4a b4 f5 00 00 00 10\s+tileloaddrst1 tmm6,\[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 c2 79 4a 1c 21\s+tileloaddrst1 tmm3,\[r9\+riz\*1\] +#pass diff --git a/gas/testsuite/gas/i386/x86-64-amx-movrs-inval.l b/gas/testsuite/gas/i386/x86-64-amx-movrs-inval.l new file mode 100644 index 00000000000..0f7d72e73a9 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-movrs-inval.l @@ -0,0 +1,5 @@ +.* Assembler messages: +.*:6: Error: `t2rpntlvwz0rs' is not supported on `x86_64.noamx_transpose' +.*:7: Error: `t2rpntlvwz0rst1' is not supported on `x86_64.noamx_transpose' +.*:8: Error: `t2rpntlvwz1rs' is not supported on `x86_64.noamx_transpose' +.*:9: Error: `t2rpntlvwz1rst1' is not supported on `x86_64.noamx_transpose' diff --git a/gas/testsuite/gas/i386/x86-64-amx-movrs-inval.s b/gas/testsuite/gas/i386/x86-64-amx-movrs-inval.s new file mode 100644 index 00000000000..b6499c81c05 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-movrs-inval.s @@ -0,0 +1,9 @@ +# Check Invalid 64bit AMX-MOVRS instructions + + .text + .arch .noamx_transpose +_start: + t2rpntlvwz0rs (%r9), %tmm3 + t2rpntlvwz0rst1 (%r9), 
%tmm3 + t2rpntlvwz1rs (%r9), %tmm3 + t2rpntlvwz1rst1 (%r9), %tmm3 diff --git a/gas/testsuite/gas/i386/x86-64-amx-movrs.d b/gas/testsuite/gas/i386/x86-64-amx-movrs.d new file mode 100644 index 00000000000..b0bc77e8f15 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-movrs.d @@ -0,0 +1,21 @@ +#objdump: -dw +#name: x86_64 AMX-MOVRS insns + +.*: +file format .* + +Disassembly of section \.text: + +0+ <_start>: +\s*[a-f0-9]+:\s*c4 a5 78 f8 b4 f5 00 00 00 10\s+t2rpntlvwz0rs 0x10000000\(%rbp,%r14,8\),%tmm6 +\s*[a-f0-9]+:\s*c4 c5 78 f8 14 21\s+t2rpntlvwz0rs \(%r9,%riz,1\),%tmm2 +\s*[a-f0-9]+:\s*c4 a5 78 f9 b4 f5 00 00 00 10\s+t2rpntlvwz0rst1 0x10000000\(%rbp,%r14,8\),%tmm6 +\s*[a-f0-9]+:\s*c4 c5 78 f9 14 21\s+t2rpntlvwz0rst1 \(%r9,%riz,1\),%tmm2 +\s*[a-f0-9]+:\s*c4 a5 79 f8 b4 f5 00 00 00 10\s+t2rpntlvwz1rs 0x10000000\(%rbp,%r14,8\),%tmm6 +\s*[a-f0-9]+:\s*c4 c5 79 f8 14 21\s+t2rpntlvwz1rs \(%r9,%riz,1\),%tmm2 +\s*[a-f0-9]+:\s*c4 a5 79 f9 b4 f5 00 00 00 10\s+t2rpntlvwz1rst1 0x10000000\(%rbp,%r14,8\),%tmm6 +\s*[a-f0-9]+:\s*c4 c5 79 f9 14 21\s+t2rpntlvwz1rst1 \(%r9,%riz,1\),%tmm2 +\s*[a-f0-9]+:\s*c4 a2 7b 4a b4 f5 00 00 00 10\s+tileloaddrs 0x10000000\(%rbp,%r14,8\),%tmm6 +\s*[a-f0-9]+:\s*c4 c2 7b 4a 1c 21\s+tileloaddrs \(%r9,%riz,1\),%tmm3 +\s*[a-f0-9]+:\s*c4 a2 79 4a b4 f5 00 00 00 10\s+tileloaddrst1 0x10000000\(%rbp,%r14,8\),%tmm6 +\s*[a-f0-9]+:\s*c4 c2 79 4a 1c 21\s+tileloaddrst1 \(%r9,%riz,1\),%tmm3 +#pass diff --git a/gas/testsuite/gas/i386/x86-64-amx-movrs.s b/gas/testsuite/gas/i386/x86-64-amx-movrs.s new file mode 100644 index 00000000000..10bc9bc1299 --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-amx-movrs.s @@ -0,0 +1,31 @@ +# Check 64bit AMX-MOVRS instructions + + .text +_start: + t2rpntlvwz0rs 0x10000000(%rbp, %r14, 8), %tmm6 + t2rpntlvwz0rs (%r9), %tmm3 + t2rpntlvwz0rst1 0x10000000(%rbp, %r14, 8), %tmm6 + t2rpntlvwz0rst1 (%r9), %tmm3 + t2rpntlvwz1rs 0x10000000(%rbp, %r14, 8), %tmm6 + t2rpntlvwz1rs (%r9), %tmm3 + t2rpntlvwz1rst1 0x10000000(%rbp, %r14, 8), 
%tmm6 + t2rpntlvwz1rst1 (%r9), %tmm3 + tileloaddrs 0x10000000(%rbp, %r14, 8), %tmm6 + tileloaddrs (%r9), %tmm3 + tileloaddrst1 0x10000000(%rbp, %r14, 8), %tmm6 + tileloaddrst1 (%r9), %tmm3 + +_intel: + .intel_syntax noprefix + t2rpntlvwz0rs tmm6, [rbp+r14*8+0x10000000] + t2rpntlvwz0rs tmm3, [r9] + t2rpntlvwz0rst1 tmm6, [rbp+r14*8+0x10000000] + t2rpntlvwz0rst1 tmm3, [r9] + t2rpntlvwz1rs tmm6, [rbp+r14*8+0x10000000] + t2rpntlvwz1rs tmm3, [r9] + t2rpntlvwz1rst1 tmm6, [rbp+r14*8+0x10000000] + t2rpntlvwz1rst1 tmm3, [r9] + tileloaddrs tmm6, [rbp+r14*8+0x10000000] + tileloaddrs tmm3, [r9] + tileloaddrst1 tmm6, [rbp+r14*8+0x10000000] + tileloaddrst1 tmm3, [r9] diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp index 7d3f7ebe2b1..751c376ea84 100644 --- a/gas/testsuite/gas/i386/x86-64.exp +++ b/gas/testsuite/gas/i386/x86-64.exp @@ -535,6 +535,9 @@ run_list_test "x86-64-amx-tf32-inval" run_dump_test "x86-64-amx-fp8" run_dump_test "x86-64-amx-fp8-intel" run_list_test "x86-64-amx-fp8-inval" +run_dump_test "x86-64-amx-movrs" +run_dump_test "x86-64-amx-movrs-intel" +run_list_test "x86-64-amx-movrs-inval" run_dump_test "x86-64-clzero" run_dump_test "x86-64-mwaitx-bdver4" run_list_test "x86-64-mwaitx-reg" diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c index 0fe2a5b48ec..f859bcce853 100644 --- a/opcodes/i386-dis.c +++ b/opcodes/i386-dis.c @@ -964,8 +964,11 @@ enum MOD_0F38F8, MOD_VEX_0F3849_X86_64_L_0_W_0, + MOD_VEX_0F384A_X86_64, MOD_VEX_0F386E_X86_64, MOD_VEX_0F386F_X86_64, + MOD_VEX_MAP5_F8_X86_64, + MOD_VEX_MAP5_F9_X86_64, MOD_EVEX_MAP4_60, MOD_EVEX_MAP4_61, @@ -1135,6 +1138,7 @@ enum PREFIX_VEX_0F3848_X86_64_L_0_W_0, PREFIX_VEX_0F3849_X86_64_L_0_W_0_M_0, PREFIX_VEX_0F3849_X86_64_L_0_W_0_M_1, + PREFIX_VEX_0F384A_X86_64_M_0_L_0_W_0, PREFIX_VEX_0F384B_X86_64_L_0_W_0, PREFIX_VEX_0F3850_W_0, PREFIX_VEX_0F3851_W_0, @@ -1160,6 +1164,8 @@ enum PREFIX_VEX_0F38F6_L_0, PREFIX_VEX_0F38F7_L_0, PREFIX_VEX_0F3AF0_L_0, + 
PREFIX_VEX_MAP5_F8_X86_64_M_0_L_0_W_0,
+  PREFIX_VEX_MAP5_F9_X86_64_M_0_L_0_W_0,
   PREFIX_VEX_MAP5_FD_X86_64_L_0_W_0,
   PREFIX_VEX_MAP7_F6_L_0_W_0_R_0_X86_64,
   PREFIX_VEX_MAP7_F8_L_0_W_0_R_0_X86_64,
@@ -1358,6 +1364,7 @@ enum
 
   X86_64_VEX_0F3848,
   X86_64_VEX_0F3849,
+  X86_64_VEX_0F384A,
   X86_64_VEX_0F384B,
   X86_64_VEX_0F385C,
   X86_64_VEX_0F385E,
@@ -1368,6 +1375,8 @@ enum
   X86_64_VEX_0F386F,
   X86_64_VEX_0F38Ex,
 
+  X86_64_VEX_MAP5_F8,
+  X86_64_VEX_MAP5_F9,
   X86_64_VEX_MAP5_FD,
   X86_64_VEX_MAP7_F6_L_0_W_0_R_0,
   X86_64_VEX_MAP7_F8_L_0_W_0_R_0,
@@ -1453,6 +1462,7 @@ enum
   VEX_LEN_0F3841,
   VEX_LEN_0F3848_X86_64,
   VEX_LEN_0F3849_X86_64,
+  VEX_LEN_0F384A_X86_64_M_0,
   VEX_LEN_0F384B_X86_64,
   VEX_LEN_0F385A,
   VEX_LEN_0F385C_X86_64,
@@ -1500,6 +1510,8 @@ enum
   VEX_LEN_0F3ADE_W_0,
   VEX_LEN_0F3ADF,
   VEX_LEN_0F3AF0,
+  VEX_LEN_MAP5_F8_X86_64_M_0,
+  VEX_LEN_MAP5_F9_X86_64_M_0,
   VEX_LEN_MAP5_FD_X86_64,
   VEX_LEN_MAP7_F6,
   VEX_LEN_MAP7_F8,
@@ -1630,6 +1642,7 @@ enum
   VEX_W_0F3846,
   VEX_W_0F3848_X86_64_L_0,
   VEX_W_0F3849_X86_64_L_0,
+  VEX_W_0F384A_X86_64_M_0_L_0,
   VEX_W_0F384B_X86_64_L_0,
   VEX_W_0F3850,
   VEX_W_0F3851,
@@ -1677,6 +1690,8 @@ enum
   VEX_W_0F3ACE,
   VEX_W_0F3ACF,
   VEX_W_0F3ADE,
+  VEX_W_MAP5_F8_X86_64_M_0_L_0,
+  VEX_W_MAP5_F9_X86_64_M_0_L_0,
   VEX_W_MAP5_FD_X86_64_L_0,
   VEX_W_MAP7_F6_L_0,
   VEX_W_MAP7_F8_L_0,
@@ -4118,6 +4133,14 @@ static const struct dis386 prefix_table[][4] = {
     { RM_TABLE (RM_VEX_0F3849_X86_64_L_0_W_0_M_1_P_3) },
   },
 
+  /* PREFIX_VEX_0F384A_X86_64_M_0_L_0_W_0 */
+  {
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { "tileloaddrst1", { TMM, MVexSIBMEM }, 0 },
+    { "tileloaddrs", { TMM, MVexSIBMEM }, 0 },
+  },
+
   /* PREFIX_VEX_0F384B_X86_64_L_0_W_0 */
   {
     { Bad_Opcode },
@@ -4302,6 +4325,20 @@ static const struct dis386 prefix_table[][4] = {
     { "%XErorxS", { Gdq, Edq, Ib }, 0 },
   },
 
+  /* PREFIX_VEX_MAP5_F8_X86_64_M_0_L_0_W_0 */
+  {
+    { "t2rpntlvwz0rs", { TMM, MVexSIBMEM }, 0 },
+    { Bad_Opcode },
+    { "t2rpntlvwz1rs", { TMM, MVexSIBMEM }, 0 },
+  },
+
+  /* PREFIX_VEX_MAP5_F9_X86_64_M_0_L_0_W_0 */
+  {
+    { "t2rpntlvwz0rst1", { TMM, MVexSIBMEM }, 0 },
+    { Bad_Opcode },
+    { "t2rpntlvwz1rst1", { TMM, MVexSIBMEM }, 0 },
+  },
+
   /* PREFIX_VEX_MAP5_FD_X86_64_L_0_W_0 */
   {
     { "tdpbf8ps", { TMM, Rtmm, VexTmm }, 0 },
@@ -4658,6 +4695,12 @@ static const struct dis386 x86_64_table[][2] = {
     { VEX_LEN_TABLE (VEX_LEN_0F3849_X86_64) },
   },
 
+  /* X86_64_VEX_0F384A */
+  {
+    { Bad_Opcode },
+    { MOD_TABLE (MOD_VEX_0F384A_X86_64) },
+  },
+
   /* X86_64_VEX_0F384B */
   {
     { Bad_Opcode },
@@ -4712,6 +4755,18 @@ static const struct dis386 x86_64_table[][2] = {
     { "%XEcmp%CCxadd", { Mdq, Gdq, VexGdq }, PREFIX_DATA },
   },
 
+  /* X86_64_VEX_MAP5_F8 */
+  {
+    { Bad_Opcode },
+    { MOD_TABLE (MOD_VEX_MAP5_F8_X86_64) },
+  },
+
+  /* X86_64_VEX_MAP5_F9 */
+  {
+    { Bad_Opcode },
+    { MOD_TABLE (MOD_VEX_MAP5_F9_X86_64) },
+  },
+
   /* X86_64_VEX_MAP5_FD */
   {
     { Bad_Opcode },
@@ -6573,7 +6628,7 @@ static const struct dis386 vex_table[][256] = {
     /* 48 */
     { X86_64_TABLE (X86_64_VEX_0F3848) },
     { X86_64_TABLE (X86_64_VEX_0F3849) },
-    { Bad_Opcode },
+    { X86_64_TABLE (X86_64_VEX_0F384A) },
     { X86_64_TABLE (X86_64_VEX_0F384B) },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -7351,8 +7406,8 @@ static const struct dis386 vex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* f8 */
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { X86_64_TABLE (X86_64_VEX_MAP5_F8) },
+    { X86_64_TABLE (X86_64_VEX_MAP5_F9) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -7552,6 +7607,11 @@ static const struct dis386 vex_len_table[][2] = {
     { VEX_W_TABLE (VEX_W_0F3849_X86_64_L_0) },
   },
 
+  /* VEX_LEN_0F384A_X86_64_M_0 */
+  {
+    { VEX_W_TABLE (VEX_W_0F384A_X86_64_M_0_L_0) },
+  },
+
   /* VEX_LEN_0F384B_X86_64 */
   {
     { VEX_W_TABLE (VEX_W_0F384B_X86_64_L_0) },
@@ -7799,6 +7859,16 @@ static const struct dis386 vex_len_table[][2] = {
     { PREFIX_TABLE (PREFIX_VEX_0F3AF0_L_0) },
   },
 
+  /* VEX_LEN_MAP5_F8_X86_64_M_0 */
+  {
+    { VEX_W_TABLE (VEX_W_MAP5_F8_X86_64_M_0_L_0) },
+  },
+
+  /* VEX_LEN_MAP5_F9_X86_64_M_0 */
+  {
+    { VEX_W_TABLE (VEX_W_MAP5_F9_X86_64_M_0_L_0) },
+  },
+
   /* VEX_LEN_MAP5_FD_X86_64 */
   {
     { VEX_W_TABLE (VEX_W_MAP5_FD_X86_64_L_0) },
@@ -8246,6 +8316,10 @@ static const struct dis386 vex_w_table[][2] = {
     /* VEX_W_0F3849_X86_64_L_0 */
     { MOD_TABLE (MOD_VEX_0F3849_X86_64_L_0_W_0) },
   },
+  {
+    /* VEX_W_0F384A_X86_64_M_0_L_0 */
+    { PREFIX_TABLE (PREFIX_VEX_0F384A_X86_64_M_0_L_0_W_0) },
+  },
   {
     /* VEX_W_0F384B_X86_64_L_0 */
     { PREFIX_TABLE (PREFIX_VEX_0F384B_X86_64_L_0_W_0) },
@@ -8440,6 +8514,14 @@ static const struct dis386 vex_w_table[][2] = {
     /* VEX_W_0F3ADE */
     { VEX_LEN_TABLE (VEX_LEN_0F3ADE_W_0) },
   },
+  {
+    /* VEX_W_MAP5_F8_X86_64_M_0 */
+    { PREFIX_TABLE (PREFIX_VEX_MAP5_F8_X86_64_M_0_L_0_W_0) },
+  },
+  {
+    /* VEX_W_MAP5_F9_X86_64_M_0 */
+    { PREFIX_TABLE (PREFIX_VEX_MAP5_F9_X86_64_M_0_L_0_W_0) },
+  },
   {
     /* VEX_W_MAP5_FD_X86_64 */
     { PREFIX_TABLE (PREFIX_VEX_MAP5_FD_X86_64_L_0_W_0) },
@@ -8804,6 +8886,10 @@ static const struct dis386 mod_table[][2] = {
     { PREFIX_TABLE (PREFIX_VEX_0F3849_X86_64_L_0_W_0_M_0) },
     { PREFIX_TABLE (PREFIX_VEX_0F3849_X86_64_L_0_W_0_M_1) },
   },
+  {
+    /* MOD_VEX_0F384A_X86_64 */
+    { VEX_LEN_TABLE (VEX_LEN_0F384A_X86_64_M_0) },
+  },
   {
     /* MOD_VEX_0F386E_X86_64 */
     { VEX_LEN_TABLE (VEX_LEN_0F386E_X86_64_M_0) },
@@ -8812,6 +8898,14 @@ static const struct dis386 mod_table[][2] = {
     /* MOD_VEX_0F386F_X86_64 */
     { VEX_LEN_TABLE (VEX_LEN_0F386F_X86_64_M_0) },
   },
+  {
+    /* MOD_VEX_MAP5_F8_X86_64 */
+    { VEX_LEN_TABLE (VEX_LEN_MAP5_F8_X86_64_M_0) },
+  },
+  {
+    /* MOD_VEX_MAP5_F9_X86_64 */
+    { VEX_LEN_TABLE (VEX_LEN_MAP5_F9_X86_64_M_0) },
+  },
 
 #include "i386-dis-evex-mod.h"
 };
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index 965eca2e640..17073bec401 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -271,6 +271,8 @@ static const dependency isa_dependencies[] =
     "AMX_TILE" },
   { "AMX_FP8",
     "AMX_TILE" },
+  { "AMX_MOVRS",
+    "AMX_TILE" },
   { "KL",
     "SSE2" },
   { "WIDEKL",
@@ -441,6 +443,7 @@ static bitfield cpu_flags[] =
   BITFIELD (AMX_AVX512),
   BITFIELD (AMX_TF32),
   BITFIELD (AMX_FP8),
+  BITFIELD (AMX_MOVRS),
   BITFIELD (AMX_TILE),
   BITFIELD (MOVDIRI),
   BITFIELD (MOVDIR64B),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index a16ec56d355..27de44908cc 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -258,6 +258,8 @@ enum i386_cpu
   CpuAMX_TF32,
   /* AMX-FP8 instructions required */
   CpuAMX_FP8,
+  /* Intel AMX-MOVRS Instructions support required. */
+  CpuAMX_MOVRS,
   /* AMX-TILE instructions required */
   CpuAMX_TILE,
   /* GFNI instructions required */
@@ -509,6 +511,7 @@ typedef union i386_cpu_flags
       unsigned int cpuamx_avx512:1;
       unsigned int cpuamx_tf32:1;
       unsigned int cpuamx_fp8:1;
+      unsigned int cpuamx_movrs:1;
       unsigned int cpuamx_tile:1;
       unsigned int cpugfni:1;
       unsigned int cpuvaes:1;
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 22c5653c313..4dc08e1e105 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -3197,6 +3197,11 @@ t2rpntlvwz0t1, 0x6f, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW
 t2rpntlvwz1, 0x666e, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
 t2rpntlvwz1t1, 0x666f, AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
 
+t2rpntlvwz0rs, 0xf8, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+t2rpntlvwz0rst1, 0xf9, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+t2rpntlvwz1rs, 0x66f8, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+t2rpntlvwz1rst1, 0x66f9, AMX_MOVRS&AMX_TRANSPOSE, TMMPairOperand1|Sibmem|Vex128|xVexMap5|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+
 tcmmimfp16ps, 0x666c, AMX_COMPLEX, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM }
 tcmmrlfp16ps, 0x6c, AMX_COMPLEX, Modrm|Vex128|Space0F38|Src2VVVV|VexW0|NoSuf, { RegTMM, RegTMM, RegTMM }
 
@@ -3230,6 +3235,8 @@ tdphf8ps, 0x66fd, AMX_FP8, Modrm|Vex128|xVexMap5|Src2VVVV|VexW0|NoSuf, { RegTMM,
 
 tileloadd, 0xf24b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
 tileloaddt1, 0x664b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+tileloaddrs, 0xf24a, AMX_MOVRS, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
+tileloaddrst1, 0x664a, AMX_MOVRS, Sibmem|Vex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
 tilemovrow, 0x664a, AMX_AVX512, Modrm|EVex512|Space0F38|Src2VVVV|VexW0|NoSuf, { Reg32, RegTMM, RegZMM }
 tilemovrow, 0x6607, AMX_AVX512, Modrm|EVex512|Space0F3A|VexW0|NoSuf, { Imm8, RegTMM, RegZMM }
 tilestored, 0xf34b, APX_F(AMX_TILE), Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { RegTMM, Unspecified|BaseIndex }