From patchwork Fri Sep 15 08:47:42 2023
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 76105
Date: Fri, 15 Sep 2023 10:47:42 +0200
Subject: [PATCH 1/4] x86: fold certain VEX and EVEX templates
To: Binutils
Cc: "H.J. Lu"
References: <0690c179-ac98-d127-5ff4-b5abb725b6ae@suse.com>
In-Reply-To: <0690c179-ac98-d127-5ff4-b5abb725b6ae@suse.com>
From: Jan Beulich

In anticipation of APX, introduce logic to reduce the number of templates
we have now, which also somewhat limits the number of new ones we will
then need to add. The fundamental requirements are that
- attributes be compatible, which specifically means VexW needs to be
  the same in the templates (which often isn't the case, because VEX
  encodings have far more WIG than EVEX ones),
- the EVEX form be AVX512F (with or without AVX512VL), not any of its
  extensions (the same will then be required for APX - it'll need to be
  APX_F).

Note that in check_register() there's now a redundant zmm check. Since
this logic will need revisiting for APX anyway, I'd like to keep it that
way for now.
(Similarly a couple of if()-s which could be folded are kept separate, to
reduce code churn when adding APX support.)
---
RFC: Of course there are quite a few code changes, so there is the
     question of the savings being worth it. The AVX512F constraint may
     be possible to relax, but the change is big enough already.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -436,6 +436,7 @@ struct _i386_insn
 	vex_encoding_vex,
 	vex_encoding_vex3,
 	vex_encoding_evex,
+	vex_encoding_evex512,
 	vex_encoding_error
       } vec_encoding;
 
@@ -1872,6 +1873,13 @@ cpu_flags_and_not (i386_cpu_flags x, i38
 
 static const i386_cpu_flags avx512 = CPU_ANY_AVX512F_FLAGS;
 
+static INLINE bool need_evex_encoding (void)
+{
+  return i.vec_encoding == vex_encoding_evex
+	|| i.vec_encoding == vex_encoding_evex512
+	|| i.mask.reg;
+}
+
 #define CPU_FLAGS_ARCH_MATCH 0x1
 #define CPU_FLAGS_64BIT_MATCH 0x2
 
@@ -1899,6 +1907,27 @@ cpu_flags_match (const insn_template *t)
       /* This instruction is available only on some archs.  */
       i386_cpu_flags cpu = cpu_arch_flags;
 
+      /* Dual VEX/EVEX templates may need stripping of one of the flags.  */
+      if (t->opcode_modifier.vex && t->opcode_modifier.evex)
+	{
+	  /* Dual AVX/AVX512F templates need to retain AVX512F only if we already
+	     know that EVEX encoding will be needed.  */
+	  if ((x.bitfield.cpuavx || x.bitfield.cpuavx2)
+	      && x.bitfield.cpuavx512f)
+	    {
+	      if (need_evex_encoding ())
+		{
+		  x.bitfield.cpuavx = 0;
+		  x.bitfield.cpuavx2 = 0;
+		}
+	      else
+		{
+		  x.bitfield.cpuavx512f = 0;
+		  x.bitfield.cpuavx512vl = 0;
+		}
+	    }
+	}
+
       /* AVX512VL is no standalone feature - match it and then strip it.  */
       if (x.bitfield.cpuavx512vl && !cpu.bitfield.cpuavx512vl)
 	return match;
@@ -3646,6 +3675,27 @@ install_template (const insn_template *t
 
   i.tm = *t;
 
+  /* Dual VEX/EVEX templates need stripping one of the possible variants.  */
+  if (t->opcode_modifier.vex && t->opcode_modifier.evex)
+    {
+      if ((is_cpu (t, CpuAVX) || is_cpu (t, CpuAVX2))
+	  && is_cpu (t, CpuAVX512F))
+	{
+	  if (need_evex_encoding ())
+	    {
+	      i.tm.opcode_modifier.vex = 0;
+	      i.tm.cpu.bitfield.cpuavx = 0;
+	      if (is_cpu (&i.tm, CpuAVX2))
+		i.tm.cpu.bitfield.isa = 0;
+	    }
+	  else
+	    {
+	      i.tm.opcode_modifier.evex = 0;
+	      i.tm.cpu.bitfield.cpuavx512f = 0;
+	    }
+	}
+    }
+
   /* Note that for pseudo prefixes this produces a length of 1. But for them
      the length isn't interesting at all.  */
   for (l = 1; l < 4; ++l)
@@ -4553,6 +4603,8 @@ optimize_encoding (void)
 	      i.tm.opcode_modifier.vex = VEX128;
 	      i.tm.opcode_modifier.vexw = VEXW0;
 	      i.tm.opcode_modifier.evex = 0;
+	      i.vec_encoding = vex_encoding_vex;
+	      i.mask.reg = NULL;
 	    }
 	  else if (optimize > 1)
 	    i.tm.opcode_modifier.evex = EVEX128;
@@ -5438,6 +5490,11 @@ md_assemble (char *line)
   if (optimize && !i.no_optimize && i.tm.opcode_modifier.optimize)
     optimize_encoding ();
 
+  /* Past optimization there's no need to distinguish vex_encoding_evex and
+     vex_encoding_evex512 anymore.  */
+  if (i.vec_encoding == vex_encoding_evex512)
+    i.vec_encoding = vex_encoding_evex;
+
   if (use_unaligned_vector_move)
     encode_with_unaligned_vector_move ();
 
@@ -5467,6 +5524,7 @@ md_assemble (char *line)
 	if (i.tm.operand_types[j].bitfield.tmmword)
 	  i.xstate |= xstate_tmm;
 	else if (i.tm.operand_types[j].bitfield.zmmword
+		 && !i.tm.opcode_modifier.vex
 		 && vector_size >= VSZ512)
 	  i.xstate |= xstate_zmm;
 	else if (i.tm.operand_types[j].bitfield.ymmword
@@ -6468,7 +6526,8 @@ check_VecOperands (const insn_template *
   cpu = cpu_flags_and (cpu_flags_from_attr (t->cpu), avx512);
   if (!cpu_flags_all_zero (&cpu)
       && !is_cpu (t, CpuAVX512VL)
-      && !cpu_arch_flags.bitfield.cpuavx512vl)
+      && !cpu_arch_flags.bitfield.cpuavx512vl
+      && (!t->opcode_modifier.vex || need_evex_encoding()))
     {
       for (op = 0; op < t->operands; ++op)
 	{
@@ -6779,6 +6838,8 @@ check_VecOperands (const insn_template *
 
   /* Check vector Disp8 operand.  */
   if (t->opcode_modifier.disp8memshift
+      && (!t->opcode_modifier.vex
+	  || need_evex_encoding ())
       && i.disp_encoding <= disp_encoding_8bit)
     {
       if (i.broadcast.type || i.broadcast.bytes)
@@ -6874,7 +6935,8 @@ VEX_check_encoding (const insn_template
       return 1;
     }
 
-  if (i.vec_encoding == vex_encoding_evex)
+  if (i.vec_encoding == vex_encoding_evex
+      || i.vec_encoding == vex_encoding_evex512)
     {
       /* This instruction must be encoded with EVEX prefix.  */
       if (!is_evex_encoding (t))
@@ -11211,6 +11273,10 @@ s_insn (int dummy ATTRIBUTE_UNUSED)
       goto done;
     }
 
+  /* No need to distinguish vex_encoding_evex and vex_encoding_evex512.  */
+  if (i.vec_encoding == vex_encoding_evex512)
+    i.vec_encoding = vex_encoding_evex;
+
   /* Are we to emit ModR/M encoding?  */
   if (!i.short_form
       && (i.mem_operands
@@ -11633,6 +11699,12 @@ RC_SAE_specifier (const char *pstr)
 	      return NULL;
 	    }
 
+	  if (i.vec_encoding == vex_encoding_default)
+	    i.vec_encoding = vex_encoding_evex512;
+	  else if (i.vec_encoding != vex_encoding_evex
+		   && i.vec_encoding != vex_encoding_evex512)
+	    return NULL;
+
 	  i.rounding.type = RC_NamesTable[j].type;
 
 	  return (char *)(pstr + RC_NamesTable[j].len);
@@ -11692,6 +11764,12 @@ check_VecOperations (char *op_string)
 		}
 	      op_string++;
 
+	      if (i.vec_encoding == vex_encoding_default)
+		i.vec_encoding = vex_encoding_evex;
+	      else if (i.vec_encoding != vex_encoding_evex
+		       && i.vec_encoding != vex_encoding_evex512)
+		goto unknown_vec_op;
+
 	      i.broadcast.type = bcst_type;
 	      i.broadcast.operand = this_operand;
 
@@ -13953,8 +14031,17 @@ static bool check_register (const reg_en
 	}
     }
 
-  if (vector_size < VSZ512 && r->reg_type.bitfield.zmmword)
-    return false;
+  if (r->reg_type.bitfield.zmmword)
+    {
+      if (vector_size < VSZ512)
+	return false;
+
+      if (i.vec_encoding == vex_encoding_default)
+	i.vec_encoding = vex_encoding_evex512;
+      else if (i.vec_encoding != vex_encoding_evex
+	       && i.vec_encoding != vex_encoding_evex512)
+	i.vec_encoding = vex_encoding_error;
+    }
 
   if (vector_size < VSZ256 && r->reg_type.bitfield.ymmword)
     return false;
@@ -13979,7 +14066,8 @@ static bool check_register (const reg_en
 	  || flag_code != CODE_64BIT)
 	return false;
 
-      if (i.vec_encoding == vex_encoding_default)
+      if (i.vec_encoding == vex_encoding_default
+	  || i.vec_encoding == vex_encoding_evex512)
 	i.vec_encoding = vex_encoding_evex;
       else if (i.vec_encoding != vex_encoding_evex)
 	i.vec_encoding = vex_encoding_error;
--- a/gas/config/tc-i386-intel.c
+++ b/gas/config/tc-i386-intel.c
@@ -209,6 +209,11 @@ operatorT i386_operator (const char *nam
 		      || i386_types[j].sz[0] > 8
 		      || (i386_types[j].sz[0] & (i386_types[j].sz[0] - 1)))
 		    return O_illegal;
+		  if (i.vec_encoding == vex_encoding_default)
+		    i.vec_encoding = vex_encoding_evex;
+		  else if (i.vec_encoding != vex_encoding_evex
+			   && i.vec_encoding != vex_encoding_evex512)
+		    return O_illegal;
 		  if (!i.broadcast.bytes && !i.broadcast.type)
 		    {
 		      i.broadcast.bytes = i386_types[j].sz[0];
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -131,6 +131,8 @@
 #define EVexLIG EVex=EVEXLIG
 #define EVexDYN EVex=EVEXDYN
 
+#define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL
+
 #define Vsz256 Vsz=VSZ256
 #define Vsz512 Vsz=VSZ512
 
@@ -1518,8 +1520,8 @@ vdivs, 0x5e, AVX, Modrm|Vex
 vdppd, 0x6641, AVX, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vdpps, 0x6640, AVX, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vextractf128, 0x6619, AVX, Modrm|Vex=2|Space0F3A|VexW=1|NoSuf, { Imm8, RegYMM, Unspecified|BaseIndex|RegXMM }
-vextractps, 0x6617, AVX, Modrm|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex }
-vextractps, 0x6617, AVX|x64, RegMem|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, RegXMM, Reg64 }
+vextractps, 0x6617, AVX|AVX512F, Modrm|Vex128|EVex128|Space0F3A|VexWIG|Disp8MemShift=2|NoSuf, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex }
+vextractps, 0x6617, AVX|AVX512F|x64, RegMem|Vex128|EVex128|Space0F3A|VexWIG|NoSuf, { Imm8, RegXMM, Reg64 }
 vhaddpd, 0x667c, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vhaddps, 0xf27c, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vhsubpd, 0x667d, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
@@ -1541,7 +1543,7 @@ vmovap, 0x28, AVX, D|Modrm|
 // by Intel AVX spec). To avoid extra template in gcc x86 backend and
 // support assembler for AMD64, we accept 64bit operand on vmovd so
 // that we can use one template for both SSE and AVX instructions.
-vmovd, 0x666e, AVX, D|Modrm|Vex=1|Space0F|NoSuf, { Reg32|Unspecified|BaseIndex, RegXMM }
+vmovd, 0x666e, AVX|AVX512F, D|Modrm|Vex128|EVex128|Space0F|Disp8MemShift=2|NoSuf, { Reg32|Unspecified|BaseIndex, RegXMM }
 vmovd, 0x667e, AVX|x64, D|RegMem|Vex=1|Space0F|VexW=2|NoSuf|Size64, { RegXMM, Reg64 }
 vmovddup, 0xf212, AVX, Modrm|Vex|Space0F|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vmovddup, 0xf212, AVX, Modrm|Vex=2|Space0F|VexWIG|NoSuf, { Unspecified|BaseIndex|RegYMM, RegYMM }
@@ -1559,7 +1561,7 @@ vmovntdqa, 0x662a, AVX|AVX2, Modrm|Vex|S
 vmovntp, 0x2b, AVX, Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex }
 vmovq, 0xf37e, AVX, Load|Modrm|Vex=1|Space0F|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vmovq, 0x66d6, AVX, Modrm|Vex=1|Space0F|VexWIG|NoSuf, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
-vmovq, 0x666e, AVX|x64, D|Modrm|Vex=1|Space0F|VexW=2|NoSuf, { Reg64|Unspecified|BaseIndex, RegXMM }
+vmovq, 0x666e, AVX|AVX512F|x64, D|Modrm|Vex128|EVex128|Space0F|VexW1|Disp8MemShift=3|NoSuf, { Reg64|Unspecified|BaseIndex, RegXMM }
 vmovs, 0x10, AVX, D|Modrm|VexLIG|Space0F|VexWIG|NoSuf, { |Unspecified|BaseIndex, RegXMM }
 vmovs, 0x10, AVX, D|Modrm|VexLIG|Space0F|VexVVVV|VexWIG|NoSuf, { RegXMM, RegXMM, RegXMM }
 vmovshdup, 0xf316, AVX, Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
@@ -1599,8 +1601,10 @@ vpcmpgtq, 0x6637, AVX|AVX2, Modrm|Vex|Sp
 vpcmpistri, 0x6663, AVX, Modrm|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM }
 vpcmpistrm, 0x6662, AVX, Modrm|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM }
 vperm2f128, 0x6606, AVX, Modrm|Vex256|Space0F3A|VexVVVV|VexW0|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vpermilp, 0x660c | , AVX, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpermilp, 0x6604 | , AVX, Modrm|Vex|Space0F3A|VexW0|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
+vpermilps, 0x660c, AVX|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpermilps, 0x6604, AVX|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F3A|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
+vpermilpd, 0x660d, AVX, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpermilpd, 0x6605, AVX, Modrm|Vex|Space0F3A|VexW0|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vpextr, 0x6616, AVX|, Modrm|Vex|Space0F3A||NoSuf, { Imm8, RegXMM, |Unspecified|BaseIndex }
 vpextrw, 0x66c5, AVX, Load|Modrm|Vex|Space0F|VexWIG|No_bSuf|No_wSuf|No_sSuf, { Imm8, RegXMM, Reg32|Reg64 }
 vpextr, 0x6614 | , AVX, RegMem|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, RegXMM, Reg32|Reg64 }
@@ -1632,18 +1636,18 @@ vpminub, 0x66da, AVX|AVX2, Modrm|C|Vex|S
 vpminud, 0x663b, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpminuw, 0x663a, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpmovmskb, 0x66d7, AVX|AVX2, Modrm|Vex|Space0F|VexWIG|No_bSuf|No_wSuf|No_sSuf, { RegXMM|RegYMM, Reg32|Reg64 }
-vpmovsxbd, 0x6621, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vpmovsxbq, 0x6622, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Word|Unspecified|BaseIndex|RegXMM, RegXMM }
+vpmovsxbd, 0x6621, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM }
+vpmovsxbq, 0x6622, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=1|NoSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM }
 vpmovsxbw, 0x6620, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vpmovsxdq, 0x6625, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vpmovsxwd, 0x6623, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vpmovsxwq, 0x6624, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vpmovzxbd, 0x6631, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vpmovzxbq, 0x6632, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Word|Unspecified|BaseIndex|RegXMM, RegXMM }
+vpmovsxwd, 0x6623, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
+vpmovsxwq, 0x6624, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM }
+vpmovzxbd, 0x6631, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM }
+vpmovzxbq, 0x6632, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=1|NoSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM }
 vpmovzxbw, 0x6630, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vpmovzxdq, 0x6635, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vpmovzxwd, 0x6633, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vpmovzxwq, 0x6634, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
+vpmovzxwd, 0x6633, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
+vpmovzxwq, 0x6634, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM }
 vpmuldq, 0x6628, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpmulhrsw, 0x660b, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpmulhuw, 0x66e4, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
@@ -1710,39 +1714,40 @@ vzeroupper, 0x77, AVX, Vex|Space0F|VexWI
 
 // 256bit integer AVX2 instructions.
 
-vpmovsxbd, 0x6621, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegYMM }
-vpmovsxbq, 0x6622, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegYMM }
+vpmovsxbd, 0x6621, AVX2|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM }
+vpmovsxbq, 0x6622, AVX2|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
 vpmovsxbw, 0x6620, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegYMM }
 vpmovsxdq, 0x6625, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegYMM }
-vpmovsxwd, 0x6623, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegYMM }
-vpmovsxwq, 0x6624, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegYMM }
-vpmovzxbd, 0x6631, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegYMM }
-vpmovzxbq, 0x6632, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegYMM }
+vpmovsxwd, 0x6623, AVX2|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexWIG|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM }
+vpmovsxwq, 0x6624, AVX2|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM }
+vpmovzxbd, 0x6631, AVX2|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM }
+vpmovzxbq, 0x6632, AVX2|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
 vpmovzxbw, 0x6630, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegYMM }
 vpmovzxdq, 0x6635, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegYMM }
-vpmovzxwd, 0x6633, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegYMM }
-vpmovzxwq, 0x6634, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegYMM }
+vpmovzxwd, 0x6633, AVX2|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexWIG|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM }
+vpmovzxwq, 0x6634, AVX2|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM }
 
 // New AVX2 instructions.
 
 vbroadcasti128, 0x665A, AVX2, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { Xmmword|Unspecified|BaseIndex, RegYMM }
 vbroadcastsd, 0x6619, AVX2, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { RegXMM, RegYMM }
-vbroadcastss, 0x6618, AVX2, Modrm|Vex|Space0F38|VexW=1|NoSuf, { RegXMM, RegXMM|RegYMM }
+vbroadcastss, 0x6618, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexW0|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vpblendd, 0x6602, AVX2, Modrm|Vex|Space0F3A|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpbroadcast, 0x6678 | , AVX2, Modrm|Vex|Space0F38|VexW0|NoSuf, { |Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM }
-vpbroadcast, 0x6658 | , AVX2, Modrm|Vex|Space0F38|VexW0|NoSuf|Optimize, { |Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM }
+vpbroadcastd, 0x6658, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexW0|Disp8MemShift|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
+vpbroadcastq, 0x6659, AVX2, Modrm|Vex|Space0F38|VexW0|NoSuf|Optimize, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM }
 vperm2i128, 0x6646, AVX2, Modrm|Vex=2|Space0F3A|VexVVVV|VexW0|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vpermd, 0x6636, AVX2, Modrm|Vex256|Space0F38|VexVVVV|VexW0|NoSuf, { Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vpermpd, 0x6601, AVX2, Modrm|Vex=2|Space0F3A|VexW1|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegYMM, RegYMM }
-vpermps, 0x6616, AVX2, Modrm|Vex256|Space0F38|VexVVVV|VexW0|NoSuf, { Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vpermq, 0x6600, AVX2, Modrm|Vex=2|Space0F3A|VexW1|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegYMM, RegYMM }
+vpermd, 0x6636, AVX2|AVX512F, Modrm|Vex256|EVexDYN|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
+vpermpd, 0x6601, AVX2|AVX512F, Modrm|Vex256|EVexDYN|Masking|Space0F3A|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM }
+vpermps, 0x6616, AVX2|AVX512F, Modrm|Vex256|EVexDYN|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
+vpermq, 0x6600, AVX2|AVX512F, Modrm|Vex256|EVexDYN|Masking|Space0F3A|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM }
 vextracti128, 0x6639, AVX2, Modrm|Vex=2|Space0F3A|VexW=1|NoSuf, { Imm8, RegYMM, Unspecified|BaseIndex|RegXMM }
 vinserti128, 0x6638, AVX2, Modrm|Vex256|Space0F3A|VexVVVV|VexW0|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegYMM, RegYMM }
 vpmaskmov, 0x668e, AVX2, Modrm|Vex|Space0F38|VexVVVV||CheckOperandSize|NoSuf, { RegXMM|RegYMM, RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex }
 vpmaskmov, 0x668c, AVX2, Modrm|Vex|Space0F38|VexVVVV||CheckOperandSize|NoSuf, { Xmmword|Ymmword|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsllv, 0x6647, AVX2, Modrm|Vex|Space0F38|VexVVVV||CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsravd, 0x6646, AVX2, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsrlv, 0x6645, AVX2, Modrm|Vex|Space0F38|VexVVVV||CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsllv, 0x6647, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsravd, 0x6646, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsrlv, 0x6645, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 // AVX gather instructions
 vgatherdpd, 0x6692, AVX2, Modrm|Vex|Space0F38|VexVVVV|VexW1|SwapSources|CheckOperandSize|NoSuf|VecSIB128, { RegXMM|RegYMM, Qword|Unspecified|BaseIndex, RegXMM|RegYMM }
@@ -1779,7 +1784,7 @@ vpclmulhqhqdq, 0x6644/0x11, AVX|PCLMULQD
 
 vgf2p8affineinvqb, 0x66cf, AVX|GFNI, Modrm|Vex|Space0F3A|VexVVVV|VexW1|CheckOperandSize|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vgf2p8affineqb, 0x66ce, AVX|GFNI, Modrm|Vex|Space0F3A|VexVVVV|VexW1|CheckOperandSize|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vgf2p8mulb, 0x66cf, AVX|GFNI, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vgf2p8mulb, 0x66cf, GFNI|AVX|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVV|VexW0|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 // FSGSBASE, RDRND and F16C
 
@@ -2082,8 +2087,6 @@ vpclmulhqhqdq, 0x6644/0x11, VPCLMULQDQ,
 
 // AVX512F instructions.
 
-#define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL
-
 , 0x6615, AVX512F, Modrm|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 vprorv, 0x6614, AVX512F, Modrm|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsllv, 0x6647, AVX512F, Modrm|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsrav, 0x6646, AVX512F, Modrm|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsrlv, 0x6645, AVX512F, Modrm|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsravq, 0x6646, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 vpternlog, 0x6625, AVX512F, Modrm|Masking|Space0F3A|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 vbroadcastf32x4, 0x661A, AVX512F, Modrm|Masking|Space0F38|VexW=1|Disp8MemShift=4|NoSuf, { XMMword|Unspecified|BaseIndex, RegYMM|RegZMM }
@@ -2153,10 +2154,9 @@ vbroadcasti32x4, 0x665A, AVX512F, Modrm|
 vbroadcastf64x4, 0x661B, AVX512F, Modrm|EVex=1|Masking|Space0F38|VexW=2|Disp8MemShift=5|NoSuf, { YMMword|Unspecified|BaseIndex, RegZMM }
 vbroadcasti64x4, 0x665B, AVX512F, Modrm|EVex=1|Masking|Space0F38|VexW=2|Disp8MemShift=5|NoSuf, { YMMword|Unspecified|BaseIndex, RegZMM }
 
-vbroadcastss, 0x6618, AVX512F, Modrm|Masking|Space0F38|VexW0|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vbroadcastsd, 0x6619, AVX512F, Modrm|Masking|Space0F38|VexW1|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM }
 
-vpbroadcast, 0x6658 | , AVX512F, Modrm|Masking|Space0F38||Disp8MemShift|NoSuf, { RegXMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
+vpbroadcastq, 0x6659, AVX512F, Modrm|Masking|Space0F38|VexW1|Disp8MemShift|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vpbroadcast, 0x667c, AVX512F, Modrm|Masking|Space0F38||NoSuf, { , RegXMM|RegYMM|RegZMM }
 
 vcmpp, 0xC2/0x, AVX512F, Modrm|Masking|Space0F|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt|SAE, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
@@ -2246,9 +2246,6 @@ vextracti32x4, 0x6639, AVX512F, Modrm|Ma
 vextractf64x4, 0x661B, AVX512F, Modrm|EVex=1|Masking|Space0F3A|VexW=2|Disp8MemShift=5|NoSuf, { Imm8, RegZMM, RegYMM|Unspecified|BaseIndex }
 vextracti64x4, 0x663B, AVX512F, Modrm|EVex=1|Masking|Space0F3A|VexW=2|Disp8MemShift=5|NoSuf, { Imm8, RegZMM, RegYMM|Unspecified|BaseIndex }
 
-vextractps, 0x6617, AVX512F, Modrm|EVex128|Space0F3A|VexWIG|Disp8MemShift=2|NoSuf, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex }
-vextractps, 0x6617, AVX512F|x64, RegMem|EVex128|Space0F3A|VexWIG|NoSuf, { Imm8, RegXMM, Reg64 }
-
 vfixupimmp, 0x6654, AVX512F, Modrm|Masking|Space0F3A|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 vfixupimms, 0x6655, AVX512F, Modrm|EVexLIG|Masking|Space0F3A|VexVVVV||Disp8MemShift|NoSuf|SAE, { Imm8|Imm8S, RegXMM||Unspecified|BaseIndex, RegXMM, RegXMM }
 
@@ -2304,8 +2301,6 @@ vmovap, 0x28, AVX512F, D|Mo
 vmovntp, 0x2B, AVX512F, Modrm|Space0F||Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM, XMMword|YMMword|ZMMword|Unspecified|BaseIndex }
 vmovup, 0x10, AVX512F,
D|Modrm|Masking|Space0F||Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vmovd, 0x666E, AVX512F, D|Modrm|EVex=2|Space0F|Disp8MemShift=2|NoSuf, { Reg32|Unspecified|BaseIndex, RegXMM } - vmovddup, 0xF212, AVX512F, Modrm|Masking|Space0F|VexW=2|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Unspecified|BaseIndex, RegYMM|RegZMM } vmovdqa64, 0x666F, AVX512F, D|Modrm|Masking|Space0F|VexW=2|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } @@ -2322,7 +2317,6 @@ vmovhp, 0x17, AVX512F, Modr vmovlp, 0x12, AVX512F, Modrm|EVexLIG|Space0F|VexVVVV||Disp8MemShift=3|NoSuf, { Qword|Unspecified|BaseIndex, RegXMM, RegXMM } vmovlp, 0x13, AVX512F, Modrm|EVexLIG|Space0F||Disp8MemShift=3|NoSuf, { RegXMM, Qword|Unspecified|BaseIndex } -vmovq, 0x666E, AVX512F|x64, D|Modrm|EVex128|Space0F|VexW1|Disp8MemShift=3|NoSuf, { Reg64|Unspecified|BaseIndex, RegXMM } vmovq, 0xF37E, AVX512F, Load|Modrm|EVex=2|Space0F|VexW1|Disp8MemShift=3|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM } vmovq, 0x66D6, AVX512F, Modrm|EVex=2|Space0F|VexW1|Disp8MemShift=3|NoSuf, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM } @@ -2360,15 +2354,10 @@ vpcmpu, 0x661e/, AVX vptestm, 0x6627, AVX512F, Modrm|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask } vptestnm, 0xf327, AVX512F, Modrm|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask } -vpermd, 0x6636, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM } -vpermps, 0x6616, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegYMM|RegZMM, 
RegYMM|RegZMM } - -vpermilp, 0x6604 | , AVX512F, Modrm|Masking|Space0F3A||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vpermilp, 0x660C | , AVX512F, Modrm|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } +vpermilpd, 0x6605, AVX512F, Modrm|Masking|Space0F3A|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vpermilpd, 0x660d, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } -vpermpd, 0x6601, AVX512F, Modrm|Masking|Space0F3A|VexW=2|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM } vpermpd, 0x6616, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM } -vpermq, 0x6600, AVX512F, Modrm|Masking|Space0F3A|VexW=2|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM } vpermq, 0x6636, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM } vpmovdb, 0xF331, AVX512F, Modrm|EVex=1|Masking|Space0F38|VexW=1|Disp8MemShift=4|NoSuf, { RegZMM, RegXMM|Unspecified|BaseIndex } @@ -2593,31 +2582,11 @@ vpmovsqw, 0xF324, AVX512F|AVX512VL, Modr vpmovusqw, 0xF314, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexW0|Disp8MemShift=2|NoSuf, { RegXMM, RegXMM|Dword|Unspecified|BaseIndex } vpmovusqw, 0xF314, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexW0|Disp8MemShift=3|NoSuf, { RegYMM, 
RegXMM|Qword|Unspecified|BaseIndex } -vpmovsxbd, 0x6621, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM } -vpmovsxbd, 0x6621, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM } -vpmovzxbd, 0x6631, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM } -vpmovzxbd, 0x6631, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM } - -vpmovsxbq, 0x6622, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexWIG|Disp8MemShift=1|NoSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM } -vpmovsxbq, 0x6622, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM } -vpmovzxbq, 0x6632, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexWIG|Disp8MemShift=1|NoSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM } -vpmovzxbq, 0x6632, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM } - vpmovsxdq, 0x6625, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexW0|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM } vpmovsxdq, 0x6625, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexW=1|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM } vpmovzxdq, 0x6635, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexW0|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM } vpmovzxdq, 0x6635, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexW=1|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM } -vpmovsxwd, 0x6623, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM } -vpmovsxwd, 0x6623, AVX512F|AVX512VL, 
Modrm|EVex=3|Masking|Space0F38|VexWIG|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM } -vpmovzxwd, 0x6633, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM } -vpmovzxwd, 0x6633, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexWIG|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM } - -vpmovsxwq, 0x6624, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM } -vpmovsxwq, 0x6624, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM } -vpmovzxwq, 0x6634, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM } -vpmovzxwq, 0x6634, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM } - // AVX512VL instructions end. // AVX512BW instructions. 
@@ -2960,7 +2929,6 @@ vpshufbitqmb, 0x668f, AVX512_BITALG, Mod vgf2p8affineinvqb, 0x66cf, GFNI|AVX512F, Modrm|Masking|Space0F3A|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } vgf2p8affineqb, 0x66ce, GFNI|AVX512F, Modrm|Masking|Space0F3A|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } -vgf2p8mulb, 0x66cf, GFNI|AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW0|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } // AVX512 + GFNI instructions end From patchwork Fri Sep 15 08:48:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 76106 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id DE8A93858035 for ; Fri, 15 Sep 2023 08:48:43 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org DE8A93858035 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1694767723; bh=2/MuA+RgV3QQvb7GRDyhKEzhcweFs1Wy/cm3matiJ4Q=; h=Date:Subject:To:Cc:References:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=JBGvGLLZZONuiLwhLuXq9MQ7S9AR+oTROCDS3yBxKTpnAs5EUpQcup1Kv70lzYZsQ oHoYdYnGIo6MyNIprxH9lR/CObrKeDYpMpj7Ruyc6JFLfqHG0recuIoElawPXHSc3q ReWUtFvA8DtGvpFouOVaINR3sgFppUYPJHzR7slA= X-Original-To: binutils@sourceware.org Delivered-To: binutils@sourceware.org Received: from EUR05-AM6-obe.outbound.protection.outlook.com (mail-am6eur05on2057.outbound.protection.outlook.com [40.107.22.57]) by sourceware.org (Postfix) with ESMTPS id D4612385771B for ; Fri, 15 
Sep 2023 08:48:10 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D4612385771B ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=XA3u8r50WjfD2Xk4sWF2vz/n3pcXbnIKSEQAC+AD2wU0vQm+95seaXV6JkftIZleqZyHkwHMdAZfh3GN0N013QmGGcS0t4qjvmeNbkdU+pRGB7I7ug0uOsU5+7eyGm6tuThP4uzba9R5XVxKnrhs9gJZpDMweWFkzXiFYnyjdqMFeheFDZIasgzuYWYDp7o1IhQsCmmI+vdL0W/8AhbbAVuxsXmmXdgivP4Wwckp6DxR50uILXoNPCpHwgEoPJeWac33pEKclEH4TYis4eH0RnilytfuWK4NdbNWKylZibVrTCJZ5DvFl2UT5Gyn1HD0IUaGfJLA0aGKufn0JBVn2Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=2/MuA+RgV3QQvb7GRDyhKEzhcweFs1Wy/cm3matiJ4Q=; b=PRMEajdACfp5A/HZKNIsdoAcb/mkw3AMcyjX4OAqkcylOZBp3/yJZES5BRttNBogWmADFevEm/yFmiSe2uKfgOCcE7KK1abXHy7AnuI5DUBixBbJ7xlBQkfSqvqXCoO+fYqHA9gRhkt4bMFMlheSZle/H2mUIVCPVG73GcyIFPjLy4URrJNmC8X8tBvWpbfkyNfAlVzx3IHzJaCjrmGHyEBGNU1JQCjNIZB2nrQelIUshWKHwuqWmmF13fY+8Q2HRg4Y2t2xA0oSogJVrdp62hqKfqADcVFb5JyQPqsoebGLP4N6E+w4zfn4LOTKsmGxr0JRLNYVBko+e+u/DobT4w== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none Received: from DU2PR04MB8790.eurprd04.prod.outlook.com (2603:10a6:10:2e1::23) by AM9PR04MB8382.eurprd04.prod.outlook.com (2603:10a6:20b:3ea::11) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6792.21; Fri, 15 Sep 2023 08:48:08 +0000 Received: from DU2PR04MB8790.eurprd04.prod.outlook.com ([fe80::f749:b27f:2187:6654]) by DU2PR04MB8790.eurprd04.prod.outlook.com ([fe80::f749:b27f:2187:6654%6]) with mapi id 15.20.6792.020; Fri, 15 Sep 2023 08:48:08 +0000 Message-ID: <7fb71e79-5846-1ae6-6446-d91c507afef4@suse.com> Date: Fri, 15 Sep 2023 10:48:06 +0200 User-Agent: Mozilla/5.0 (Windows NT 
10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.15.1 Subject: [PATCH 2/4] x86: fold VAES/VPCLMULQDQ VEX and EVEX templates Content-Language: en-US To: Binutils Cc: "H.J. Lu" References: <0690c179-ac98-d127-5ff4-b5abb725b6ae@suse.com> In-Reply-To: <0690c179-ac98-d127-5ff4-b5abb725b6ae@suse.com> X-ClientProxiedBy: FR0P281CA0243.DEUP281.PROD.OUTLOOK.COM (2603:10a6:d10:af::10) To DU2PR04MB8790.eurprd04.prod.outlook.com (2603:10a6:10:2e1::23) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DU2PR04MB8790:EE_|AM9PR04MB8382:EE_ X-MS-Office365-Filtering-Correlation-Id: b888fc00-514f-4c9d-2408-08dbb5c87bef X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 4F0vYm1t7Wb8FK2p2IZkZUp1o6idS5eAjJ/dQoPjDw7kVaVasow8vwckf65dBfYZiE2ipKmn4jx8NHnin4nKIxXabu3b2HMS3vkUMfpJwrrqM5cA/SqqUhIpAsWZ+XCpyA8Ro8uH9Vs//LnH5Uj4t8VUEr3Pazfd73HDGtSMPjxCRtiIzhacgKqL87wCwpVehaNUkdDuaiHbqoASKrRgtVZwMnkoknMR9hR2XFSnOq9NQLEQK2cyimP6uJ4Vqu/WJOETQKJYhUER1h+m8fesCa9ni/waMrfGWtgeQPk/FMauqhf4deUQzVYJ1KdTfPU7x2jCxpUxOHIJUaheGVa7WUExBI8WzjIlySE/je3RYOVC44HtYR+ZD3UwYFvyaMq07xCzy8KzA41vQ+07RViVmOomM0PBCVAcr+CgJ630Uf4/oL/tGgnsJaEeB1NIQWroleO4Uk6ElCqBSAiu9sYHP9Q4ZK40uWFJ+UvRp5mdqFiaybFug26mFh0Lsj+lQe0B04hLrli8XfbEc2BDz+iUVlqfztq+etn+tlqQ7RWJnW3uHZdD9gf0S4Q9b+z8+8WxKpmlElRXM+hYK1vCPhZzlwuXxlzuAaeqGAxNqAHlQX9MQK5SD4ZT6wC6zimXQRXTsA6nLY/1JJ+wdEhAZdbtcQ== X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:DU2PR04MB8790.eurprd04.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230031)(396003)(376002)(346002)(366004)(136003)(39860400002)(1800799009)(451199024)(186009)(31686004)(5660300002)(41300700001)(6916009)(316002)(478600001)(86362001)(31696002)(66946007)(66556008)(66476007)(36756003)(38100700002)(8676002)(4326008)(8936002)(6486002)(6506007)(26005)(2906002)(6512007)(2616005)(43740500002)(45980500001); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 
1 X-OriginatorOrg:
suse.com
X-MS-Exchange-CrossTenant-Network-Message-Id: b888fc00-514f-4c9d-2408-08dbb5c87bef
X-MS-Exchange-CrossTenant-AuthSource: DU2PR04MB8790.eurprd04.prod.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Internal
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 15 Sep 2023 08:48:08.2772 (UTC)
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-CrossTenant-Id: f7a17af6-1c5c-4a36-aa8b-f5be247aa4ba
X-MS-Exchange-CrossTenant-MailboxType: HOSTED
X-MS-Exchange-CrossTenant-UserPrincipalName: AnIsoqcknJG3UaGEiqyKM2K1I4m98FMAq/ZktNCYtPLz14h2uBGpfd9eq0UdJIeuTwjMR2tCNRgIbvMfOVPpiA==
X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM9PR04MB8382
X-Spam-Status: No, score=-3026.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: binutils@sourceware.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Binutils mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-Patchwork-Original-From: Jan Beulich via Binutils
From: Jan Beulich
Reply-To: Jan Beulich
Errors-To: binutils-bounces+patchwork=sourceware.org@sourceware.org
Sender: "Binutils"

Following the folding of some generic AVX/AVX2 templates with their
AVX512F counterpart ones, do this for VAES and VPCLMULQDQ ones as well.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1942,7 +1942,17 @@ cpu_flags_match (const insn_template *t)
       cpu = cpu_flags_and (x, cpu);
       if (!cpu_flags_all_zero (&cpu))
         {
-          if (x.bitfield.cpuavx)
+          if (t->cpu.bitfield.cpuavx && t->cpu.bitfield.cpuavx512f)
+            {
+              if ((need_evex_encoding ()
+                   ? cpu.bitfield.cpuavx512f
+                   : cpu.bitfield.cpuavx)
+                  && (!x.bitfield.cpugfni || cpu.bitfield.cpugfni)
+                  && (!x.bitfield.cpuvaes || cpu.bitfield.cpuvaes)
+                  && (!x.bitfield.cpuvpclmulqdq || cpu.bitfield.cpuvpclmulqdq))
+                match |= CPU_FLAGS_ARCH_MATCH;
+            }
+          else if (x.bitfield.cpuavx)
             {
               /* We need to check a few extra flags with AVX. */
               if (cpu.bitfield.cpuavx
@@ -1957,9 +1967,7 @@ cpu_flags_match (const insn_template *t)
             {
               /* We need to check a few extra flags with AVX512F. */
               if (cpu.bitfield.cpuavx512f
-                  && (!x.bitfield.cpugfni || cpu.bitfield.cpugfni)
-                  && (!x.bitfield.cpuvaes || cpu.bitfield.cpuvaes)
-                  && (!x.bitfield.cpuvpclmulqdq || cpu.bitfield.cpuvpclmulqdq))
+                  && (!x.bitfield.cpugfni || cpu.bitfield.cpugfni))
                 match |= CPU_FLAGS_ARCH_MATCH;
             }
           else
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -2068,20 +2068,20 @@ vsm4rnds4, 0xf2da, SM4, Modrm|Space0F38|

 // VAES

-vaesdec, 0x66de, VAES, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vaesdeclast, 0x66df, VAES, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vaesenc, 0x66dc, VAES, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vaesenclast, 0x66dd, VAES, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vaesdec, 0x66de, VAES|AVX|AVX512F, Modrm|Vex|EVexDYN|Space0F38|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vaesdeclast, 0x66df, VAES|AVX|AVX512F, Modrm|Vex|EVexDYN|Space0F38|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vaesenc, 0x66dc, VAES|AVX|AVX512F, Modrm|Vex|EVexDYN|Space0F38|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vaesenclast, 0x66dd, VAES|AVX|AVX512F, Modrm|Vex|EVexDYN|Space0F38|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }

 // VAES instructions end

 // VPCLMULQDQ instructions

-vpclmulqdq, 0x6644, VPCLMULQDQ, Modrm|Vex|Space0F3A|VexWIG|VexVVVV|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpclmullqlqdq, 0x6644/0x00, VPCLMULQDQ, Modrm|Vex|Space0F3A|VexWIG|VexVVVV|CheckOperandSize|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpclmulhqlqdq, 0x6644/0x01, VPCLMULQDQ, Modrm|Vex|Space0F3A|VexWIG|VexVVVV|CheckOperandSize|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpclmullqhqdq, 0x6644/0x10, VPCLMULQDQ, Modrm|Vex|Space0F3A|VexWIG|VexVVVV|CheckOperandSize|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpclmulhqhqdq, 0x6644/0x11, VPCLMULQDQ, Modrm|Vex|Space0F3A|VexWIG|VexVVVV|CheckOperandSize|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpclmulqdq, 0x6644, VPCLMULQDQ|AVX|AVX512F, Modrm|Space0F3A|Vex|EVexDYN|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpclmullqlqdq, 0x6644/0x00, VPCLMULQDQ|AVX|AVX512F, Modrm|Space0F3A|Vex|EVexDYN|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpclmulhqlqdq, 0x6644/0x01, VPCLMULQDQ|AVX|AVX512F, Modrm|Space0F3A|Vex|EVexDYN|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpclmullqhqdq, 0x6644/0x10, VPCLMULQDQ|AVX|AVX512F, Modrm|Space0F3A|Vex|EVexDYN|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpclmulhqhqdq, 0x6644/0x11, VPCLMULQDQ|AVX|AVX512F, Modrm|Space0F3A|Vex|EVexDYN|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }

 // VPCLMULQDQ instructions end

@@ -2932,25 +2932,6 @@ vgf2p8affineqb, 0x66ce, GFNI|AVX512F, Mo

 // AVX512 + GFNI instructions end

-// AVX512 + VAES instructions
-
-vaesdec, 0x66de, VAES|AVX512F, Modrm|Space0F38|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vaesdeclast, 0x66df, VAES|AVX512F, Modrm|Space0F38|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vaesenc, 0x66dc, VAES|AVX512F, Modrm|Space0F38|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vaesenclast, 0x66dd, VAES|AVX512F, Modrm|Space0F38|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-
-// AVX512 + VAES instructions end
-
-// AVX512 + VPCLMULQDQ instructions
-
-vpclmulqdq, 0x6644, VPCLMULQDQ|AVX512F, Modrm|Space0F3A|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpclmullqlqdq, 0x6644/0x00, VPCLMULQDQ|AVX512F, Modrm|Space0F3A|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpclmulhqlqdq, 0x6644/0x01, VPCLMULQDQ|AVX512F, Modrm|Space0F3A|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpclmullqhqdq, 0x6644/0x10, VPCLMULQDQ|AVX512F, Modrm|Space0F3A|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpclmulhqhqdq, 0x6644/0x11, VPCLMULQDQ|AVX512F, Modrm|Space0F3A|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-
-// AVX512 + VPCLMULQDQ instructions end
-
 // INVLPGB instructions

 invlpgb, 0xf01fe, INVLPGB, NoSuf, {}

From patchwork Fri Sep 15 08:48:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 76107
Return-Path:
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 879973856090 for ; Fri, 15 Sep 2023 08:49:51 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 879973856090
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1694767791; bh=A1jFMW3M2qvOBBoe3qsBCGuqg2GrDSGur13ki+iGT0s=; h=Date:Subject:To:Cc:References:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=fWLI2vh7MGGZCtkbzQyRifiLvtjybR1pEAnw4qpVM6QCcfH+ZdWSaeJwIwMuAuXX6 z72PpbrtOvZU6KdP//ifkWkBbkNScwu/6FO/Mq/2RZrJQuiFM9c0bv+Fcf7/SE4f6k yIdeTBYZZhFijzJ9kwPeMA6gs8D/3ioHfdhEhqjs=
X-Original-To: binutils@sourceware.org
Delivered-To: binutils@sourceware.org
Received: from EUR04-HE1-obe.outbound.protection.outlook.com (mail-he1eur04on2045.outbound.protection.outlook.com [40.107.7.45]) by sourceware.org (Postfix) with ESMTPS id 157AF3856243 for ; Fri, 15 Sep 2023 08:48:47 +0000 (GMT)
DMARC-Filter: OpenDMARC
Filter v1.4.2 sourceware.org 157AF3856243 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Dw/LBjdNc35/0Fnav6PT8aArmFxgoWaXq1DqxQqy8I+YsDW90wPJgJ+PY0aeJzJ3kaRdMLeFjhQv2wLwshPjhEoInm6wxCUu8ZrcIaBucy6oxs6sxK6AMUHSpKWKup8TOO7diQgu5edeUbFe+w2T1KWcGCDgCsCanm7a4tNkYiQXr+6DXpslD7AAbFNGTaToR68W1xacmcGsilWnwYMu5uQ8yF64aMWrNY2Hfh3f7TTecPfosxTaaldD3w96227WtXC3Pu9dWTPUdmW/Ysfxp45A0Lyi89zn2FK+B8fJJPDi/s4vQJ4VAd9mlCrzCiF1qi5LmdoaXi3wnwTSpxOkfw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=A1jFMW3M2qvOBBoe3qsBCGuqg2GrDSGur13ki+iGT0s=; b=ZIfp6Yq4157Qrsu+oGNEsd/al2IyO9O5gA4DJJvWxjdZg7vweLdUOrcqLw+DX/81ZTJsecZ1vwO5a1ZP+hkZs9rYjuA9+7g6vTDstEMfoil3MOyTsjrj7ERcEscCqdX1Xaz5TZZvV9H5HztYJh0i7sJjJolkNQ9QngfLNV7igAmtRi1rzbhZn61Oxhh7Hqg72X1z+T5rg9OaHcu1YBRkgkfd5dJn10wTcw/8VOh6J8u9B1s4P2Csq9QVvjg0s/0+HdvEIpfzha+5E0FxE5lKKt+x1xqo0jdS01NsauuVEQtuY5KXKIWECBRFZ0ZI4JM91L8MCV/9QEmWNa6hPSfC6A== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none Received: from DU2PR04MB8790.eurprd04.prod.outlook.com (2603:10a6:10:2e1::23) by DBAPR04MB7320.eurprd04.prod.outlook.com (2603:10a6:10:1a8::23) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6792.21; Fri, 15 Sep 2023 08:48:44 +0000 Received: from DU2PR04MB8790.eurprd04.prod.outlook.com ([fe80::f749:b27f:2187:6654]) by DU2PR04MB8790.eurprd04.prod.outlook.com ([fe80::f749:b27f:2187:6654%6]) with mapi id 15.20.6792.020; Fri, 15 Sep 2023 08:48:44 +0000 Message-ID: <737ed0bc-7dc2-19e6-5138-8891f0c2fd15@suse.com> Date: Fri, 15 Sep 2023 10:48:42 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 
Thunderbird/102.15.1 Subject: [PATCH RFC 3/4] x86: fold FMA VEX and EVEX templates Content-Language: en-US To: Binutils Cc: "H.J. Lu" References: <0690c179-ac98-d127-5ff4-b5abb725b6ae@suse.com> In-Reply-To: <0690c179-ac98-d127-5ff4-b5abb725b6ae@suse.com> X-ClientProxiedBy: FR2P281CA0130.DEUP281.PROD.OUTLOOK.COM (2603:10a6:d10:9e::11) To DU2PR04MB8790.eurprd04.prod.outlook.com (2603:10a6:10:2e1::23) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DU2PR04MB8790:EE_|DBAPR04MB7320:EE_ X-MS-Office365-Filtering-Correlation-Id: dc1f78d2-0905-49ff-7609-08dbb5c8916a X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: PlslWCYOty/M4+W1NDZoXgcI+K0aZ7eMWHHeGzfL1jgSdEWItX6pO4LL0cvP1sNykYwhM0TxX6ak+CGnEdLVq9itOSnu0TgZD9sP9pU815+++x1B4QSWRNknFncK81ojXD4MdHlSI0BJExJCACIzTfpcz/0dJW4yEhz+RKnmW4NPV7mdlY8ntL5MikoW+gXyZ43ZC8aygpFDXOPh3aasT/ypK1AG/irDOVoMFCMTTspJz6jIFnMJB0p8vWkVC7Nkynzpi4lSnx3r0YTFs8qvbghSpc9RZSg5aKp4Rt+AfaBCAELpuRQsCGXDUvmus3iBZfdc9BIurpPxiwCrjpiw+Pur5G8/rxR4a2NYsFSt37gRVm20jOLSuCfxaUsmSVkEvHKcVIBhEjl8vvMeGHxidKcpRJYNPbnhbvsmAhQfXZkdDymmztQN9AI1aJflhVf9YrcgXWv2B2/zEF3z4f8k3KsD7EraAy0r2AI8bpH+2xjuXj5TNhwHU0aPEAnrA7Y7RKbIbJWUIsd/rVBOUiKSUJDKt2YYHA6o+lsE2/HgV3ic0BkOBpYh+hGSJ9jn2/KyVh96wHbEZHXqZoV6XMh0DNbL63Db8+WH3PoPJq/8crjCvjTALEwAsngY3REWk3Yizt9sT+k6ZnL75poiPax9mQ== X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:DU2PR04MB8790.eurprd04.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230031)(136003)(366004)(39860400002)(346002)(376002)(396003)(451199024)(1800799009)(186009)(6506007)(6486002)(36756003)(31696002)(86362001)(26005)(66946007)(6512007)(2906002)(66556008)(38100700002)(478600001)(2616005)(31686004)(5660300002)(66476007)(4326008)(316002)(6916009)(41300700001)(8936002)(8676002)(45980500001)(43740500002); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
X-OriginatorOrg: suse.com
X-MS-Exchange-CrossTenant-Network-Message-Id: dc1f78d2-0905-49ff-7609-08dbb5c8916a X-MS-Exchange-CrossTenant-AuthSource: DU2PR04MB8790.eurprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 15 Sep 2023 08:48:44.1976 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: f7a17af6-1c5c-4a36-aa8b-f5be247aa4ba X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: Sw/CgQBVMMSLQTvuMeKRIKsSQ3us5RcHHgcfBqIT4+wyySp2EW5/Wx2mT6mupmwd7hC9P3MKhTF84otqlI6q9w== X-MS-Exchange-Transport-CrossTenantHeadersStamped: DBAPR04MB7320 X-Spam-Status: No, score=-3026.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: binutils@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Binutils mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Jan Beulich via Binutils From: Jan Beulich Reply-To: Jan Beulich Errors-To: binutils-bounces+patchwork=sourceware.org@sourceware.org Sender: "Binutils" Following the folding of some generic AVX/AVX2 templates with their AVX512F counterpart ones, do this for FMA ones as well, requiring one further adjustment to cpu_flags_match(). Note that this has a perhaps unexpected effect, resulting from FMA not being listed as a prereq of AVX512F: With just the latter enabled, VEX-encodings can now be emitted (but still not 128- or 256-bit EVEX-encodings, where AVX512VL of course continues to be required). --- RFC: Considering earlier discussion, the mentioned side effect likely means we don't really want this change, despite the significant reduction of the number of templates. 
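For illustration only, the side effect described above can be modelled with a reduced version of the condition the hunk below adds to cpu_flags_match(). This is a sketch under stated assumptions, not the gas implementation: struct flags, fma_template_usable() and fma_examples() are hypothetical names, the three booleans stand in for the much larger CPU flag bitfield, "cpu" is assumed to be the template's flags masked by the enabled ones, and the folded template is assumed to name AVX, AVX512F and FMA together.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for gas's CPU flag bitfield.  */
struct flags
{
  bool avx;
  bool avx512f;
  bool fma;
};

/* TMPL: features named by the (folded) template.
   ARCH: features currently enabled, standing in for cpu_arch_flags.
   NEED_EVEX: stands in for the result of need_evex_encoding ().  */
static bool
fma_template_usable (struct flags tmpl, struct flags arch, bool need_evex)
{
  /* Assumption: mirrors "cpu" in the hunk, i.e. the template's flags
     masked by the enabled ones.  */
  struct flags cpu = {
    .avx = tmpl.avx && arch.avx,
    .avx512f = tmpl.avx512f && arch.avx512f,
    .fma = tmpl.fma && arch.fma,
  };

  /* First clause picks the base feature by encoding; second clause is
     the one added by this patch.  */
  return (need_evex ? cpu.avx512f : cpu.avx)
	 && (!tmpl.fma || cpu.fma || arch.avx512f);
}

/* Usage sketch: the folded template under different enabled sets.  */
void
fma_examples (void)
{
  struct flags tmpl = { .avx = true, .avx512f = true, .fma = true };
  struct flags avx512f_no_fma = { .avx = true, .avx512f = true };
  struct flags avx_only = { .avx = true };

  /* AVX512F (and hence AVX) enabled, FMA not: VEX form is accepted.  */
  assert (fma_template_usable (tmpl, avx512f_no_fma, false));
  /* Only AVX enabled: VEX form is still rejected.  */
  assert (!fma_template_usable (tmpl, avx_only, false));
}
```

In this model the VEX form passes with AVX512F enabled and FMA disabled purely because of the added "|| arch.avx512f" term, which is the behaviour the RFC note questions. The AVX512VL requirement for 128- and 256-bit EVEX forms is deliberately left out of the sketch.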
--- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -1947,6 +1947,8 @@ cpu_flags_match (const insn_template *t) if ((need_evex_encoding () ? cpu.bitfield.cpuavx512f : cpu.bitfield.cpuavx) + && (!x.bitfield.cpufma || cpu.bitfield.cpufma + || cpu_arch_flags.bitfield.cpuavx512f) && (!x.bitfield.cpugfni || cpu.bitfield.cpugfni) && (!x.bitfield.cpuvaes || cpu.bitfield.cpuvaes) && (!x.bitfield.cpuvpclmulqdq || cpu.bitfield.cpuvpclmulqdq)) --- a/opcodes/i386-opc.tbl +++ b/opcodes/i386-opc.tbl @@ -1802,16 +1802,21 @@ vcvtps2ph, 0x661d, F16C, Modrm|Vex=2|Spa -vfmaddp, 0x6688 | 0x, FMA, Modrm|Vex|Space0F38|VexVVVV||CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } -vfmadds, 0x6689 | 0x, FMA, Modrm|VexLIG|Space0F38|VexVVVV||NoSuf, { |Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM } -vfmaddsubp, 0x6686 | 0x, FMA, Modrm|Vex|Space0F38|VexVVVV||CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } -vfmsubp, 0x668a | 0x, FMA, Modrm|Vex|Space0F38|VexVVVV||CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } -vfmsubs, 0x668b | 0x, FMA, Modrm|VexLIG|Space0F38|VexVVVV||NoSuf, { |Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM } -vfmsubaddp, 0x6687 | 0x, FMA, Modrm|Vex|Space0F38|VexVVVV||CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } -vfnmaddp, 0x668c | 0x, FMA, Modrm|Vex|Space0F38|VexVVVV||CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } -vfnmadds, 0x668d | 0x, FMA, Modrm|VexLIG|Space0F38|VexVVVV||NoSuf, { |Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM } -vfnmsubp, 0x668e | 0x, FMA, Modrm|Vex|Space0F38|VexVVVV||CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } -vfnmsubs, 0x668f | 0x, FMA, Modrm|VexLIG|Space0F38|VexVVVV||NoSuf, { |Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM } + + +vfmaddp, 0x6688 | 0x, , 
Modrm||Masking||VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } +vfmadds, 0x6689 | 0x, , Modrm||Masking||VexVVVV||Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM||Unspecified|BaseIndex, RegXMM, RegXMM } +vfmaddsubp, 0x6686 | 0x, , Modrm||Masking||VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } +vfmsubp, 0x668a | 0x, , Modrm||Masking||VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } +vfmsubs, 0x668b | 0x, , Modrm||Masking||VexVVVV||Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM||Unspecified|BaseIndex, RegXMM, RegXMM } +vfmsubaddp, 0x6687 | 0x, , Modrm||Masking||VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } +vfnmaddp, 0x668c | 0x, , Modrm||Masking||VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } +vfnmadds, 0x668d | 0x, , Modrm||Masking||VexVVVV||Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM||Unspecified|BaseIndex, RegXMM, RegXMM } +vfnmsubp, 0x668e | 0x, , Modrm||Masking||VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } +vfnmsubs, 0x668f | 0x, , Modrm||Masking||VexVVVV||Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM||Unspecified|BaseIndex, RegXMM, RegXMM } // HLE prefixes @@ -2087,11 +2092,6 @@ vpclmulhqhqdq, 0x6644/0x11, VPCLMULQDQ|A // AVX512F instructions. - - // is used for EVEX instructions with x/y suffixes. 
, 0x27, vrndscalep, 0x08 | , , Modrm|Masking|Space0F3A||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { Imm8, RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } vrndscales, 0x0a | , , Modrm|EVexLIG|Masking|Space0F3A|VexVVVV||Disp8MemShift|NoSuf|SAE, { Imm8, RegXMM||Unspecified|BaseIndex, RegXMM, RegXMM } -vfmaddp, 0x6688 | 0x, , Modrm|Masking||VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } -vfmadds, 0x6689 | 0x, , Modrm|EVexLIG|Masking||VexVVVV||Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM||Unspecified|BaseIndex, RegXMM, RegXMM } -vfmaddsubp, 0x6686 | 0x, , Modrm|Masking||VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } -vfmsubp, 0x668a | 0x, , Modrm|Masking||VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } -vfmsubs, 0x668b | 0x, , Modrm|EVexLIG|Masking||VexVVVV||Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM||Unspecified|BaseIndex, RegXMM, RegXMM } -vfmsubaddp, 0x6687 | 0x, , Modrm|Masking||VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } -vfnmaddp, 0x668c | 0x, , Modrm|Masking||VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } -vfnmadds, 0x668d | 0x, , Modrm|EVexLIG|Masking||VexVVVV||Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM||Unspecified|BaseIndex, RegXMM, RegXMM } -vfnmsubp, 0x668e | 0x, , Modrm|Masking||VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, 
RegXMM|RegYMM|RegZMM } -vfnmsubs, 0x668f | 0x, , Modrm|EVexLIG|Masking||VexVVVV||Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM||Unspecified|BaseIndex, RegXMM, RegXMM } - vscalefp, 0x662c, , Modrm|Masking||VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } vscalefs, 0x662d, , Modrm|EVexLIG|Masking||VexVVVV||Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM||Unspecified|BaseIndex, RegXMM, RegXMM } From patchwork Fri Sep 15 08:49:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 76108 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 8876D385840C for ; Fri, 15 Sep 2023 08:50:21 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 8876D385840C DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1694767821; bh=In4EbmEwIY8LqRxIhsKilKClMWQo3yittxI6QamXKV0=; h=Date:Subject:To:Cc:References:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=aBJPBsDlie6hF1RetXN2rqukNTzucJpRs6jCaNW9ycHK0orBOme0dUJcpNOjvKxfi OnioHa/tdFWEsUGnjXK3YVGRqdKzV8vDCn0kFzx4LirBLVwAvh+YDXRx59k1Yg3aQi 9kymqrJbixGadMVkFqrKhEnXYCiUb5CAUhNiDJ2s= X-Original-To: binutils@sourceware.org Delivered-To: binutils@sourceware.org Received: from EUR04-VI1-obe.outbound.protection.outlook.com (mail-vi1eur04on2084.outbound.protection.outlook.com [40.107.8.84]) by sourceware.org (Postfix) with ESMTPS id 1516D385CC83 for ; Fri, 15 Sep 2023 08:49:13 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1516D385CC83 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; 
b=BUlMdQ1QHKZ2uXLzo+/OFn3gzTT+amZJkWOgEomDi9Bo7lFNL8ApMVD/ZY7tPji+I9GOQFwGrfxvcgF7s3UmPgZ6ci7nkmhkQgopqzKGI/SNE2Tc03aCA/75/fDAFOXNoJNGVLYHAGOsO/Nh7LI0IXeNHYNL60Qlqs5Wtp25N2PQSjKr6ExGI9H36xYIWmECXkrnVQMHRV60uWz68cpyNQikQCuCnQFXyfJMeuQPfehkxmsrM/2wYJHAyKj9UdMujkA/7auaP5/bLU/W7ZwMXPxTrv8o7MbYoFkA+r9gSUpYwHA0ZZeNzJtwm1G7UW1uuUZF2xZoMIaWEZhMc7hukQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=In4EbmEwIY8LqRxIhsKilKClMWQo3yittxI6QamXKV0=; b=agLP6h1PgXB3LnzGNPqs6XWqrNOy5AAEEVkNVIot9Rhg1K060PKrJhd9IBUfmJnDb6HVUVkMd3Nojgc+YGm5f+ys42vmsALWqtgp3diz0ZcHIqMnc2I3RDKhOhRO1YljRJ9CRFa8xHFN7CxGTi3WEm4GxWS+kf83cmPnO3U6LZ7su/jIUVtZXNbkeovlJ40KlaWdNl9gmxIUam33T8eKkjsysNXaLJIcUWuKA6BYCo1N/UJGR0b+mqGNeVifonjiPTjPTtR64Tn73Ity4utxe6G6/4osRY8bzH7yM+bl/FqBtgXD0/RCTY12cgRBfNBbMxcsFLeg9zTkOWC1aV6p9Q== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none Received: from DU2PR04MB8790.eurprd04.prod.outlook.com (2603:10a6:10:2e1::23) by DBAPR04MB7320.eurprd04.prod.outlook.com (2603:10a6:10:1a8::23) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6792.21; Fri, 15 Sep 2023 08:49:10 +0000 Received: from DU2PR04MB8790.eurprd04.prod.outlook.com ([fe80::f749:b27f:2187:6654]) by DU2PR04MB8790.eurprd04.prod.outlook.com ([fe80::f749:b27f:2187:6654%6]) with mapi id 15.20.6792.020; Fri, 15 Sep 2023 08:49:10 +0000 Message-ID: Date: Fri, 15 Sep 2023 10:49:09 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.15.1 Subject: [PATCH RFC 4/4] x86: fold F16C VEX and EVEX templates Content-Language: en-US To: Binutils Cc: "H.J. 
Lu" References: <0690c179-ac98-d127-5ff4-b5abb725b6ae@suse.com> In-Reply-To: <0690c179-ac98-d127-5ff4-b5abb725b6ae@suse.com> X-ClientProxiedBy: FR2P281CA0124.DEUP281.PROD.OUTLOOK.COM (2603:10a6:d10:9d::18) To DU2PR04MB8790.eurprd04.prod.outlook.com (2603:10a6:10:2e1::23) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DU2PR04MB8790:EE_|DBAPR04MB7320:EE_ X-MS-Office365-Filtering-Correlation-Id: 131622e4-8c9c-4743-d9c4-08dbb5c8a13e X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: z+Z2RnpV9bnboCaTSVmo8oGda8x6xxO0KUS7bRgj6zFizHFONjaPG0dpJUdwJcIOLka7IMUQtJ5V/BOg/HKZQFAxYbrq0ZCK5TrwcP/MaKwLL0i4KOcJLRfZoEJElIGdqwYSHSIeLrj/3wTiMyYqk79OhxDNuhZ9uc9OW1dBez5NvhxHjEPXN6gmEaiNneSbiFiUpe6OsT+2gF+7BNkhmKkBRjOSgI3tY9PPC5Def0QHEZd7XnvdijNq3V9tR5i9mwacWsHJCKOxhDOSe2aBdlzXYNQMGdviSJ67U7B4WlPineqpqEBMJDrXPBJXKo2KVxxuax1LT4QG7wnx/mBEei2RvKNnnS2BbRq/ndVoP4Bs6nNBJE+G/ffIGIdy93TnGTq2fPjwWEzBw4mXTOP6nFz8PKcREtp90/3m4XCBC1E+cF00CBwvIBu0hV0h1parObdcNkNQf9JIO6iMEnExPCnSnPmjiRYebWnCiN8HgGRnxdz9rf/f5RIHy2woX1BmNfcZmNgDZ5C6JniOLvkc4k0f9IKQlu+QQz9E3xLWeKyKLumg88LLsstuPcuuAtD5dko/wm57g7GvNAaElU5joPaRzhxtLqk7aNGOYmmXEJCRkHV8OPPjqG2gyBD+JzYiZ6tkcE3stz47V5jC9Jm7zQ== X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:DU2PR04MB8790.eurprd04.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230031)(136003)(366004)(39860400002)(346002)(376002)(396003)(451199024)(1800799009)(186009)(6506007)(6486002)(36756003)(31696002)(86362001)(26005)(66946007)(6512007)(2906002)(66556008)(38100700002)(478600001)(2616005)(31686004)(5660300002)(66476007)(4326008)(316002)(6916009)(41300700001)(8936002)(8676002)(45980500001)(43740500002); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?utf-8?q?huqbE/2NPRpYl4nz7H+KouPaeifn?= =?utf-8?q?CPK5FFG90og6MjhdPDd0+G0wouehjLL81HshTqC3nXWH4tgveoC//3HJ+Po23vixF?= 
X-OriginatorOrg: suse.com X-MS-Exchange-CrossTenant-Network-Message-Id: 131622e4-8c9c-4743-d9c4-08dbb5c8a13e X-MS-Exchange-CrossTenant-AuthSource:
DU2PR04MB8790.eurprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 15 Sep 2023 08:49:10.7391 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: f7a17af6-1c5c-4a36-aa8b-f5be247aa4ba X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: PVuCFpP2X0XfAGptVgEYylefSCHSd+o7aP/TB//OPuv9SwpSa2OX6RMbSNakdGszHF+JR49ajx6JlDyNsZ5KjQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: DBAPR04MB7320 X-Spam-Status: No, score=-3026.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: binutils@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Binutils mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Jan Beulich via Binutils From: Jan Beulich Reply-To: Jan Beulich Errors-To: binutils-bounces+patchwork=sourceware.org@sourceware.org Sender: "Binutils" Following the folding of some generic AVX/AVX2 templates with their AVX512F counterpart ones, do this for F16C ones as well, requiring one further adjustment to cpu_flags_match(). Note that this has a perhaps unexpected effect, resulting from F16C not being listed as a prereq of AVX512F: With just the latter enabled, VEX-encodings can now be emitted (but still not 128- or 256-bit EVEX-encodings, where AVX512VL of course continues to be required). --- RFC: Considering earlier discussion, the mentioned side effect likely means we don't really want this change. 
--- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -1949,6 +1949,8 @@ cpu_flags_match (const insn_template *t) : cpu.bitfield.cpuavx) && (!x.bitfield.cpufma || cpu.bitfield.cpufma || cpu_arch_flags.bitfield.cpuavx512f) + && (!x.bitfield.cpuf16c || cpu.bitfield.cpuf16c + || cpu_arch_flags.bitfield.cpuavx512f) && (!x.bitfield.cpugfni || cpu.bitfield.cpugfni) && (!x.bitfield.cpuvaes || cpu.bitfield.cpuvaes) && (!x.bitfield.cpuvpclmulqdq || cpu.bitfield.cpuvpclmulqdq)) --- a/opcodes/i386-opc.tbl +++ b/opcodes/i386-opc.tbl @@ -1793,10 +1793,10 @@ rdgsbase, 0xf30fae/1, FSGSBase, Modrm|Ig rdrand, 0xfc7/6, RdRnd, Modrm|NoSuf, { Reg16|Reg32|Reg64 } wrfsbase, 0xf30fae/2, FSGSBase, Modrm|IgnoreSize|NoSuf, { Reg32|Reg64 } wrgsbase, 0xf30fae/3, FSGSBase, Modrm|IgnoreSize|NoSuf, { Reg32|Reg64 } -vcvtph2ps, 0x6613, F16C, Modrm|Vex|Space0F38|VexW0|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM } -vcvtph2ps, 0x6613, F16C, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { Unspecified|BaseIndex|RegXMM, RegYMM } -vcvtps2ph, 0x661d, F16C, Modrm|Vex|Space0F3A|VexW0|NoSuf, { Imm8, RegXMM, Qword|Unspecified|BaseIndex|RegXMM } -vcvtps2ph, 0x661d, F16C, Modrm|Vex=2|Space0F3A|VexW=1|NoSuf, { Imm8, RegYMM, Unspecified|BaseIndex|RegXMM } +vcvtph2ps, 0x6613, F16C|AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexW0|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM } +vcvtph2ps, 0x6613, F16C|AVX|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexW0|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM } +vcvtps2ph, 0x661D, F16C|AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F3A|VexW0|Disp8MemShift=3|NoSuf, { Imm8, RegXMM, RegXMM|Qword|Unspecified|BaseIndex } +vcvtps2ph, 0x661D, F16C|AVX|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F3A|VexW0|Disp8MemShift=4|NoSuf, { Imm8, RegYMM, RegXMM|Unspecified|BaseIndex } // FMA instructions @@ -2525,15 +2525,9 @@ vcvtdq2pd, 0xF3E6, AVX512F|AVX512VL, Mod vcvtudq2pd, 0xF37A, AVX512F|AVX512VL, 
Modrm|EVex128|Masking|Space0F|VexW0|Broadcast|Disp8MemShift=3|NoSuf, { RegXMM|Dword|Qword|Unspecified|BaseIndex, RegXMM } vcvtudq2pd, 0xF37A, AVX512F|AVX512VL, Modrm|EVex256|Masking|Space0F|VexW0|Broadcast|Disp8MemShift=4|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM } -vcvtph2ps, 0x6613, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexW0|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM } -vcvtph2ps, 0x6613, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexW=1|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM } - vcvtps2pd, 0x5A, AVX512F|AVX512VL, Modrm|EVex128|Masking|Space0F|VexW0|Broadcast|Disp8MemShift=3|NoSuf, { RegXMM|Dword|Qword|Unspecified|BaseIndex, RegXMM } vcvtps2pd, 0x5A, AVX512F|AVX512VL, Modrm|EVex256|Masking|Space0F|VexW0|Broadcast|Disp8MemShift=4|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM } -vcvtps2ph, 0x661D, AVX512F|AVX512VL, Modrm|EVex128|Masking|Space0F3A|VexW0|Disp8MemShift=3|NoSuf, { Imm8, RegXMM, RegXMM|Qword|Unspecified|BaseIndex } -vcvtps2ph, 0x661D, AVX512F|AVX512VL, Modrm|EVex256|Masking|Space0F3A|VexW0|Disp8MemShift=4|NoSuf, { Imm8, RegYMM, RegXMM|Unspecified|BaseIndex } - vmovddup, 0xF212, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F|VexW1|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM } vpmovdb, 0xF331, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexW0|Disp8MemShift=2|NoSuf, { RegXMM, RegXMM|Dword|Unspecified|BaseIndex }