From patchwork Fri Jun 21 12:53:55 2024
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 92650
Date: Fri, 21 Jun 2024 14:53:55 +0200
Subject: [PATCH v2 8/8] x86/APX: apply NDD-to-legacy transformation to further CMOVcc forms
From: Jan Beulich
To: Binutils
Cc: "H.J. Lu", Lili Cui, "Jiang, Haochen"
References: <72726722-9d28-4d82-84ef-320e1786b0e4@suse.com>
In-Reply-To: <72726722-9d28-4d82-84ef-320e1786b0e4@suse.com>

With both sources being registers, these insns are almost commutative;
the only extra adjustment needed is inversion of the encoded condition.
---
Down the road the same will want doing for register-only 3-operand
CFCMOVcc, just that there it'll likely be less desirable to re-use the
NDD-to-legacy logic.
---
v2: New.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -456,6 +456,9 @@ struct _i386_insn
     /* Disable instruction size optimization.  */
     bool no_optimize;
 
+    /* Invert the condition encoded in a base opcode.  */
+    bool invert_cond;
+
     /* How to encode instructions.  */
     enum
       {
@@ -3918,6 +3921,11 @@ install_template (const insn_template *t
       i.tm.base_opcode >>= 8;
     }
 
+  /* For CMOVcc having undergone NDD-to-legacy optimization with its source
+     operands being swapped, we need to invert the encoded condition.  */
+  if (i.invert_cond)
+    i.tm.base_opcode ^= 1;
+
   /* Note that for pseudo prefixes this produces a length of 1. But for them
      the length isn't interesting at all.  */
   for (l = 1; l < 4; ++l)
@@ -9952,7 +9960,14 @@ match_template (char mnem_suffix)
                   && !i.op[i.operands - 1].regs->reg_type.bitfield.qword)))
             {
               if (i.operands > 2 && match_dest_op == i.operands - 3)
-                swap_2_operands (match_dest_op, i.operands - 2);
+                {
+                  swap_2_operands (match_dest_op, i.operands - 2);
+
+                  /* CMOVcc is marked commutative, but then also needs its
+                     encoded condition inverted.  */
+                  if ((t->base_opcode | 0xf) == 0x4f)
+                    i.invert_cond = true;
+                }
 
               --i.operands;
               --i.reg_operands;
--- a/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.d
+++ b/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.d
@@ -118,6 +118,22 @@ Disassembly of section .text:
 \s*[a-f0-9]+:\s*67 0f 4d 90 90 90 90 90 cmovge -0x6f6f6f70\(%eax\),%edx
 \s*[a-f0-9]+:\s*67 0f 4e 90 90 90 90 90 cmovle -0x6f6f6f70\(%eax\),%edx
 \s*[a-f0-9]+:\s*67 0f 4f 90 90 90 90 90 cmovg -0x6f6f6f70\(%eax\),%edx
+\s*[a-f0-9]+:\s*0f 41 d1 cmovno %ecx,%edx
+\s*[a-f0-9]+:\s*0f 40 d1 cmovo %ecx,%edx
+\s*[a-f0-9]+:\s*0f 43 d1 cmovae %ecx,%edx
+\s*[a-f0-9]+:\s*0f 42 d1 cmovb %ecx,%edx
+\s*[a-f0-9]+:\s*0f 45 d1 cmovne %ecx,%edx
+\s*[a-f0-9]+:\s*0f 44 d1 cmove %ecx,%edx
+\s*[a-f0-9]+:\s*0f 47 d1 cmova %ecx,%edx
+\s*[a-f0-9]+:\s*0f 46 d1 cmovbe %ecx,%edx
+\s*[a-f0-9]+:\s*0f 49 d1 cmovns %ecx,%edx
+\s*[a-f0-9]+:\s*0f 48 d1 cmovs %ecx,%edx
+\s*[a-f0-9]+:\s*0f 4b d1 cmovnp %ecx,%edx
+\s*[a-f0-9]+:\s*0f 4a d1 cmovp %ecx,%edx
+\s*[a-f0-9]+:\s*0f 4d d1 cmovge %ecx,%edx
+\s*[a-f0-9]+:\s*0f 4c d1 cmovl %ecx,%edx
+\s*[a-f0-9]+:\s*0f 4f d1 cmovg %ecx,%edx
+\s*[a-f0-9]+:\s*0f 4e d1 cmovle %ecx,%edx
 \s*[a-f0-9]+:\s*62 f4 7d 08 60 c0 movbe %ax,%ax
 \s*[a-f0-9]+:\s*49 0f c8 bswap %r8
 \s*[a-f0-9]+:\s*d5 98 c8 bswap %r16
--- a/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.s
+++ b/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.s
@@ -112,6 +112,22 @@ cmovl 0x90909090(%eax),%edx,%edx
 cmovge 0x90909090(%eax),%edx,%edx
 cmovle 0x90909090(%eax),%edx,%edx
 cmovg 0x90909090(%eax),%edx,%edx
+cmovo %edx,%ecx,%edx
+cmovno %edx,%ecx,%edx
+cmovc %edx,%ecx,%edx
+cmovnc %edx,%ecx,%edx
+cmovz %edx,%ecx,%edx
+cmovnz %edx,%ecx,%edx
+cmovna %edx,%ecx,%edx
+cmovnbe %edx,%ecx,%edx
+cmovs %edx,%ecx,%edx
+cmovns %edx,%ecx,%edx
+cmovpe %edx,%ecx,%edx
+cmovpo %edx,%ecx,%edx
+cmovnge %edx,%ecx,%edx
+cmovnl %edx,%ecx,%edx
+cmovng %edx,%ecx,%edx
+cmovnle %edx,%ecx,%edx
 movbe %ax,%ax
 movbe %r8,%r8
 movbe %r16,%r16
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -985,7 +985,10 @@ ud2b, 0xfb9, i186, Modrm|CheckOperandSiz
 // 3rd official undefined instr (older CPUs don't take a ModR/M byte)
 ud0, 0xfff, i186, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
-cmov, 0x4, CMOV&APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64 }
+// C (commutative) isn't quite correct here on its own; the condition also
+// needs inverting when source operands are swapped in order to convert to
+// legacy encoding. The assembler will take care of that.
+cmov, 0x4, CMOV&APX_F, C|Modrm|CheckOperandSize|No_bSuf|No_sSuf|DstVVVV|EVexMap4|Optimize, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64 }
 cmov, 0xf4, CMOV, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
 fcmovb, 0xda/0, i687, Modrm|NoSuf, { FloatReg, FloatAcc }