From patchwork Mon Feb 3 11:41:03 2025
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 105903
Message-ID: <441a0de1-bbac-4235-a58e-f0393b218f1a@suse.com>
Date: Mon, 3 Feb 2025 12:41:03 +0100
Subject: [PATCH 3/5] x86: widen @got{,pcrel} support to PUSH and APX IMUL
From: Jan Beulich
To: Binutils
Cc: "H.J. Lu"
In-Reply-To: <5bc20aeb-77a8-49c5-9aa1-3548ae39d462@suse.com>

With us doing the transformation to an immediate operand for MOV and
various ALU insns, there's little reason to then not support the same
conversion for the other two insns which have respective immediate
operand forms. Unfortunately for IMUL (due to the 0F opcode prefix)
there's no suitable relocation, so the pre-APX forms cannot be marked
for relaxation in the assembler.
---
TBD: Is REX2 really permitted with PUSH ?
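For illustration only (not part of the patch): the PUSH conversion described above amounts to a two-byte rewrite directly ahead of the 4-byte displacement field. The sketch below is standalone C, not BFD code. `classify_ff` and `relax_push` are hypothetical names, `roff` is assumed to index the displacement field the way it does in the linker, and the immediate is stored directly, whereas the linker would instead retarget the relocation (R_386_32 or R_X86_64_32S) and let normal relocation processing fill it in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: what an opcode-0xff insn is, judged by ModRM.reg.  */
enum ff_kind { FF_OTHER, FF_BRANCH, FF_PUSH };

static enum ff_kind
classify_ff (uint8_t modrm)
{
  switch (modrm & 0x38)
    {
    case 0x10: /* CALL (/2) */
    case 0x20: /* JMP (/4) */
      return FF_BRANCH;
    case 0x30: /* PUSH (/6) */
      return FF_PUSH;
    default:
      return FF_OTHER;
    }
}

/* Rewrite "push disp32(...)" (ff /6 disp32) into "es push $imm32"
   (26 68 imm32).  The instruction length stays the same: the old opcode
   byte becomes a meaningless %es: prefix and the old ModRM byte becomes
   the new opcode.  ROFF is the offset of the displacement field in BUF.
   Returns false, leaving BUF untouched, if this isn't such a PUSH.  */
static bool
relax_push (uint8_t *buf, size_t roff, uint32_t addr)
{
  if (roff < 2
      || buf[roff - 2] != 0xff
      || classify_ff (buf[roff - 1]) != FF_PUSH)
    return false;

  buf[roff - 2] = 0x26; /* %es: segment prefix, used as padding.  */
  buf[roff - 1] = 0x68; /* PUSH imm32.  */

  /* Stand-in for relocation processing: store ADDR little-endian.  */
  for (int i = 0; i < 4; i++)
    buf[roff + i] = (addr >> (8 * i)) & 0xff;

  return true;
}
```

Fed the bytes `ff b1 f8 ff ff ff` with `roff` 2 and address 0x8049086, this yields `26 68 86 90 04 08`, which is the first line load8a.d expects. A `call`/`jmp` through the GOT (reg field 2 or 4) is rejected here, as those go down the separate branch path.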
--- a/bfd/elf32-i386.c +++ b/bfd/elf32-i386.c @@ -1209,6 +1209,10 @@ elf_i386_tls_transition (struct bfd_link to test $foo, %reg1 and convert + push foo@GOT[(%reg)] + to + push $foo + and convert binop foo@GOT[(%reg1)], %reg2 to binop $foo, %reg2 @@ -1233,7 +1237,7 @@ elf_i386_convert_load_reloc (bfd *abfd, unsigned int addend; unsigned int nop; bfd_vma nop_offset; - bool is_pic; + bool is_pic, is_branch = false; bool to_reloc_32; bool abs_symbol; unsigned int r_type; @@ -1301,6 +1305,23 @@ elf_i386_convert_load_reloc (bfd *abfd, opcode = bfd_get_8 (abfd, contents + roff - 2); + if (opcode == 0xff) + { + switch (modrm & 0x38) + { + case 0x10: /* CALL */ + case 0x20: /* JMP */ + is_branch = true; + break; + + case 0x30: /* PUSH */ + break; + + default: + return true; + } + } + /* Convert to R_386_32 if PIC is false or there is no base register. */ to_reloc_32 = !is_pic || baseless; @@ -1311,7 +1332,7 @@ elf_i386_convert_load_reloc (bfd *abfd, reloc. */ if (h == NULL) { - if (opcode == 0x0ff) + if (is_branch) /* Convert "call/jmp *foo@GOT[(%reg)]". */ goto convert_branch; else @@ -1327,7 +1348,7 @@ elf_i386_convert_load_reloc (bfd *abfd, && !eh->linker_def && local_ref) { - if (opcode == 0xff) + if (is_branch) { /* No direct branch to 0 for PIC. */ if (is_pic) @@ -1343,7 +1364,7 @@ elf_i386_convert_load_reloc (bfd *abfd, } } - if (opcode == 0xff) + if (is_branch) { /* We have "call/jmp *foo@GOT[(%reg)]". */ if ((h->root.type == bfd_link_hash_defined @@ -1399,7 +1420,8 @@ elf_i386_convert_load_reloc (bfd *abfd, else { /* We have "mov foo@GOT[(%re1g)], %reg2", - "test %reg1, foo@GOT(%reg2)" and + "test %reg1, foo@GOT(%reg2)", + "push foo@GOT[(%reg)]", or "binop foo@GOT[(%reg1)], %reg2". Avoid optimizing _DYNAMIC since ld.so may use its @@ -1460,6 +1482,13 @@ elf_i386_convert_load_reloc (bfd *abfd, modrm = 0xc0 | ((modrm & 0x38) >> 3) | (opcode & 0x38); opcode = 0x81; } + else if (opcode == 0xff) + { + /* Convert "push foo@GOT(%reg)" to + "push $foo". 
*/ + modrm = 0x68; /* Really the opcode. */ + opcode = 0x26; /* Really a meaningless %es: prefix. */ + } else return true; --- a/bfd/elf64-x86-64.c +++ b/bfd/elf64-x86-64.c @@ -1739,13 +1739,16 @@ elf_x86_64_need_pic (struct bfd_link_inf } /* Move the R bits to the B bits in EVEX payload byte 1. */ -static unsigned int evex_move_r_to_b (unsigned int byte1) +static unsigned int evex_move_r_to_b (unsigned int byte1, bool copy) { byte1 = (byte1 & ~(1 << 5)) | ((byte1 & (1 << 7)) >> 2); /* R3 -> B3 */ byte1 = (byte1 & ~(1 << 3)) | ((~byte1 & (1 << 4)) >> 1); /* R4 -> B4 */ /* Set both R bits, as they're inverted. */ - return byte1 | (1 << 4) | (1 << 7); + if (!copy) + byte1 |= (1 << 4) | (1 << 7); + + return byte1; } /* With the local symbol, foo, we convert @@ -1762,10 +1765,14 @@ static unsigned int evex_move_r_to_b (un to test $foo, %reg and convert + push foo@GOTPCREL(%rip) + to + push $foo + and convert binop foo@GOTPCREL(%rip), %reg to binop $foo, %reg - where binop is one of adc, add, and, cmp, or, sbb, sub, xor + where binop is one of adc, add, and, cmp, imul, or, sbb, sub, xor instructions. */ static bool @@ -1781,6 +1788,7 @@ elf_x86_64_convert_load_reloc (bfd *abfd bool is_pic; bool no_overflow; bool relocx; + bool is_branch = false; bool to_reloc_pc32; bool abs_symbol; bool local_ref; @@ -1873,6 +1881,23 @@ elf_x86_64_convert_load_reloc (bfd *abfd r_symndx = htab->r_sym (irel->r_info); opcode = bfd_get_8 (abfd, contents + roff - 2); + modrm = bfd_get_8 (abfd, contents + roff - 1); + if (opcode == 0xff) + { + switch (modrm & 0x38) + { + case 0x10: /* CALL */ + case 0x20: /* JMP */ + is_branch = true; + break; + + case 0x30: /* PUSH */ + break; + + default: + return true; + } + } /* Convert mov to lea since it has been done for a while. */ if (opcode != 0x8b) @@ -1890,7 +1915,7 @@ elf_x86_64_convert_load_reloc (bfd *abfd 3. no_overflow is true. 4. PIC. 
*/ - to_reloc_pc32 = (opcode == 0xff + to_reloc_pc32 = (is_branch || !relocx || no_overflow || is_pic); @@ -1942,7 +1967,7 @@ elf_x86_64_convert_load_reloc (bfd *abfd && !eh->linker_def && local_ref)) { - if (opcode == 0xff) + if (is_branch) { /* Skip for branch instructions since R_X86_64_PC32 may overflow. */ @@ -2009,7 +2034,7 @@ elf_x86_64_convert_load_reloc (bfd *abfd return true; convert: - if (opcode == 0xff) + if (is_branch) { /* We have "call/jmp *foo@GOTPCREL(%rip)". */ unsigned int nop; @@ -2018,7 +2043,6 @@ elf_x86_64_convert_load_reloc (bfd *abfd /* Convert R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX to R_X86_64_PC32. */ - modrm = bfd_get_8 (abfd, contents + roff - 1); if (modrm == 0x25) { /* Convert to "jmp foo nop". */ @@ -2064,11 +2088,12 @@ elf_x86_64_convert_load_reloc (bfd *abfd } else if (r_type == R_X86_64_CODE_6_GOTPCRELX && opcode != 0x8b) { + bool move_v_r = false; + /* R_X86_64_PC32 isn't supported. */ if (to_reloc_pc32) return true; - modrm = bfd_get_8 (abfd, contents + roff - 1); if (opcode == 0x85) { /* Convert "ctest %reg, foo@GOTPCREL(%rip)" to @@ -2094,6 +2119,23 @@ elf_x86_64_convert_load_reloc (bfd *abfd modrm = 0xc0 | ((modrm & 0x38) >> 3) | (opcode & 0x38); opcode = 0x81; } + else if (opcode == 0xaf) + { + if (!(evex[2] & 0x10)) + { + /* Convert "imul foo@GOTPCREL(%rip), %reg" to + "imul $foo, %reg, %reg". */ + modrm = 0xc0 | ((modrm & 0x38) >> 3) | (modrm & 0x38); + } + else + { + /* Convert "imul foo@GOTPCREL(%rip), %reg1, %reg2" to + "imul $foo, %reg1, %reg2". 
*/ + modrm = 0xc0 | ((modrm & 0x38) >> 3) | (~evex[1] & 0x38); + move_v_r = true; + } + opcode = 0x69; + } else return true; @@ -2119,7 +2161,23 @@ elf_x86_64_convert_load_reloc (bfd *abfd bfd_put_8 (abfd, opcode, contents + roff - 2); bfd_put_8 (abfd, modrm, contents + roff - 1); - evex[0] = evex_move_r_to_b (evex[0]); + evex[0] = evex_move_r_to_b (evex[0], opcode == 0x69 && !move_v_r); + if (move_v_r) + { + /* Move the top two V bits to the R bits in EVEX payload byte 1. + Note that evex_move_r_to_b() set both R bits. */ + if (!(evex[1] & (1 << 6))) + evex[0] &= ~(1 << 7); /* V3 -> R3 */ + if (!(evex[2] & (1 << 3))) + evex[0] &= ~(1 << 4); /* V4 -> R4 */ + /* Set all V bits, as they're inverted. */ + evex[1] |= 0xf << 3; + evex[2] |= 1 << 3; + /* Clear the ND (ZU) bit (it ought to be ignored anyway). */ + evex[2] &= ~(1 << 4); + bfd_put_8 (abfd, evex[2], contents + roff - 3); + bfd_put_8 (abfd, evex[1], contents + roff - 4); + } bfd_put_8 (abfd, evex[0], contents + roff - 5); /* No addend for R_X86_64_32/R_X86_64_32S relocations. */ @@ -2162,7 +2220,10 @@ elf_x86_64_convert_load_reloc (bfd *abfd { if (bfd_get_8 (abfd, contents + roff - 4) == 0xd5) { - rex2 = bfd_get_8 (abfd, contents + roff - 3); + /* Make sure even an all-zero payload leaves a non-zero value + in the variable. */ + rex2 = bfd_get_8 (abfd, contents + roff - 3) | 0x100; + rex2_mask |= 0x100; rex_w = (rex2 & REX_W) != 0; } else if (bfd_get_8 (abfd, contents + roff - 4) == 0x0f) @@ -2195,7 +2256,6 @@ elf_x86_64_convert_load_reloc (bfd *abfd /* Convert "mov foo@GOTPCREL(%rip), %reg" to "mov $foo, %reg". 
*/ opcode = 0xc7; - modrm = bfd_get_8 (abfd, contents + roff - 1); modrm = 0xc0 | (modrm & 0x38) >> 3; if (rex_w && ABI_64_P (link_info->output_bfd)) { @@ -2222,7 +2282,6 @@ elf_x86_64_convert_load_reloc (bfd *abfd if (to_reloc_pc32) return true; - modrm = bfd_get_8 (abfd, contents + roff - 1); if (opcode == 0x85) { /* Convert "test %reg, foo@GOTPCREL(%rip)" to @@ -2237,6 +2296,39 @@ elf_x86_64_convert_load_reloc (bfd *abfd modrm = 0xc0 | ((modrm & 0x38) >> 3) | (opcode & 0x38); opcode = 0x81; } + else if (opcode == 0xaf && (rex2 & (REX2_M << 4))) + { + /* Convert "imul foo@GOTPCREL(%rip), %reg" to + "imul $foo, %reg, %reg". */ + modrm = 0xc0 | ((modrm & 0x38) >> 3) | (modrm & 0x38); + rex_mask = 0; + rex2_mask = REX2_M << 4; + opcode = 0x69; + } + else if (opcode == 0xff && !(rex2 & (REX2_M << 4))) + { + /* Convert "push foo@GOTPCREL(%rip)" to + "push $foo". */ + bfd_put_8 (abfd, 0x68, contents + roff - 1); + if (rex) + { + bfd_put_8 (abfd, 0x26, contents + roff - 3); + bfd_put_8 (abfd, rex, contents + roff - 2); + } + else if (rex2) + { + bfd_put_8 (abfd, 0x26, contents + roff - 4); + bfd_put_8 (abfd, 0xd5, contents + roff - 3); + bfd_put_8 (abfd, rex2, contents + roff - 2); + } + else + bfd_put_8 (abfd, 0x26, contents + roff - 2); + + r_type = R_X86_64_32S; + /* No addend for R_X86_64_32S relocations. 
*/ + irel->r_addend = 0; + goto finish; + } else return true; @@ -2297,6 +2389,7 @@ elf_x86_64_convert_load_reloc (bfd *abfd } } + finish: *r_type_p = r_type; irel->r_info = htab->r_info (r_symndx, r_type | R_X86_64_converted_reloc_bit); @@ -4378,7 +4471,7 @@ elf_x86_64_relocate_section (bfd *output continue; } - byte1 = evex_move_r_to_b (byte1); + byte1 = evex_move_r_to_b (byte1, false); bfd_put_8 (output_bfd, byte1, contents + roff - 5); bfd_put_8 (output_bfd, 0x81, contents + roff - 2); bfd_put_8 (output_bfd, 0xc0 | reg, contents + roff - 1); --- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -12928,9 +12928,9 @@ output_disp (fragS *insn_start_frag, off else if (object_64bit) continue; - /* Check for "call/jmp *mem", "mov mem, %reg", "movrs mem, %reg", - "test %reg, mem" and "binop mem, %reg" where binop - is one of adc, add, and, cmp, or, sbb, sub, xor + /* Check for "call/jmp *mem", "push mem", "mov mem, %reg", + "movrs mem, %reg", "test %reg, mem" and "binop mem, %reg" where + binop is one of adc, add, and, cmp, or, sbb, sub, xor, or imul instructions without data prefix. Always generate R_386_GOT32X for "sym*GOT" operand in 32-bit mode. */ unsigned int space = dot_insn () ? i.insn_opcode_space @@ -12940,7 +12940,7 @@ output_disp (fragS *insn_start_frag, off || (i.rm.mode == 0 && i.rm.regmem == 5)) && ((space == SPACE_BASE && i.tm.base_opcode == 0xff - && (i.rm.reg == 2 || i.rm.reg == 4)) + && (i.rm.reg == 2 || i.rm.reg == 4 || i.rm.reg == 6)) || ((space == SPACE_BASE || space == SPACE_0F38 || space == SPACE_MAP4) @@ -12949,7 +12949,13 @@ output_disp (fragS *insn_start_frag, off || space == SPACE_MAP4) && (i.tm.base_opcode == 0x85 || (i.tm.base_opcode - | (i.operands > 2 ? 0x3a : 0x38)) == 0x3b)))) + | (i.operands > 2 ? 0x3a : 0x38)) == 0x3b)) + || (((space == SPACE_0F + /* Because of the 0F prefix, no suitable relocation + exists for this unless it's REX2-encoded. 
*/ + && is_apx_rex2_encoding ()) + || space == SPACE_MAP4) + && i.tm.base_opcode == 0xaf))) { if (object_64bit) { --- a/ld/testsuite/ld-i386/i386.exp +++ b/ld/testsuite/ld-i386/i386.exp @@ -373,6 +373,8 @@ run_dump_test "load5a" run_dump_test "load5b" run_dump_test "load6" run_dump_test "load7" +run_dump_test "load8a" +run_dump_test "load8b" run_dump_test "pr19175" run_dump_test "pr19615" run_dump_test "pr19636-1a" --- /dev/null +++ b/ld/testsuite/ld-i386/load8.s @@ -0,0 +1,20 @@ + .data + .type bar, @object +bar: + .byte 1 + .size bar, .-bar + .globl foo + .type foo, @object +foo: + .byte 1 + .size foo, .-foo + .text + .globl _start + .type _start, @function +_start: + push bar@GOT(%ecx) + push foo@GOT(%edx) + .ifndef PIC + push foo@GOT + .endif + .size _start, .-_start --- /dev/null +++ b/ld/testsuite/ld-i386/load8a.d @@ -0,0 +1,14 @@ +#source: load8.s +#as: --32 -mrelax-relocations=yes +#ld: -melf_i386 -z noseparate-code +#objdump: -dw + +.*: +file format .* + +Disassembly of section .text: + +0+8048074 <_start>: +[ ]*[a-f0-9]+: 26 68 86 90 04 08 es push \$0x8049086 +[ ]*[a-f0-9]+: 26 68 87 90 04 08 es push \$0x8049087 +[ ]*[a-f0-9]+: 26 68 87 90 04 08 es push \$0x8049087 +#pass --- /dev/null +++ b/ld/testsuite/ld-i386/load8b.d @@ -0,0 +1,13 @@ +#source: load8.s +#as: --32 -mshared -mrelax-relocations=yes --defsym PIC=1 +#ld: -melf_i386 -shared -z noseparate-code +#objdump: -dw + +.*: +file format .* + +Disassembly of section .text: + +0+[0-9a-f]+ <_start>: +[ ]*[0-9a-f]+: ff b1 f8 ff ff ff push -0x8\(%ecx\) +[ ]*[0-9a-f]+: ff b2 fc ff ff ff push -0x4\(%edx\) +#pass --- a/ld/testsuite/ld-x86-64/apx-load1.s +++ b/ld/testsuite/ld-x86-64/apx-load1.s @@ -118,5 +118,11 @@ _start: sub %rbp, bar@GOTPCREL(%rip), %r21 xor %rsi, bar@GOTPCREL(%rip), %r22 + imul bar@GOTPCREL(%rip), %r17 + {nf} imul bar@GOTPCREL(%rip), %r17 + imul bar@GOTPCREL(%rip), %r17, %rdx + imul bar@GOTPCREL(%rip), %rcx, %r18 + {rex2} pushq bar@GOTPCREL(%rip) + .size _start, .-_start .p2align 12, 0x90 
--- a/ld/testsuite/ld-x86-64/apx-load1a.d +++ b/ld/testsuite/ld-x86-64/apx-load1a.d @@ -115,4 +115,9 @@ Disassembly of section .text: +[a-f0-9]+: 62 f4 dc 10 19 25 74 0c 20 00 sbb %rsp,0x200c74\(%rip\),%r20 # 602000 <.*> +[a-f0-9]+: 62 f4 d4 10 29 2d 6a 0c 20 00 sub %rbp,0x200c6a\(%rip\),%r21 # 602000 <.*> +[a-f0-9]+: 62 f4 cc 10 81 f6 20 20 60 00 xor \$0x602020,%rsi,%r22 + +[a-f0-9]+: d5 58 69 c9 20 20 60 00 imul \$0x602020,%r17,%r17 + +[a-f0-9]+: 62 ec fc 0c 69 c9 20 20 60 00 \{nf\} imul \$0x602020,%r17,%r17 + +[a-f0-9]+: 62 fc fc 08 69 d1 20 20 60 00 imul \$0x602020,%r17,%rdx + +[a-f0-9]+: 62 e4 fc 08 69 d1 20 20 60 00 imul \$0x602020,%rcx,%r18 + +[a-f0-9]+: 26 d5 00 68 20 20 60 00 es \{rex2 0x0\} push \$0x602020 #pass --- a/ld/testsuite/ld-x86-64/apx-load1c.d +++ b/ld/testsuite/ld-x86-64/apx-load1c.d @@ -108,4 +108,9 @@ Disassembly of section .text: +[a-f0-9]+: 62 f4 dc 10 19 25 54 0d 20 00 sbb %rsp,0x200d54\(%rip\),%r20 # 2020e0 <.*> +[a-f0-9]+: 62 f4 d4 10 29 2d 4a 0d 20 00 sub %rbp,0x200d4a\(%rip\),%r21 # 2020e0 <.*> +[a-f0-9]+: 62 f4 cc 10 31 35 40 0d 20 00 xor %rsi,0x200d40\(%rip\),%r22 # 2020e0 <.*> + +[a-f0-9]+: d5 c8 af 0d 38 0d 20 00 imul 0x200d38\(%rip\),%r17 # 2020e0 <.*> + +[a-f0-9]+: 62 e4 fc 0c af 0d 2e 0d 20 00 \{nf\} imul 0x200d2e\(%rip\),%r17 # 2020e0 <.*> + +[a-f0-9]+: 62 e4 ec 18 af 0d 24 0d 20 00 imul 0x200d24\(%rip\),%r17,%rdx # 2020e0 <.*> + +[a-f0-9]+: 62 f4 ec 10 af 0d 1a 0d 20 00 imul 0x200d1a\(%rip\),%rcx,%r18 # 2020e0 <.*> + +[a-f0-9]+: d5 00 ff 35 12 0d 20 00 \{rex2 0x0\} push 0x200d12\(%rip\) # 2020e0 <.*> #pass --- a/ld/testsuite/ld-x86-64/apx-load1d.d +++ b/ld/testsuite/ld-x86-64/apx-load1d.d @@ -108,4 +108,9 @@ Disassembly of section .text: +[a-f0-9]+: 62 f4 dc 10 19 25 e4 0c 20 00 sbb %rsp,0x200ce4\(%rip\),%r20 # 202070 <.*> +[a-f0-9]+: 62 f4 d4 10 29 2d da 0c 20 00 sub %rbp,0x200cda\(%rip\),%r21 # 202070 <.*> +[a-f0-9]+: 62 f4 cc 10 31 35 d0 0c 20 00 xor %rsi,0x200cd0\(%rip\),%r22 # 202070 <.*> + +[a-f0-9]+: d5 c8 af 0d c8 0c 
20 00 imul 0x200cc8\(%rip\),%r17 # 202070 <.*> + +[a-f0-9]+: 62 e4 fc 0c af 0d be 0c 20 00 \{nf\} imul 0x200cbe\(%rip\),%r17 # 202070 <.*> + +[a-f0-9]+: 62 e4 ec 18 af 0d b4 0c 20 00 imul 0x200cb4\(%rip\),%r17,%rdx # 202070 <.*> + +[a-f0-9]+: 62 f4 ec 10 af 0d aa 0c 20 00 imul 0x200caa\(%rip\),%rcx,%r18 # 202070 <.*> + +[a-f0-9]+: d5 00 ff 35 a2 0c 20 00 \{rex2 0x0\} push 0x200ca2\(%rip\) # 202070 <.*> #pass --- /dev/null +++ b/ld/testsuite/ld-x86-64/load5.s @@ -0,0 +1,17 @@ + .data + .type bar, @object +bar: + .byte 1 + .size bar, .-bar + .globl foo + .type foo, @object +foo: + .byte 1 + .size foo, .-foo + .text + .globl _start + .type _start, @function +_start: + pushq bar@GOTPCREL(%rip) + {rex} pushq foo@GOTPCREL(%rip) + .size _start, .-_start --- /dev/null +++ b/ld/testsuite/ld-x86-64/load5a.d @@ -0,0 +1,15 @@ +#source: load5.s +#as: --64 -mrelax-relocations=yes +#ld: -melf_x86_64 +#objdump: -dw + +.*: +file format .* + + +Disassembly of section .text: + +#... +[a-f0-9]+ <_start>: +[ ]*[a-f0-9]+: 26 68 ([0-9a-f]{2} ){4} * es push \$0x[a-f0-9]+ +[ ]*[a-f0-9]+: 26 40 68 ([0-9a-f]{2} ){4} * es rex push \$0x[a-f0-9]+ +#pass --- /dev/null +++ b/ld/testsuite/ld-x86-64/load5b.d @@ -0,0 +1,15 @@ +#source: load5.s +#as: --64 -mrelax-relocations=yes +#ld: -shared -melf_x86_64 +#objdump: -dw + +.*: +file format .* + + +Disassembly of section .text: + +#... +[a-f0-9]+ <_start>: +[ ]*[a-f0-9]+: ff 35 ([0-9a-f]{2} ){4} * push +0x[a-f0-9]+\(%rip\) # [a-f0-9]+ <.*> +[ ]*[a-f0-9]+: 40 ff 35 ([0-9a-f]{2} ){4} * rex push 0x[a-f0-9]+\(%rip\) # [a-f0-9]+ <.*> +#pass --- a/ld/testsuite/ld-x86-64/x86-64.exp +++ b/ld/testsuite/ld-x86-64/x86-64.exp @@ -654,6 +654,8 @@ run_dump_test "load2" run_dump_test "load3a" run_dump_test "load3b" run_dump_test "load4" +run_dump_test "load5a" +run_dump_test "load5b" run_dump_test "call1a" run_dump_test "call1b" run_dump_test "call1c"
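For illustration only (not part of the patch): the REX2-encoded two-operand IMUL case can be checked by hand against apx-load1a.d. The standalone sketch below uses hypothetical helper names and assumes the REX2 payload bit layout M0 R4 X4 B4 W R3 X3 B3 (bit 7 down to bit 0). The payload adjustment is inferred from the expected test output, not copied from the linker's mask handling, which lies outside the hunks shown here.

```c
#include <assert.h>
#include <stdint.h>

/* "imul mem, %reg" (0f af /r) becomes "imul $imm32, %reg, %reg" (69 /r):
   switch to register-direct addressing (mod = 3) and copy the reg field
   into the r/m field as well.  */
static uint8_t
imul_modrm_2op (uint8_t modrm)
{
  uint8_t reg = modrm & 0x38;

  return 0xc0 | (reg >> 3) | reg;
}

/* Assumed REX2 payload handling: clear M0 (opcode 0x69 lives in map 0,
   not in the 0F map) and mirror the R bits into the B bits, since the
   same register is now named by both ModRM.reg and ModRM.rm.  The X bits
   are left alone; they are unused with a %rip-relative operand.  */
static uint8_t
imul_rex2_payload (uint8_t payload)
{
  payload &= ~(0x80 | 0x10 | 0x01);  /* Clear M0, B4, B3.  */
  payload |= (payload & 0x40) >> 2;  /* R4 -> B4 */
  payload |= (payload & 0x04) >> 2;  /* R3 -> B3 */
  return payload;
}
```

For `imul bar@GOTPCREL(%rip), %r17`, i.e. `d5 c8 af 0d ...`, this gives ModRM 0xc9 and payload 0x58, matching the expected `d5 58 69 c9 20 20 60 00`.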