[1/2] x86/APX: squash REX prefix when REX2 is being emitted
Checks
Context |
Check |
Description |
linaro-tcwg-bot/tcwg_binutils_build--master-arm |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_binutils_build--master-aarch64 |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_binutils_check--master-aarch64 |
success
|
Test passed
|
linaro-tcwg-bot/tcwg_binutils_check--master-arm |
success
|
Test passed
|
Commit Message
We should not (silently) emit a REX prefix ahead of a REX2-encoded insn;
such encodings are illegal. Best we can do is fold the REX bits into the
REX2 prefix, and then zap the REX one from i.prefix[].
Comments
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Tuesday, October 8, 2024 6:26 PM
> To: Binutils <binutils@sourceware.org>
> Cc: Cui, Lili <lili.cui@intel.com>; H.J. Lu <hjl.tools@gmail.com>
> Subject: [PATCH 1/2] x86/APX: squash REX prefix when REX2 is being emitted
>
> We should not (silently) emit a REX prefix ahead of a REX2-encoded insn;
> such encodings are illegal. Best we can do is fold the REX bits into the
> REX2 prefix, and then zap the REX one from i.prefix[].
>
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -4526,7 +4526,8 @@ build_rex2_prefix (void)
> i.vex.bytes[0] = 0xd5;
> /* For the W R X B bits, the variables of rex prefix will be reused. */
> i.vex.bytes[1] = ((i.tm.opcode_space << 7)
> - | (i.rex2 << 4) | i.rex);
> + | (i.rex2 << 4)
> + | ((i.rex | i.prefix[REX_PREFIX]) & 0xf));
> }
>
> /* Build the EVEX prefix (4-byte) for evex insn @@ -4675,6 +4676,7 @@
> static void establish_rex (void)
> build_rex2_prefix ();
> /* The individual REX.RXBW bits got consumed. */
> i.rex &= REX_OPCODE;
> + i.prefix[REX_PREFIX] = 0;
> }
> else if (i.rex != 0)
> add_prefix (REX_OPCODE | i.rex);
> --- a/gas/testsuite/gas/i386/x86-64-apx-rex2.d
> +++ b/gas/testsuite/gas/i386/x86-64-apx-rex2.d
> @@ -80,4 +80,8 @@ Disassembly of section .text:
> [ ]*[a-f0-9]+:[ ]*d5 76 8d 7c 20 01 lea
> 0x1\(%r16,%r28,1\),%r31d
> [ ]*[a-f0-9]+:[ ]*d5 12 8d 84 04 81 00 00 00 lea
> 0x81\(%r20,%r8,1\),%eax
> [ ]*[a-f0-9]+:[ ]*d5 57 8d bc 04 81 00 00 00 lea
> 0x81\(%r28,%r8,1\),%r31d
> +[ ]*[a-f0-9]+:[ ]*d5 14 f7 14 24 \{rex2 0x14\} notl \(%r20\)
> +[ ]*[a-f0-9]+:[ ]*d5 12 f7 14 24 notl \(%r20,%r12,1\)
> +[ ]*[a-f0-9]+:[ ]*d5 11 f7 14 24 notl \(%r28\)
> +[ ]*[a-f0-9]+:[ ]*d5 18 f7 14 24 notq \(%r20\)
> #pass
> --- a/gas/testsuite/gas/i386/x86-64-apx-rex2.s
> +++ b/gas/testsuite/gas/i386/x86-64-apx-rex2.s
> @@ -83,3 +83,9 @@ _start:
> leal 1(%r16, %r28), %r31d
> leal 129(%r20, %r8), %eax
> leal 129(%r28, %r8), %r31d
> +
> +## explicit REX prefix
> + rex.r notl (%r20)
> + rex.x notl (%r20)
> + rex.b notl (%r20)
First of all, the compiler will not generate such instructions. For hand-written assembly, this usage is very dangerous, as Egpr appears later but the old prefix is still used (Of course, there could be various reasons.), but if this is a bug, or a problem that occurred after introducing Egpr and the assembler changes the register to 28, It is difficult to locate the error in the program. I think it would be better to report an error and ask users to use a sensible rex2 prefix.
Lili.
> + rex.w not (%r20)
On 09.10.2024 10:24, Cui, Lili wrote:
>
>
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: Tuesday, October 8, 2024 6:26 PM
>> To: Binutils <binutils@sourceware.org>
>> Cc: Cui, Lili <lili.cui@intel.com>; H.J. Lu <hjl.tools@gmail.com>
>> Subject: [PATCH 1/2] x86/APX: squash REX prefix when REX2 is being emitted
>>
>> We should not (silently) emit a REX prefix ahead of a REX2-encoded insn;
>> such encodings are illegal. Best we can do is fold the REX bits into the
>> REX2 prefix, and then zap the REX one from i.prefix[].
>>
>> --- a/gas/config/tc-i386.c
>> +++ b/gas/config/tc-i386.c
>> @@ -4526,7 +4526,8 @@ build_rex2_prefix (void)
>> i.vex.bytes[0] = 0xd5;
>> /* For the W R X B bits, the variables of rex prefix will be reused. */
>> i.vex.bytes[1] = ((i.tm.opcode_space << 7)
>> - | (i.rex2 << 4) | i.rex);
>> + | (i.rex2 << 4)
>> + | ((i.rex | i.prefix[REX_PREFIX]) & 0xf));
>> }
>>
>> /* Build the EVEX prefix (4-byte) for evex insn @@ -4675,6 +4676,7 @@
>> static void establish_rex (void)
>> build_rex2_prefix ();
>> /* The individual REX.RXBW bits got consumed. */
>> i.rex &= REX_OPCODE;
>> + i.prefix[REX_PREFIX] = 0;
>> }
>> else if (i.rex != 0)
>> add_prefix (REX_OPCODE | i.rex);
>> --- a/gas/testsuite/gas/i386/x86-64-apx-rex2.d
>> +++ b/gas/testsuite/gas/i386/x86-64-apx-rex2.d
>> @@ -80,4 +80,8 @@ Disassembly of section .text:
>> [ ]*[a-f0-9]+:[ ]*d5 76 8d 7c 20 01 lea
>> 0x1\(%r16,%r28,1\),%r31d
>> [ ]*[a-f0-9]+:[ ]*d5 12 8d 84 04 81 00 00 00 lea
>> 0x81\(%r20,%r8,1\),%eax
>> [ ]*[a-f0-9]+:[ ]*d5 57 8d bc 04 81 00 00 00 lea
>> 0x81\(%r28,%r8,1\),%r31d
>> +[ ]*[a-f0-9]+:[ ]*d5 14 f7 14 24 \{rex2 0x14\} notl \(%r20\)
>> +[ ]*[a-f0-9]+:[ ]*d5 12 f7 14 24 notl \(%r20,%r12,1\)
>> +[ ]*[a-f0-9]+:[ ]*d5 11 f7 14 24 notl \(%r28\)
>> +[ ]*[a-f0-9]+:[ ]*d5 18 f7 14 24 notq \(%r20\)
>> #pass
>> --- a/gas/testsuite/gas/i386/x86-64-apx-rex2.s
>> +++ b/gas/testsuite/gas/i386/x86-64-apx-rex2.s
>> @@ -83,3 +83,9 @@ _start:
>> leal 1(%r16, %r28), %r31d
>> leal 129(%r20, %r8), %eax
>> leal 129(%r28, %r8), %r31d
>> +
>> +## explicit REX prefix
>> + rex.r notl (%r20)
>> + rex.x notl (%r20)
>> + rex.b notl (%r20)
>
> First of all, the compiler will not generate such instructions. For hand-written assembly, this usage is very dangerous, as Egpr appears later but the old prefix is still used (Of course, there could be various reasons.), but if this is a bug, or a problem that occurred after introducing Egpr and the assembler changes the register to 28, It is difficult to locate the error in the program. I think it would be better to report an error and ask users to use a sensible rex2 prefix.
What is "a sensible rex2 prefix" here? And how is the use here any different
from similar uses not involving eGPR-s? They're as dangerous there, imo. I
have no insight into why these explicit rex* prefixes were introduced back
at the time. But they're there, so they ought to function.
That said, diagnosing the uses instead is an option. Yet why would we then
want to be inconsistent and diagnose the 3 forms above, while not diagnosing
the 4th (rex.w) that you didn't comment on (and chopped off)?
Jan
@@ -4526,7 +4526,8 @@ build_rex2_prefix (void)
i.vex.bytes[0] = 0xd5;
/* For the W R X B bits, the variables of rex prefix will be reused. */
i.vex.bytes[1] = ((i.tm.opcode_space << 7)
- | (i.rex2 << 4) | i.rex);
+ | (i.rex2 << 4)
+ | ((i.rex | i.prefix[REX_PREFIX]) & 0xf));
}
/* Build the EVEX prefix (4-byte) for evex insn
@@ -4675,6 +4676,7 @@ static void establish_rex (void)
build_rex2_prefix ();
/* The individual REX.RXBW bits got consumed. */
i.rex &= REX_OPCODE;
+ i.prefix[REX_PREFIX] = 0;
}
else if (i.rex != 0)
add_prefix (REX_OPCODE | i.rex);
@@ -80,4 +80,8 @@ Disassembly of section .text:
[ ]*[a-f0-9]+:[ ]*d5 76 8d 7c 20 01 lea 0x1\(%r16,%r28,1\),%r31d
[ ]*[a-f0-9]+:[ ]*d5 12 8d 84 04 81 00 00 00 lea 0x81\(%r20,%r8,1\),%eax
[ ]*[a-f0-9]+:[ ]*d5 57 8d bc 04 81 00 00 00 lea 0x81\(%r28,%r8,1\),%r31d
+[ ]*[a-f0-9]+:[ ]*d5 14 f7 14 24 \{rex2 0x14\} notl \(%r20\)
+[ ]*[a-f0-9]+:[ ]*d5 12 f7 14 24 notl \(%r20,%r12,1\)
+[ ]*[a-f0-9]+:[ ]*d5 11 f7 14 24 notl \(%r28\)
+[ ]*[a-f0-9]+:[ ]*d5 18 f7 14 24 notq \(%r20\)
#pass
@@ -83,3 +83,9 @@ _start:
leal 1(%r16, %r28), %r31d
leal 129(%r20, %r8), %eax
leal 129(%r28, %r8), %r31d
+
+## explicit REX prefix
+ rex.r notl (%r20)
+ rex.x notl (%r20)
+ rex.b notl (%r20)
+ rex.w not (%r20)