[v2,1/4] x86: zap value-less Disp8MemShift from non-EVEX templates

Message ID 41696d73-4114-4adc-8491-bffddc5d72cc@suse.com
State New
Headers
Series x86/APX: respect -msse2avx |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_binutils_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_binutils_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_binutils_check--master-arm success Testing passed
linaro-tcwg-bot/tcwg_binutils_check--master-aarch64 success Testing passed

Commit Message

Jan Beulich April 19, 2024, 9:37 a.m. UTC
  In order to allow to continue to use templatized SSE2AVX templates when
enhancing those to also cover eGPR usage, Disp8MemShift wants using to
deviate from what general template attributes supply. That requires
using Disp8MemShift in a way also affecting non-EVEX templates, yet
having this attribute set would so far implicitly mean EVEX encoding.
Recognize the case and instead zap the attribute if no other attribute
indicates EVEX encoding.

No change in generated tables.
---
This could be folded into the subsequent "x86/APX: extend SSE2AVX
coverage", but I think it's better kept separate for being a little
subtle.
  

Comments

Cui, Lili April 24, 2024, 6:49 a.m. UTC | #1
> In order to allow to continue to use templatized SSE2AVX templates when
> enhancing those to also cover eGPR usage, Disp8MemShift wants using to
> deviate from what general template attributes supply. That requires using
> Disp8MemShift in a way also affecting non-EVEX templates, yet having this
> attribute set would so far implicitly mean EVEX encoding.
> Recognize the case and instead zap the attribute if no other attribute
> indicates EVEX encoding.
> 

I'm confused about this patch, is it related to the movsd template? You removed the "Masking" for it and only left Disp8MemShift, but I thought it still belongs to EVEX template. 

+movsd, 0xf210, AVX512F, D|Modrm|EVexLIG|Space0F|VexW1|Disp8MemShift=3|NoSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }

> No change in generated tables.
> ---
> This could be folded into the subsequent "x86/APX: extend SSE2AVX
> coverage", but I think it's better kept separate for being a little subtle.
> 
> --- a/opcodes/i386-gen.c
> +++ b/opcodes/i386-gen.c
> @@ -1126,6 +1126,7 @@ process_i386_opcode_modifier (FILE *tabl
>  			      char **opnd, int lineno, bool rex2_disallowed)  {
>    char *str, *next, *last;
> +  bool disp8_shift_derived = false;
>    bitfield modifiers [ARRAY_SIZE (opcode_modifiers)];
>    static const char *const spaces[] = {  #define SPACE(n) [SPACE_##n] = #n
> @@ -1190,7 +1191,10 @@ process_i386_opcode_modifier (FILE *tabl
>  	      if (strcasecmp(str, "Broadcast") == 0)
>  		val = get_element_size (opnd, lineno) + BYTE_BROADCAST;
>  	      else if (strcasecmp(str, "Disp8MemShift") == 0)
> -		val = get_element_size (opnd, lineno);
> +		{
> +		  val = get_element_size (opnd, lineno);
> +		  disp8_shift_derived = true;
> +		}
> 
>  	      set_bitfield (str, modifiers, val, ARRAY_SIZE (modifiers),
>  			    lineno);
> @@ -1243,13 +1247,21 @@ process_i386_opcode_modifier (FILE *tabl
> 
>    /* Rather than evaluating multiple conditions at runtime to determine
>       whether an EVEX encoding is being dealt with, derive that information
> -     right here.  A missing EVex attribute means "dynamic".  */
> -  if (!modifiers[EVex].value
> -      && (modifiers[Disp8MemShift].value
> -	  || modifiers[Broadcast].value
> +     right here.  A missing EVex attribute means "dynamic".  There's one
> +     exception though: A value-less Disp8MemShift needs zapping rather than
> +     respecting if no other attribute indicates EVEX encoding.  This is for
> +     certain SSE2AVX templatized templates to work reasonably.  */  if
> + (!modifiers[EVex].value)
> +    {
> +      if (modifiers[Broadcast].value
>  	  || modifiers[Masking].value
> -	  || modifiers[SAE].value))
> -    modifiers[EVex].value = EVEXDYN;
> +	  || modifiers[SAE].value)
> +	modifiers[EVex].value = EVEXDYN;
> +      else if (disp8_shift_derived)
> +	modifiers[Disp8MemShift].value = 0;
> +      else if (modifiers[Disp8MemShift].value)
> +	modifiers[EVex].value = EVEXDYN;
> +    }
> 
 
Why not just delete the Disp8MemShift? Maybe I missed something.

  if (!modifiers[EVex].value
      && (modifiers[Broadcast].value
          || modifiers[Masking].value
          || modifiers[SAE].value))
    modifiers[EVex].value = EVEXDYN;

Lili.
  
Jan Beulich April 24, 2024, 7:15 a.m. UTC | #2
On 24.04.2024 08:49, Cui, Lili wrote:
>> In order to allow to continue to use templatized SSE2AVX templates when
>> enhancing those to also cover eGPR usage, Disp8MemShift wants using to
>> deviate from what general template attributes supply. That requires using
>> Disp8MemShift in a way also affecting non-EVEX templates, yet having this
>> attribute set would so far implicitly mean EVEX encoding.
>> Recognize the case and instead zap the attribute if no other attribute
>> indicates EVEX encoding.
>>
> 
> I'm confused about this patch, is it related to the movsd template? You removed the "Masking" for it and only left Disp8MemShift, but I thought it still belongs to EVEX template. 
> 
> +movsd, 0xf210, AVX512F, D|Modrm|EVexLIG|Space0F|VexW1|Disp8MemShift=3|NoSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }

There's no "masking" in an SSE2AVX template. Use of masking in a legacy-form
input instruction is simply wrong, and wants rejecting.

>> --- a/opcodes/i386-gen.c
>> +++ b/opcodes/i386-gen.c
>> @@ -1126,6 +1126,7 @@ process_i386_opcode_modifier (FILE *tabl
>>  			      char **opnd, int lineno, bool rex2_disallowed)  {
>>    char *str, *next, *last;
>> +  bool disp8_shift_derived = false;
>>    bitfield modifiers [ARRAY_SIZE (opcode_modifiers)];
>>    static const char *const spaces[] = {  #define SPACE(n) [SPACE_##n] = #n
>> @@ -1190,7 +1191,10 @@ process_i386_opcode_modifier (FILE *tabl
>>  	      if (strcasecmp(str, "Broadcast") == 0)
>>  		val = get_element_size (opnd, lineno) + BYTE_BROADCAST;
>>  	      else if (strcasecmp(str, "Disp8MemShift") == 0)
>> -		val = get_element_size (opnd, lineno);
>> +		{
>> +		  val = get_element_size (opnd, lineno);
>> +		  disp8_shift_derived = true;
>> +		}
>>
>>  	      set_bitfield (str, modifiers, val, ARRAY_SIZE (modifiers),
>>  			    lineno);
>> @@ -1243,13 +1247,21 @@ process_i386_opcode_modifier (FILE *tabl
>>
>>    /* Rather than evaluating multiple conditions at runtime to determine
>>       whether an EVEX encoding is being dealt with, derive that information
>> -     right here.  A missing EVex attribute means "dynamic".  */
>> -  if (!modifiers[EVex].value
>> -      && (modifiers[Disp8MemShift].value
>> -	  || modifiers[Broadcast].value
>> +     right here.  A missing EVex attribute means "dynamic".  There's one
>> +     exception though: A value-less Disp8MemShift needs zapping rather than
>> +     respecting if no other attribute indicates EVEX encoding.  This is for
>> +     certain SSE2AVX templatized templates to work reasonably.  */  if
>> + (!modifiers[EVex].value)
>> +    {
>> +      if (modifiers[Broadcast].value
>>  	  || modifiers[Masking].value
>> -	  || modifiers[SAE].value))
>> -    modifiers[EVex].value = EVEXDYN;
>> +	  || modifiers[SAE].value)
>> +	modifiers[EVex].value = EVEXDYN;
>> +      else if (disp8_shift_derived)
>> +	modifiers[Disp8MemShift].value = 0;
>> +      else if (modifiers[Disp8MemShift].value)
>> +	modifiers[EVex].value = EVEXDYN;
>> +    }
>>
>  
> Why not just delete the Disp8MemShift? Maybe I missed something.
> 
>   if (!modifiers[EVex].value
>       && (modifiers[Broadcast].value
>           || modifiers[Masking].value
>           || modifiers[SAE].value))
>     modifiers[EVex].value = EVEXDYN;

There are templates where Disp8MemShift is the only indication of EVEX
encoding, e.g. VMOVNTDQ and VMOVNTDQA. Those could all gain EVexDYN (I
think), but that would be a somewhat more intrusive change then. I'm
willing to be convinced of going that route (assuming it actually
would work out).

Jan
  
Cui, Lili April 24, 2024, 1:15 p.m. UTC | #3
> On 24.04.2024 08:49, Cui, Lili wrote:
> >> In order to allow to continue to use templatized SSE2AVX templates
> >> when enhancing those to also cover eGPR usage, Disp8MemShift wants
> >> using to deviate from what general template attributes supply. That
> >> requires using Disp8MemShift in a way also affecting non-EVEX
> >> templates, yet having this attribute set would so far implicitly mean EVEX
> encoding.
> >> Recognize the case and instead zap the attribute if no other
> >> attribute indicates EVEX encoding.
> >>
> >
> > I'm confused about this patch, is it related to the movsd template? You
> removed the "Masking" for it and only left Disp8MemShift, but I thought it
> still belongs to EVEX template.
> >
> > +movsd, 0xf210, AVX512F,
> > +D|Modrm|EVexLIG|Space0F|VexW1|Disp8MemShift=3|NoSuf|SSE2AVX,
> {
> > +Qword|Unspecified|BaseIndex, RegXMM }
> 
> There's no "masking" in an SSE2AVX template. Use of masking in a legacy-
> form input instruction is simply wrong, and wants rejecting.
> 

Is this patch intended to fix the following situation? If not, could you give an example?

<sse41:cpu:attr:scal:vvvv, $avx:AVX:Vex128|VexW0|SSE2AVX:VexLIG|VexW0|SSE2AVX:VexVVVV, $sse:SSE4_1:::>

+<SSE41D:cpu:attr:scal:vvvv, +
+    $avx:AVX|AVX512VL:Vex128|EVex128|VexW0|Disp8MemShift=4|SSE2AVX:VexLIG|EVexLIG|VexW0|Disp8MemShift=2|SSE2AVX:VexVVVV, +
+    $sse:SSE4_1:::>

-insertps<sse41>, 0x660f3a21, <sse41:cpu>, Modrm|<sse41:attr>|<sse41:vvvv>|NoSuf, { Imm8, Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
+insertps<SSE41D>, 0x660f3a21, <SSE41D:cpu>, Modrm|<SSE41D:attr>|<SSE41D:vvvv>|Disp8MemShift|NoSuf, { Imm8, Dword|Unspecified|BaseIndex|RegXMM, RegXMM }

I'm confused why the Disp8MemShift is added here.

> >> --- a/opcodes/i386-gen.c
> >> +++ b/opcodes/i386-gen.c
> >> @@ -1126,6 +1126,7 @@ process_i386_opcode_modifier (FILE *tabl
> >>  			      char **opnd, int lineno, bool rex2_disallowed)  {
> >>    char *str, *next, *last;
> >> +  bool disp8_shift_derived = false;
> >>    bitfield modifiers [ARRAY_SIZE (opcode_modifiers)];
> >>    static const char *const spaces[] = {  #define SPACE(n)
> >> [SPACE_##n] = #n @@ -1190,7 +1191,10 @@
> process_i386_opcode_modifier (FILE *tabl
> >>  	      if (strcasecmp(str, "Broadcast") == 0)
> >>  		val = get_element_size (opnd, lineno) + BYTE_BROADCAST;
> >>  	      else if (strcasecmp(str, "Disp8MemShift") == 0)
> >> -		val = get_element_size (opnd, lineno);
> >> +		{
> >> +		  val = get_element_size (opnd, lineno);
> >> +		  disp8_shift_derived = true;
> >> +		}
> >>
> >>  	      set_bitfield (str, modifiers, val, ARRAY_SIZE (modifiers),
> >>  			    lineno);
> >> @@ -1243,13 +1247,21 @@ process_i386_opcode_modifier (FILE *tabl
> >>
> >>    /* Rather than evaluating multiple conditions at runtime to determine
> >>       whether an EVEX encoding is being dealt with, derive that information
> >> -     right here.  A missing EVex attribute means "dynamic".  */
> >> -  if (!modifiers[EVex].value
> >> -      && (modifiers[Disp8MemShift].value
> >> -	  || modifiers[Broadcast].value
> >> +     right here.  A missing EVex attribute means "dynamic".  There's one
> >> +     exception though: A value-less Disp8MemShift needs zapping rather
> than
> >> +     respecting if no other attribute indicates EVEX encoding.  This is for
> >> +     certain SSE2AVX templatized templates to work reasonably.  */
> >> + if
> >> + (!modifiers[EVex].value)
> >> +    {
> >> +      if (modifiers[Broadcast].value
> >>  	  || modifiers[Masking].value
> >> -	  || modifiers[SAE].value))
> >> -    modifiers[EVex].value = EVEXDYN;
> >> +	  || modifiers[SAE].value)
> >> +	modifiers[EVex].value = EVEXDYN;
> >> +      else if (disp8_shift_derived)
> >> +	modifiers[Disp8MemShift].value = 0;
> >> +      else if (modifiers[Disp8MemShift].value)
> >> +	modifiers[EVex].value = EVEXDYN;
> >> +    }
> >>
> >
> > Why not just delete the Disp8MemShift? Maybe I missed something.
> >
> >   if (!modifiers[EVex].value
> >       && (modifiers[Broadcast].value
> >           || modifiers[Masking].value
> >           || modifiers[SAE].value))
> >     modifiers[EVex].value = EVEXDYN;
> 
> There are templates where Disp8MemShift is the only indication of EVEX
> encoding, e.g. VMOVNTDQ and VMOVNTDQA. Those could all gain EVexDYN
> (I think), but that would be a somewhat more intrusive change then. I'm
> willing to be convinced of going that route (assuming it actually would work
> out).
> 
Oh, got it, they still need Disp8MemShift for EVEX, that makes sense.

Lili.
  
Jan Beulich April 24, 2024, 1:51 p.m. UTC | #4
On 24.04.2024 15:15, Cui, Lili wrote:
>> On 24.04.2024 08:49, Cui, Lili wrote:
>>>> In order to allow to continue to use templatized SSE2AVX templates
>>>> when enhancing those to also cover eGPR usage, Disp8MemShift wants
>>>> using to deviate from what general template attributes supply. That
>>>> requires using Disp8MemShift in a way also affecting non-EVEX
>>>> templates, yet having this attribute set would so far implicitly mean EVEX
>> encoding.
>>>> Recognize the case and instead zap the attribute if no other
>>>> attribute indicates EVEX encoding.
>>>>
>>>
>>> I'm confused about this patch, is it related to the movsd template? You
>> removed the "Masking" for it and only left Disp8MemShift, but I thought it
>> still belongs to EVEX template.
>>>
>>> +movsd, 0xf210, AVX512F,
>>> +D|Modrm|EVexLIG|Space0F|VexW1|Disp8MemShift=3|NoSuf|SSE2AVX,
>> {
>>> +Qword|Unspecified|BaseIndex, RegXMM }
>>
>> There's no "masking" in an SSE2AVX template. Use of masking in a legacy-
>> form input instruction is simply wrong, and wants rejecting.
>>
> 
> Is this patch intended to fix the following situation?

Yes, this is one of the cases where the change particularly matters.

> If not, could you give an example?
> 
> <sse41:cpu:attr:scal:vvvv, $avx:AVX:Vex128|VexW0|SSE2AVX:VexLIG|VexW0|SSE2AVX:VexVVVV, $sse:SSE4_1:::>
> 
> +<SSE41D:cpu:attr:scal:vvvv, +
> +    $avx:AVX|AVX512VL:Vex128|EVex128|VexW0|Disp8MemShift=4|SSE2AVX:VexLIG|EVexLIG|VexW0|Disp8MemShift=2|SSE2AVX:VexVVVV, +
> +    $sse:SSE4_1:::>
> 
> -insertps<sse41>, 0x660f3a21, <sse41:cpu>, Modrm|<sse41:attr>|<sse41:vvvv>|NoSuf, { Imm8, Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
> +insertps<SSE41D>, 0x660f3a21, <SSE41D:cpu>, Modrm|<SSE41D:attr>|<SSE41D:vvvv>|Disp8MemShift|NoSuf, { Imm8, Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
> 
> I'm confused why the Disp8MemShift is added here.

<SSE41D:attr> includes Disp8MemShift=4, which would be wrong here. And
<SSE41D:scal> also cannot be used, for specifying VexLIG|EVexLIG. Hence
Disp8MemShift without a value is used to override the earlier setting
(coming from <SSE41D:attr>). Yet by having it here, the legacy (non-
SSE2AVX) template would also have that attribute, which would be wrong
(we'd infer EVEX encoding from its presence). Hence why such specific
instances need zapping in i386-gen.

Jan
  
Cui, Lili April 25, 2024, 5:51 a.m. UTC | #5
> On 24.04.2024 15:15, Cui, Lili wrote:
> >> On 24.04.2024 08:49, Cui, Lili wrote:
> >>>> In order to allow to continue to use templatized SSE2AVX templates
> >>>> when enhancing those to also cover eGPR usage, Disp8MemShift wants
> >>>> using to deviate from what general template attributes supply. That
> >>>> requires using Disp8MemShift in a way also affecting non-EVEX
> >>>> templates, yet having this attribute set would so far implicitly
> >>>> mean EVEX
> >> encoding.
> >>>> Recognize the case and instead zap the attribute if no other
> >>>> attribute indicates EVEX encoding.
> >>>>
> >>>
> >>> I'm confused about this patch, is it related to the movsd template?
> >>> You
> >> removed the "Masking" for it and only left Disp8MemShift, but I
> >> thought it still belongs to EVEX template.
> >>>
> >>> +movsd, 0xf210, AVX512F,
> >>> +D|Modrm|EVexLIG|Space0F|VexW1|Disp8MemShift=3|NoSuf|SSE2AVX,
> >> {
> >>> +Qword|Unspecified|BaseIndex, RegXMM }
> >>
> >> There's no "masking" in an SSE2AVX template. Use of masking in a
> >> legacy- form input instruction is simply wrong, and wants rejecting.
> >>
> >
> > Is this patch intended to fix the following situation?
> 
> Yes, this is one of the cases where the change particularly matters.
> 
> > If not, could you give an example?
> >
> > <sse41:cpu:attr:scal:vvvv,
> > $avx:AVX:Vex128|VexW0|SSE2AVX:VexLIG|VexW0|SSE2AVX:VexVVVV,
> > $sse:SSE4_1:::>
> >
> > +<SSE41D:cpu:attr:scal:vvvv, +
> > +
> $avx:AVX|AVX512VL:Vex128|EVex128|VexW0|Disp8MemShift=4|SSE2AVX:Ve
> xLIG|EVexLIG|VexW0|Disp8MemShift=2|SSE2AVX:VexVVVV, +
> > +    $sse:SSE4_1:::>
> >
> > -insertps<sse41>, 0x660f3a21, <sse41:cpu>,
> > Modrm|<sse41:attr>|<sse41:vvvv>|NoSuf, { Imm8,
> > Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
> > +insertps<SSE41D>, 0x660f3a21, <SSE41D:cpu>,
> > +Modrm|<SSE41D:attr>|<SSE41D:vvvv>|Disp8MemShift|NoSuf, { Imm8,
> > +Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
> >
> > I'm confused why the Disp8MemShift is added here.
> 
> <SSE41D:attr> includes Disp8MemShift=4, which would be wrong here. And
> <SSE41D:scal> also cannot be used, for specifying VexLIG|EVexLIG. Hence
> Disp8MemShift without a value is used to override the earlier setting (coming
> from <SSE41D:attr>). Yet by having it here, the legacy (non-
> SSE2AVX) template would also have that attribute, which would be wrong (we'd
> infer EVEX encoding from its presence). Hence why such specific instances need
> zapping in i386-gen.
> 

Merging more templates is a trade-off. It reduces the file size but makes it more difficult to view the attributes of the instruction. This patch is ok for me, maybe we'll find a better way to improve this in the future.

Lili.
  

Patch

--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -1126,6 +1126,7 @@  process_i386_opcode_modifier (FILE *tabl
 			      char **opnd, int lineno, bool rex2_disallowed)
 {
   char *str, *next, *last;
+  bool disp8_shift_derived = false;
   bitfield modifiers [ARRAY_SIZE (opcode_modifiers)];
   static const char *const spaces[] = {
 #define SPACE(n) [SPACE_##n] = #n
@@ -1190,7 +1191,10 @@  process_i386_opcode_modifier (FILE *tabl
 	      if (strcasecmp(str, "Broadcast") == 0)
 		val = get_element_size (opnd, lineno) + BYTE_BROADCAST;
 	      else if (strcasecmp(str, "Disp8MemShift") == 0)
-		val = get_element_size (opnd, lineno);
+		{
+		  val = get_element_size (opnd, lineno);
+		  disp8_shift_derived = true;
+		}
 
 	      set_bitfield (str, modifiers, val, ARRAY_SIZE (modifiers),
 			    lineno);
@@ -1243,13 +1247,21 @@  process_i386_opcode_modifier (FILE *tabl
 
   /* Rather than evaluating multiple conditions at runtime to determine
      whether an EVEX encoding is being dealt with, derive that information
-     right here.  A missing EVex attribute means "dynamic".  */
-  if (!modifiers[EVex].value
-      && (modifiers[Disp8MemShift].value
-	  || modifiers[Broadcast].value
+     right here.  A missing EVex attribute means "dynamic".  There's one
+     exception though: A value-less Disp8MemShift needs zapping rather than
+     respecting if no other attribute indicates EVEX encoding.  This is for
+     certain SSE2AVX templatized templates to work reasonably.  */
+  if (!modifiers[EVex].value)
+    {
+      if (modifiers[Broadcast].value
 	  || modifiers[Masking].value
-	  || modifiers[SAE].value))
-    modifiers[EVex].value = EVEXDYN;
+	  || modifiers[SAE].value)
+	modifiers[EVex].value = EVEXDYN;
+      else if (disp8_shift_derived)
+	modifiers[Disp8MemShift].value = 0;
+      else if (modifiers[Disp8MemShift].value)
+	modifiers[EVex].value = EVEXDYN;
+    }
 
   /* Vex, legacy map2 and map3 and rex2_disallowed do not support EGPR.
      For templates supporting both Vex and EVex allowing EGPR.  */