[00/11] x86: NOP emission adjustments

Message ID 7ce54bc2-fef2-d2e4-21fd-202fdead0c20@suse.com
Series x86: NOP emission adjustments

Message

Jan Beulich Sept. 27, 2023, 3:46 p.m. UTC
  I've noticed a number of issues and inefficiencies.

01: x86: record flag_code in tc_frag_data
02: x86: i386_generate_nops() may not derive decisions from global variables
03: x86: don't use 32-bit LEA as NOP surrogate in 64-bit code
04: x86: don't use operand size override with NOP in 16-bit code
05: x86: respect ".arch nonop" when selecting which NOPs to emit
06: x86: i686 != PentiumPro
07: x86: don't record full i386_cpu_flags in struct i386_tc_frag_data
08: x86: add a few more NOP patterns
09: x86: fold a few of the "alternative" NOP patterns
10: x86: fold NOP testcase expectations where possible
11: gas: make .nops output visible in listing

Jan
  

Comments

Jan Beulich Sept. 27, 2023, 3:59 p.m. UTC | #1
On 27.09.2023 17:46, Jan Beulich via Binutils wrote:
> I've noticed a number of issues and inefficiencies.
> 
> 01: x86: record flag_code in tc_frag_data
> 02: x86: i386_generate_nops() may not derive decisions from global variables
> 03: x86: don't use 32-bit LEA as NOP surrogate in 64-bit code
> 04: x86: don't use operand size override with NOP in 16-bit code
> 05: x86: respect ".arch nonop" when selecting which NOPs to emit
> 06: x86: i686 != PentiumPro
> 07: x86: don't record full i386_cpu_flags in struct i386_tc_frag_data
> 08: x86: add a few more NOP patterns
> 09: x86: fold a few of the "alternative" NOP patterns
> 10: x86: fold NOP testcase expectations where possible
> 11: gas: make .nops output visible in listing

I should have mentioned one further observation: When we use LEA as a NOP
surrogate, we always use %{,e,r}si as the destination. I suspected this
might not be optimal when these actually end up being executed, and indeed
on one of the three systems I checked (a Skylake) there was a reliably
measurable difference between that and alternating the destination
registers used. The question is whether that's enough of a concern, when
generally we expect people to build 64-bit code and not use .arch .nonop.
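
To make the alternation idea concrete, here is a minimal sketch in C. It is
not taken from the series or from gas: lea_nop3() and fill_alternating() are
hypothetical helpers, and the choice of %esi and %edi as the two alternating
registers is purely an assumption for illustration. It only covers the 3-byte
32-bit-mode form "lea 0(%reg),%reg".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of gas): encode the 3-byte 32-bit-mode
   NOP surrogate "lea 0(%reg),%reg" for register number REG.  */
static void
lea_nop3 (unsigned int reg, uint8_t out[3])
{
  /* Register 4 (%esp) as a base would require a SIB byte, so it does
     not fit this 3-byte form.  */
  assert (reg < 8 && reg != 4);
  out[0] = 0x8d;                     /* LEA opcode */
  out[1] = 0x40 | (reg << 3) | reg;  /* ModRM: mod=01 (disp8), reg=rm=REG */
  out[2] = 0x00;                     /* disp8 = 0 */
}

/* Fill BUF with COUNT 3-byte surrogates, alternating the destination
   between %esi (6) and %edi (7).  The register pair is an assumption
   made for this example only.  */
static void
fill_alternating (uint8_t *buf, size_t count)
{
  static const unsigned int regs[] = { 6, 7 };

  for (size_t i = 0; i < count; i++)
    lea_nop3 (regs[i % 2], buf + 3 * i);
}
```

With a single fixed destination, each LEA reads the register the previous
one wrote, so back-to-back surrogates form a dependency chain; alternating
the destination shortens that chain, which is one plausible explanation for
the difference measured above.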

Jan