[V4,10/14] gas: synthesize CFI for hand-written asm

Message ID 20240103071526.3846985-11-indu.bhagat@oracle.com
State New
Series Synthesize CFI for hand-written asm

Checks

Context                                              Check    Description
linaro-tcwg-bot/tcwg_binutils_build--master-arm      success  Testing passed
linaro-tcwg-bot/tcwg_binutils_build--master-aarch64  success  Testing passed
linaro-tcwg-bot/tcwg_binutils_check--master-aarch64  success  Testing passed
linaro-tcwg-bot/tcwg_binutils_check--master-arm      success  Testing passed

Commit Message

Indu Bhagat Jan. 3, 2024, 7:15 a.m. UTC
  [Changes from V3 to V4]
  - x86 backend: detect if any unhandled instructions affect the stack
    pointer implicitly.  Bail out if so.
  - x86 backend: get rid of insn as argument in all calls for ginsn
    creation.
  - x86 backend: bugfix in add insn translation.
  - scfi: ginsn: all symbol usages are const.  Update args to reflect
    the same.
  - scfi: ginsn: print info about GINSN_TYPE_OTHER in ginsn_print ().
  - scfi: include GINSN_TYPE_OTHER in handling to detect stack pointer
    update.
  - scfi: treat failures in SCFI forward and backward passes as errors.
  - Fix build failure with --target=x86_64-w64-mingw32 (reported by
    Nick) by guarding the offending construct with
    (OBJ_ELF || MAYBE_OBJ_ELF).
  - ginsn: Add new API gcfg_print () to print CFG for (development time)
    debugging.
  - gas: hand-crafting instructions with .byte together with SCFI is not
    supported.
  - Other code comments fixup.
[End of changes from V3 to V4]

[Changes from V2 to V3]
  - Some bugfixes in the SCFI machinery.
  - Addressed most of Jan's review comments.  A few highlights:
    + Handle more instructions: LOOP*, J{E,R}CXZ, ENTER, opcodes 0x3 and
      0x2b and 0x8a, popf.
    + Removed some gas_assert, now that we have more clarity on how to
      completely handle all variants of some opcodes.
    + Handled some specific cases brought up in the review (e.g., add
      %eax, symbol etc.)
    + Add operand size override for all push/pop variants: push reg, pop
      reg, push fs, push gs, push imm, push from mem, pop to mem.
    + Add a new function to skip some instructions from SCFI: Skip cmp
      and test insns.
    + ginsn representation: Use offsetT datatype for imm/disp, use
      unsigned int for reg.  Rename some args from src_val to use a more
      appropriate name of src_reg.
    + Other improvements in code layout, code comments and formatting.
      Added a comment with x86_ginsn_new () for explaining what
      instructions are necessary to be visited for SCFI and other details.
    + Other review comments.
  - Pending items:
    + "Adjust the config/tc-i386.h such that e.g. COFF targets don't
      needlessly have a large set of dead code compiled in."
    + Streamline comments in scfi.c so that all constraints are easy to
      find.
[End of changes from V2 to V3]

This patch adds support in GAS to create generic GAS instructions
(a.k.a., the ginsn) for the x86 backend (AMD64 ABI only at this time).
Using this ginsn infrastructure, GAS can then synthesize CFI for
hand-written asm for x86_64.

A ginsn is a target-independent representation of the machine
instructions.  One machine instruction may need one or more ginsn.

This patch also adds skeleton support for printing ginsn in the listing
output for debugging purposes.

Since the current use-case of ginsn is to synthesize CFI, the x86 target
needs to generate ginsns necessary for the following machine
instructions only:

 - All change-of-flow instructions, including all conditional and
   unconditional branches, calls and returns from functions.
 - All register saves to and restores from the stack.
 - All instructions affecting the two registers that could potentially
   be used as the base register for CFA tracking.  For SCFI, the base
   register for CFA tracking is limited to REG_SP and REG_FP only for
   now.

The representation of ginsn is kept simple:

- A GAS instruction has GINSN_NUM_SRC_OPNDS (defined to be 2 at this
  time) source operands and one destination operand.
- A GAS instruction uses DWARF register numbers in its representation
  and does not track register size.
- GAS instructions carry location information (file name and line
  number).
- GAS instructions are identified by a natural number assigned in order
  of their addition to the list.  This can be used as a proxy for the
  static program order of the corresponding machine instructions.
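
As an illustration only, the properties listed above could be modelled
with a record like the one below.  This is a sketch under assumptions,
not the contents of ginsn.h: every toy_* name is hypothetical, and only
GINSN_NUM_SRC_OPNDS and its value of 2 are taken from the description.

```c
#include <assert.h>
#include <stddef.h>

#define GINSN_NUM_SRC_OPNDS 2	/* Value stated in the description.  */

/* Hypothetical operand kinds; the real enums live in ginsn.h.  */
enum toy_opnd_kind
{
  TOY_OPND_NONE,
  TOY_OPND_REG,
  TOY_OPND_IMM,
  TOY_OPND_INDIRECT
};

struct toy_opnd
{
  enum toy_opnd_kind kind;
  unsigned int reg;	/* DWARF register number; size is not tracked.  */
  long imm_or_disp;	/* Immediate or displacement, if any.  */
};

struct toy_ginsn
{
  unsigned int id;	/* Natural number, assigned on append.  */
  const char *file;	/* Location information.  */
  unsigned int line;
  struct toy_opnd src[GINSN_NUM_SRC_OPNDS];
  struct toy_opnd dst;
  struct toy_ginsn *next;
};

struct toy_ginsn_list
{
  struct toy_ginsn *head;
  struct toy_ginsn *tail;
  unsigned int last_id;
};

/* Append G to LIST and give it the next ID, so that IDs follow the
   static program order of the machine instructions.  */
static void
toy_ginsn_append (struct toy_ginsn_list *list, struct toy_ginsn *g)
{
  assert (g != NULL && g->next == NULL);
  g->id = ++list->last_id;
  if (list->tail != NULL)
    list->tail->next = g;
  else
    list->head = g;
  list->tail = g;
}
```

Because the ID is assigned at append time, comparing two IDs is enough
to tell which of two ginsns comes first in the source.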

Note that the GAS instruction (ginsn) format does not support
GINSN_TYPE_PUSH and GINSN_TYPE_POP.  Some architectures, like aarch64,
do not have push and pop instructions, but rather STP/LDP/STR/LDR etc.
instructions.  Further, these instructions have a variety of addressing
modes, like offset, pre-indexing and post-indexing etc.  Among other
things, one of the differences in these addressing modes is _when_ the
addr register is updated with the result of the address calculation:
before or after the memory operation.  To best support such needs, the
generic instructions like GINSN_TYPE_LOAD, GINSN_TYPE_STORE together
with GINSN_TYPE_ADD, and GINSN_TYPE_SUB may be used.
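
A minimal sketch of that decomposition, assuming a toy ginsn record
(all toy_* names are hypothetical and are not the ginsn.h API): a push
becomes a SUB on the stack pointer followed by a STORE, and a pop
becomes a LOAD followed by an ADD.  The order of the two ginsns is what
encodes whether the address register is updated before or after the
memory access.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_REG_SP 7	/* DWARF number of %rsp on x86_64.  */

enum toy_type { TOY_ADD, TOY_SUB, TOY_LOAD, TOY_STORE };

struct toy_ginsn
{
  enum toy_type type;
  unsigned int reg;	/* Data reg for LOAD/STORE, updated reg for ADD/SUB.  */
  unsigned int base;	/* Address reg for LOAD/STORE; unused otherwise.  */
  long imm;		/* Amount for ADD/SUB, displacement for LOAD/STORE.  */
};

/* push REG: the stack pointer moves first, then the store happens.  */
static size_t
toy_push (unsigned int reg, long size, struct toy_ginsn *out)
{
  out[0] = (struct toy_ginsn) { TOY_SUB, TOY_REG_SP, 0, size };
  out[1] = (struct toy_ginsn) { TOY_STORE, reg, TOY_REG_SP, 0 };
  return 2;
}

/* pop REG: the load happens first, then the stack pointer moves.  */
static size_t
toy_pop (unsigned int reg, long size, struct toy_ginsn *out)
{
  out[0] = (struct toy_ginsn) { TOY_LOAD, reg, TOY_REG_SP, 0 };
  out[1] = (struct toy_ginsn) { TOY_ADD, TOY_REG_SP, 0, size };
  return 2;
}

/* Net change to the stack pointer caused by the ADD/SUB ginsns in SEQ.  */
static long
toy_sp_delta (const struct toy_ginsn *seq, size_t n)
{
  long delta = 0;

  for (size_t k = 0; k < n; k++)
    {
      if (seq[k].reg != TOY_REG_SP)
	continue;
      if (seq[k].type == TOY_ADD)
	delta += seq[k].imm;
      else if (seq[k].type == TOY_SUB)
	delta -= seq[k].imm;
    }
  return delta;
}
```

A pre-indexed store on aarch64 would map to the same SUB-then-STORE
order as toy_push, while a post-indexed one would emit the STORE first.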

The functionality provided in ginsn.c and scfi.c is compiled in when a
target defines TARGET_USE_SCFI and TARGET_USE_GINSN.  This can be
revisited later when there are other use-cases of creating ginsn's in
GAS, apart from the current use-case of synthesizing CFI for
hand-written asm.

Support is added only for the AMD64 ABI at this time.  If the user
specifies --scfi --32, GAS issues an error:

  "Fatal error: Synthesizing CFI is not supported for this ABI"

For synthesizing (DWARF) CFI, the SCFI machinery requires the programmer
to adhere to some pre-requisites for their asm:
   - Hand-written asm block must begin with a .type   foo, @function
It is additionally highly recommended to ensure that:
   - Hand-written asm block ends with a .size foo, .-foo

The SCFI machinery encodes some rules which align with the standard
calling convention specified by the ABI.  Apart from the rules, the SCFI
machinery employs some heuristics.  For example:
   - The base register for CFA tracking may be either REG_SP or REG_FP.
   - If the base register for CFA tracking is REG_SP, the precise amount of
     stack usage (and hence, the value of REG_SP) must be known at all times.
   - If using dynamic stack allocation, the function must switch to
     FP-based CFA.  This means using instructions like the following (in
     AMD64) in prologue:
        pushq   %rbp
        movq    %rsp, %rbp
     and analogous instructions in epilogue.
   - Save and Restore of callee-saved registers must be symmetrical.
     However, the SCFI machinery at this time only warns if any such
     asymmetry is seen.

These heuristics/rules are architecture-independent and are meant to be
employed for all architectures/ABIs using SCFI in the future.
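
The first three heuristics can be sketched as a small state machine.
This is an illustrative model only, not the logic in scfi.c: the toy_*
names are hypothetical, the initial offset of 8 is an assumption for
what SCFI_INIT_CFA_OFFSET stands for on AMD64 (CFA = %rsp + 8 at
function entry), and switching the base as soon as the move is seen is
a simplification.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_base { TOY_BASE_SP, TOY_BASE_FP };

struct toy_cfa
{
  enum toy_base base;	/* Register the CFA is computed from.  */
  long offset;		/* CFA = base + offset.  */
  bool traceable;	/* False once the CFA can no longer be known.  */
};

/* State at function entry (assumed: CFA = %rsp + 8).  */
static struct toy_cfa
toy_cfa_init (void)
{
  return (struct toy_cfa) { TOY_BASE_SP, 8, true };
}

/* The stack grew by a known number of BYTES (push, sub $imm, %rsp).  */
static void
toy_sp_grow (struct toy_cfa *s, long bytes)
{
  if (s->base == TOY_BASE_SP)
    s->offset += bytes;
}

/* The stack pointer changed by an unknown amount (dynamic allocation).
   This is only a problem while the CFA is still SP-based.  */
static void
toy_sp_grow_unknown (struct toy_cfa *s)
{
  if (s->base == TOY_BASE_SP)
    s->traceable = false;
}

/* movq %rsp, %rbp: switch to FP-based CFA tracking.  */
static void
toy_mov_sp_to_fp (struct toy_cfa *s)
{
  if (s->base == TOY_BASE_SP)
    s->base = TOY_BASE_FP;
}
```

With the prologue shown above (pushq %rbp; movq %rsp, %rbp), the model
ends up at CFA = REG_FP + 16, after which a dynamic stack allocation no
longer makes the CFA untraceable.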

gas/
	* Makefile.am: Add new files.
	* Makefile.in: Regenerated.
	* as.c (defined): Handle documentation and listing option for
	ginsns and SCFI.
	* config/obj-elf.c (obj_elf_size): Invoke ginsn_data_end.
	(obj_elf_type): Invoke ginsn_data_begin.
	* config/tc-i386.c (x86_scfi_callee_saved_p): New function.
	(ginsn_prefix_66H_p): Likewise.
	(ginsn_dw2_regnum): Likewise.
	(x86_ginsn_addsub_reg_mem): Likewise.
	(x86_ginsn_addsub_mem_reg): Likewise.
	(x86_ginsn_alu_imm): Likewise.
	(x86_ginsn_move): Likewise.
	(x86_ginsn_lea): Likewise.
	(x86_ginsn_jump): Likewise.
	(x86_ginsn_jump_cond): Likewise.
	(x86_ginsn_enter): Likewise.
	(x86_ginsn_safe_to_skip): Likewise.
	(x86_ginsn_unhandled): Likewise.
	(x86_ginsn_new): New functionality to generate ginsns.
	(md_assemble): Invoke x86_ginsn_new.
	(s_insn): Likewise.
	(i386_target_format): Add hard error for usage of --scfi with non AMD64 ABIs.
	* config/tc-i386.h (TARGET_USE_GINSN): New definition.
	(TARGET_USE_SCFI): Likewise.
	(SCFI_MAX_REG_ID): Likewise.
	(REG_FP): Likewise.
	(REG_SP): Likewise.
	(SCFI_INIT_CFA_OFFSET): Likewise.
	(SCFI_CALLEE_SAVED_REG_P): Likewise.
	(x86_scfi_callee_saved_p): Likewise.
	* gas/listing.h (LISTING_GINSN_SCFI): New define for ginsn and
	SCFI.
	* gas/read.c (read_a_source_file): Close SCFI processing at end
	of file read.
	* gas/scfidw2gen.c (scfi_process_cfi_label): Add implementation.
	(scfi_process_cfi_signal_frame): Likewise.
	* subsegs.h (struct frch_ginsn_data): New forward declaration.
	(struct frchain): New member for ginsn data.
	* gas/subsegs.c (subseg_set_rest): Initialize the new member.
	* symbols.c (colon): Invoke ginsn_frob_label to convey
	user-defined labels to ginsn infrastructure.
	* ginsn.c: New file.
	* ginsn.h: New file.
	* scfi.c: New file.
	* scfi.h: New file.
---
 gas/Makefile.am      |    4 +
 gas/Makefile.in      |   19 +-
 gas/as.c             |    5 +
 gas/config/obj-elf.c |   16 +
 gas/config/tc-i386.c |  996 +++++++++++++++++++++++++++++++++
 gas/config/tc-i386.h |   21 +
 gas/ginsn.c          | 1259 ++++++++++++++++++++++++++++++++++++++++++
 gas/ginsn.h          |  384 +++++++++++++
 gas/listing.h        |    1 +
 gas/read.c           |   10 +
 gas/scfi.c           | 1232 +++++++++++++++++++++++++++++++++++++++++
 gas/scfi.h           |   38 ++
 gas/scfidw2gen.c     |   28 +-
 gas/subsegs.c        |    1 +
 gas/subsegs.h        |    2 +
 gas/symbols.c        |    3 +
 16 files changed, 4009 insertions(+), 10 deletions(-)
 create mode 100644 gas/ginsn.c
 create mode 100644 gas/ginsn.h
 create mode 100644 gas/scfi.c
 create mode 100644 gas/scfi.h
  

Comments

Jan Beulich Jan. 5, 2024, 1:58 p.m. UTC | #1
On 03.01.2024 08:15, Indu Bhagat wrote:
> --- a/gas/config/obj-elf.c
> +++ b/gas/config/obj-elf.c
> @@ -24,6 +24,7 @@
>  #include "subsegs.h"
>  #include "obstack.h"
>  #include "dwarf2dbg.h"
> +#include "ginsn.h"
>  
>  #ifndef ECOFF_DEBUGGING
>  #define ECOFF_DEBUGGING 0
> @@ -2311,6 +2312,13 @@ obj_elf_size (int ignore ATTRIBUTE_UNUSED)
>        symbol_get_obj (sym)->size = XNEW (expressionS);
>        *symbol_get_obj (sym)->size = exp;
>      }
> +
> +  /* If the symbol in the directive matches the current function being
> +     processed, indicate end of the current stream of ginsns.  */
> +  if (flag_synth_cfi
> +      && S_IS_FUNCTION (sym) && sym == ginsn_data_func_symbol ())
> +    ginsn_data_end (symbol_temp_new_now ());
> +
>    demand_empty_rest_of_line ();
>  }
>  
> @@ -2499,6 +2507,14 @@ obj_elf_type (int ignore ATTRIBUTE_UNUSED)
>  	elfsym->symbol.flags &= ~mask;
>      }
>  
> +  if (S_IS_FUNCTION (sym) && flag_synth_cfi)
> +    {
> +      /* Wrap up processing the previous block of ginsns first.  */
> +      if (frchain_now->frch_ginsn_data)
> +	ginsn_data_end (symbol_temp_new_now ());
> +      ginsn_data_begin (sym);
> +    }
> +
>    demand_empty_rest_of_line ();
>  }

Documentation about .type and .size use could be more precise. Generally
it is entirely benign where exactly these directives live relative to
the code they annotate.

> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -30,6 +30,7 @@
>  #include "subsegs.h"
>  #include "dwarf2dbg.h"
>  #include "dw2gencfi.h"
> +#include "scfi.h"
>  #include "gen-sframe.h"
>  #include "sframe.h"
>  #include "elf/x86-64.h"
> @@ -5287,6 +5288,978 @@ static INLINE bool may_need_pass2 (const insn_template *t)
>  	       && t->base_opcode == 0x63);
>  }
>  
> +bool
> +x86_scfi_callee_saved_p (unsigned int dw2reg_num)

Iirc SCFI is ELF-only. We're not in ELF-only code here, though. Non-ELF
gas, as indicated before, would better not carry any unreachable code.

> +{
> +  if (dw2reg_num == 3 /* rbx.  */
> +      || dw2reg_num == REG_FP /* rbp.  */
> +      || dw2reg_num == REG_SP /* rsp.  */
> +      || (dw2reg_num >= 12 && dw2reg_num <= 15) /* r12 - r15.  */)
> +    return true;
> +
> +  return false;
> +}

This entire function is SysV-ABI-specific without this being made clear
by a comment.

> +/* Check whether a '66H' prefix accompanies the instruction.

With APX 16-bit operand size isn't necessarily represented by a 66h
prefix, but perhaps with an "embedded prefix" inside the EVEX one.
Therefore both the comment and even more so ...

> +   The current users of this API are in the handlers for PUSH, POP
> +   instructions.  These instructions affect the stack pointer implicitly:  the
> +   operand size (16, 32, or 64 bits) determines the amount by which the stack
> +   pointer is decremented (2, 4 or 8).  When '66H' prefix is present, the
> +   instruction has a 16-bit operand.  */
> +
> +static bool
> +ginsn_prefix_66H_p (i386_insn insn)

... the function name would better not allude to just the legacy
encoding. Maybe ginsn_opsize_prefix_p()?

> +{
> +  return (insn.prefix[DATA_PREFIX] == 0x66);
> +}

I take it that all ginsn / scfi interaction is late enough in the
process for this array element already having been reliably set?

> +/* Get the DWARF register number for the given register entry.
> +   For specific byte/word register accesses like al, cl, ah, ch, r8dyte etc.,

What's "r8dyte"? I expect it's a typo, but I can't derive what was
intended to be written.

> +   there is no valid DWARF register number.  This function is a hack - it
> +   relies on relative ordering of reg entries in the i386_regtab.  FIXME - it
> +   will be good to allow a more direct way to get this information.  */

Saying it's a hack is a helpful sign, but it still would be useful to
also briefly describe what the intentions here are. It's hard to spot
whether the code is correct without knowing what's intended (i.e. how
8- and 16-bit registers, or non-GPRs, are meant to be treated).

> +static unsigned int
> +ginsn_dw2_regnum (const reg_entry *ireg)
> +{
> +  /* PS: Note the data type here as int32_t, because of Dw2Inval (-1).  */
> +  int32_t dwarf_reg = Dw2Inval;
> +  const reg_entry *temp = ireg;
> +
> +  /* ginsn creation is available for AMD64 abi only ATM.  Other flag_code
> +     are not expected.  */
> +  gas_assert (flag_code == CODE_64BIT);

With this assertion it is kind of odd to see a further use of flag_code
below.

> +  if (ireg <= &i386_regtab[3])
> +    /* For al, cl, dl, bl, bump over to axl, cxl, dxl, bxl respectively by
> +       adding 8.  */
> +    temp = ireg + 8;
> +  else if (ireg <= &i386_regtab[7])
> +    /* For ah, ch, dh, bh, bump over to axl, cxl, dxl, bxl respectively by
> +       adding 4.  */
> +    temp = ireg + 4;

Assuming this is a frequently executed path, why do these not live ...

> +  dwarf_reg = temp->dw2_regnum[flag_code >> 1];
> +  if (dwarf_reg == Dw2Inval)
> +    {

... here, thus at least not affecting the code path taken for 64-bit GPRs.

> +      /* The code relies on the relative ordering of the reg entries in
> +	 i386_regtab.  The assertion here ensures the code does not recurse
> +	 indefinitely.  */
> +      gas_assert (temp + 16 < &i386_regtab[i386_regtab_size - 1]);

Afaict this is (still) undefined behavior. You may not add to a pointer
without knowing whether the result still points into or exactly past
the underlying array.
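
One way to avoid forming the out-of-range pointer, sketched with a
hypothetical helper over a plain int array (not i386_regtab): compare
element counts instead of adding to the pointer first.  Note that this
sketch accepts a step landing on the last element, whereas the quoted
assertion is stricter by one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if ELEM + STEP still designates an element of
   TAB[0 .. N-1].  ELEM must point into TAB.  Only the in-bounds
   difference ELEM - TAB is computed; ELEM + STEP is never formed.  */
static bool
toy_can_advance (const int *tab, size_t n, const int *elem, size_t step)
{
  assert (elem >= tab && elem < tab + n);
  size_t idx = (size_t) (elem - tab);
  return step < n - idx;
}
```

The caller would perform the addition only after this check succeeds.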

> +      temp = temp + 16;

Also - where's the 16 coming from? Was this not updated when rebasing over
APX?

> +      dwarf_reg = ginsn_dw2_regnum (temp);
> +    }
> +
> +  gas_assert (dwarf_reg != Dw2Inval); /* Needs to be addressed.  */

Without actually addressing this (and possible similar cases elsewhere), I
don't think this can go in as other than experimental code (which the
NEWS entry then should state, and where there then should be a plan for an
easy approach of probing gas for no-longer-experimental SCFI support).

> +  return (unsigned int) dwarf_reg;
> +}
> +
> +static ginsnS *
> +x86_ginsn_addsub_reg_mem (const symbolS *insn_end_sym)
> +{
> +  unsigned int dw2_regnum;
> +  unsigned int src2_dw2_regnum;
> +  ginsnS *ginsn = NULL;
> +  ginsnS * (*ginsn_func) (const symbolS *, bool,
> +			  enum ginsn_src_type, unsigned int, offsetT,
> +			  enum ginsn_src_type, unsigned int, offsetT,
> +			  enum ginsn_dst_type, unsigned int, offsetT);
> +  uint16_t opcode = i.tm.base_opcode;
> +
> +  gas_assert (opcode == 0x1 || opcode == 0x29);
> +  ginsn_func = (opcode == 0x1) ? ginsn_new_add : ginsn_new_sub;

As mentioned before - checking opcode without also checking opcode
space is pretty meaningless.

> +  /* op %reg, symbol.  */
> +  if (i.mem_operands == 1 && !i.base_reg && !i.index_reg)
> +    return ginsn;

Why does this need special treatment, and why is returning NULL here
okay?

> +  /* op reg, reg/mem.  */
> +  dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
> +  if (i.reg_operands == 2)
> +    {
> +      src2_dw2_regnum = ginsn_dw2_regnum (i.op[1].regs);
> +      ginsn = ginsn_func (insn_end_sym, true,
> +			  GINSN_SRC_REG, dw2_regnum, 0,
> +			  GINSN_SRC_REG, src2_dw2_regnum, 0,
> +			  GINSN_DST_REG, src2_dw2_regnum, 0);
> +      ginsn_set_where (ginsn);
> +    }
> +  /* Other cases where destination involves indirect access are unnecessary
> +     for SCFI correctness.  TBD_GINSN_GEN_NOT_SCFI.  */
> +
> +  return ginsn;
> +}
> +
> +static ginsnS *
> +x86_ginsn_addsub_mem_reg (const symbolS *insn_end_sym)
> +{
> +  unsigned int dw2_regnum;
> +  unsigned int src2_dw2_regnum;
> +  const reg_entry *mem_reg;
> +  int32_t gdisp = 0;
> +  ginsnS *ginsn = NULL;
> +  ginsnS * (*ginsn_func) (const symbolS *, bool,
> +			  enum ginsn_src_type, unsigned int, offsetT,
> +			  enum ginsn_src_type, unsigned int, offsetT,
> +			  enum ginsn_dst_type, unsigned int, offsetT);
> +  uint16_t opcode = i.tm.base_opcode;
> +
> +  gas_assert (opcode == 0x3 || opcode == 0x2b);
> +  ginsn_func = (opcode == 0x3) ? ginsn_new_add : ginsn_new_sub;
> +
> +  /* op symbol, %reg.  */
> +  if (i.mem_operands && !i.base_reg && !i.index_reg)
> +    return ginsn;
> +  /* op mem, %reg.  */

/* op reg/mem, reg.  */ you mean? Which then raises the question ...

> +  dw2_regnum = ginsn_dw2_regnum (i.op[1].regs);
> +
> +  if (i.mem_operands)
> +    {
> +      mem_reg = (i.base_reg) ? i.base_reg : i.index_reg;
> +      src2_dw2_regnum = ginsn_dw2_regnum (mem_reg);
> +      if (i.disp_operands == 1)
> +	gdisp = i.op[0].disps->X_add_number;
> +      ginsn = ginsn_func (insn_end_sym, true,
> +			  GINSN_SRC_INDIRECT, src2_dw2_regnum, gdisp,
> +			  GINSN_SRC_REG, dw2_regnum, 0,
> +			  GINSN_DST_REG, dw2_regnum, 0);
> +      ginsn_set_where (ginsn);
> +    }
> +
> +  return ginsn;
> +}

... why a register source isn't dealt with here.

> +static ginsnS *
> +x86_ginsn_alu_imm (const symbolS *insn_end_sym)
> +{
> +  offsetT src_imm;
> +  unsigned int dw2_regnum;
> +  ginsnS *ginsn = NULL;
> +  enum ginsn_src_type src_type = GINSN_SRC_REG;
> +  enum ginsn_dst_type dst_type = GINSN_DST_REG;
> +
> +  ginsnS * (*ginsn_func) (const symbolS *, bool,
> +			  enum ginsn_src_type, unsigned int, offsetT,
> +			  enum ginsn_src_type, unsigned int, offsetT,
> +			  enum ginsn_dst_type, unsigned int, offsetT);
> +
> +  /* FIXME - create ginsn where dest is REG_SP / REG_FP only ? */
> +  /* Map for insn.tm.extension_opcode
> +     000 ADD    100 AND
> +     001 OR     101 SUB
> +     010 ADC    110 XOR
> +     011 SBB    111 CMP  */
> +
> +  /* add/sub/and imm, %reg only at this time for SCFI.
> +     Although all three (and, or , xor) make the destination reg untraceable,

Why would this also be done for CMP? And neither ADC nor SBB are
mentioned at all in ...

> +     and op is handled but not or/xor because we will look into supporting
> +     the DRAP pattern at some point.  */

... this entire comment, justifying the choice made.

> +  if (i.tm.extension_opcode == 5)
> +    ginsn_func = ginsn_new_sub;
> +  else if (i.tm.extension_opcode == 4)
> +    ginsn_func = ginsn_new_and;
> +  else if (i.tm.extension_opcode == 0)
> +    ginsn_func = ginsn_new_add;
> +  else
> +    return ginsn;
> +
> +  /* TBD_GINSN_REPRESENTATION_LIMIT: There is no representation for when a
> +     symbol is used as an operand, like so:
> +	  addq    $simd_cmp_op+8, %rdx
> +     Skip generating any ginsn for this.  */
> +  if (i.imm_operands == 1
> +      && i.op[0].imms->X_op != O_constant)
> +    return ginsn;
> +
> +  /* addq    $1, symbol
> +     addq    $1, -16(%rbp)
> +     Such instructions are not of interest for SCFI.  */
> +  if (i.mem_operands == 1)
> +    return ginsn;

Perhaps not just here: TBD_GINSN_GEN_NOT_SCFI?

> +  gas_assert (i.imm_operands == 1);
> +  src_imm = i.op[0].imms->X_add_number;
> +  /* The second operand may be a register or indirect access.  For SCFI, only
> +     the case when the second opnd is a register is interesting.  Revisit this
> +     if generating ginsns for a different gen mode TBD_GINSN_GEN_NOT_SCFI. */
> +  if (i.reg_operands == 1)
> +    {
> +      dw2_regnum = ginsn_dw2_regnum (i.op[1].regs);
> +      /* For ginsn, keep the imm as second src operand.  */
> +      ginsn = ginsn_func (insn_end_sym, true,
> +			  src_type, dw2_regnum, 0,
> +			  GINSN_SRC_IMM, 0, src_imm,
> +			  dst_type, dw2_regnum, 0);
> +
> +      ginsn_set_where (ginsn);
> +    }
> +
> +  return ginsn;
> +}
> +
> +static ginsnS *
> +x86_ginsn_move (const symbolS *insn_end_sym)
> +{
> +  ginsnS *ginsn;
> +  unsigned int dst_reg;
> +  unsigned int src_reg;
> +  offsetT src_disp = 0;
> +  offsetT dst_disp = 0;
> +  const reg_entry *dst = NULL;
> +  const reg_entry *src = NULL;
> +  uint16_t opcode = i.tm.base_opcode;
> +  enum ginsn_src_type src_type = GINSN_SRC_REG;
> +  enum ginsn_dst_type dst_type = GINSN_DST_REG;
> +
> +  if (opcode == 0x8b || opcode == 0x8a)

Above when handling ALU insns you didn't care about byte ops - why do
you do so here (by checking for 0x8a, and 0x88 below)?

> +    {
> +      /* mov  disp(%reg), %reg.  */
> +      if (i.mem_operands && i.base_reg)
> +	{
> +	  src = i.base_reg;
> +	  if (i.disp_operands == 1)
> +	    src_disp = i.op[0].disps->X_add_number;
> +	  src_type = GINSN_SRC_INDIRECT;
> +	}
> +      else
> +	src = i.op[0].regs;

Even when there's no base, the source isn't necessarily a register.
And in such a case using i.op[0].regs isn't valid.

> +      dst = i.op[1].regs;
> +    }
> +  else if (opcode == 0x89 || opcode == 0x88)
> +    {
> +      /* mov %reg, disp(%reg).  */
> +      src = i.op[0].regs;
> +      if (i.mem_operands && i.base_reg)
> +	{
> +	  dst = i.base_reg;
> +	  if (i.disp_operands == 1)
> +	    dst_disp = i.op[1].disps->X_add_number;
> +	  dst_type = GINSN_DST_INDIRECT;
> +	}
> +      else
> +	dst = i.op[1].regs;

Similarly here then.

> +    }
> +
> +  src_reg = ginsn_dw2_regnum (src);
> +  dst_reg = ginsn_dw2_regnum (dst);
> +
> +  ginsn = ginsn_new_mov (insn_end_sym, true,
> +			 src_type, src_reg, src_disp,
> +			 dst_type, dst_reg, dst_disp);
> +  ginsn_set_where (ginsn);
> +
> +  return ginsn;
> +}
> +
> +/* Generate appropriate ginsn for lea.
> +   Sub-cases marked with TBD_GINSN_INFO_LOSS indicate some loss of information
> +   in the ginsn.  But these are fine for now for GINSN_GEN_SCFI generation
> +   mode.  */
> +
> +static ginsnS *
> +x86_ginsn_lea (const symbolS *insn_end_sym)
> +{
> +  offsetT src_disp = 0;
> +  ginsnS *ginsn = NULL;
> +  unsigned int base_reg;
> +  unsigned int index_reg;
> +  offsetT index_scale;
> +  unsigned int dst_reg;
> +
> +  if (!i.index_reg && !i.base_reg)
> +    {
> +      /* lea symbol, %rN.  */
> +      dst_reg = ginsn_dw2_regnum (i.op[1].regs);
> +      /* TBD_GINSN_INFO_LOSS - Skip encoding information about the symbol.  */
> +      ginsn = ginsn_new_mov (insn_end_sym, false,
> +			     GINSN_SRC_IMM, 0xf /* arbitrary const.  */, 0,
> +			     GINSN_DST_REG, dst_reg, 0);
> +    }
> +  else if (i.base_reg && !i.index_reg)
> +    {
> +      /* lea    -0x2(%base),%dst.  */
> +      base_reg = ginsn_dw2_regnum (i.base_reg);

What if base is %eip? Aiui ginsn_dw2_regnum() will hit an assertion
then.

And what about e.g. "lea symbol(%rbx), %rbp"? The ...

> +      dst_reg = ginsn_dw2_regnum (i.op[1].regs);
> +
> +      if (i.disp_operands)
> +	src_disp = i.op[0].disps->X_add_number;

... constant retrieved here won't properly represent the displacement
then.

> +      if (src_disp)
> +	/* Generate an ADD ginsn.  */
> +	ginsn = ginsn_new_add (insn_end_sym, true,
> +			       GINSN_SRC_REG, base_reg, 0,
> +			       GINSN_SRC_IMM, 0, src_disp,
> +			       GINSN_DST_REG, dst_reg, 0);
> +      else
> +	/* Generate a MOV ginsn.  */
> +	ginsn = ginsn_new_mov (insn_end_sym, true,
> +			       GINSN_SRC_REG, base_reg, 0,
> +			       GINSN_DST_REG, dst_reg, 0);
> +    }
> +  else if (!i.base_reg && i.index_reg)
> +    {
> +      /* lea (,%index,imm), %dst.  */
> +      /* TBD_GINSN_INFO_LOSS - There is no explicit ginsn multiply operation,
> +	 instead use GINSN_TYPE_OTHER.  */

You're also losing the displacement here.

> +      index_scale = i.log2_scale_factor;
> +      index_reg = ginsn_dw2_regnum (i.index_reg);
> +      dst_reg = ginsn_dw2_regnum (i.op[1].regs);
> +      ginsn = ginsn_new_other (insn_end_sym, true,
> +			       GINSN_SRC_REG, index_reg,
> +			       GINSN_SRC_IMM, index_scale,
> +			       GINSN_DST_REG, dst_reg);

Wouldn't it make sense to represent a scale factor of 1 correctly
here (i.e. not as "other", but rather similar to the base-without-
index case above)?

> +    }
> +  else
> +    {
> +      /* lea disp(%base,%index,imm) %dst.  */
> +      /* TBD_GINSN_INFO_LOSS - Skip adding information about the disp and imm
> +	 for index reg. */
> +      base_reg = ginsn_dw2_regnum (i.base_reg);
> +      index_reg = ginsn_dw2_regnum (i.index_reg);
> +      dst_reg = ginsn_dw2_regnum (i.op[1].regs);
> +      /* Generate an ADD ginsn.  */
> +      ginsn = ginsn_new_add (insn_end_sym, true,
> +			     GINSN_SRC_REG, base_reg, 0,
> +			     GINSN_SRC_REG, index_reg, 0,
> +			     GINSN_DST_REG, dst_reg, 0);

Seeing the use of "other" above, why is this (wrongly) represented
as "add"?

> +    }
> +
> +  ginsn_set_where (ginsn);
> +
> +  return ginsn;
> +}

Throughout this function (and perhaps others as well), how come you
don't consider operand size at all? It matters whether results are
64-bit, 32-bit, or 16-bit, after all.

> +static ginsnS *
> +x86_ginsn_jump (const symbolS *insn_end_sym)
> +{
> +  ginsnS *ginsn = NULL;
> +  symbolS *src_symbol;

Here and elsewhere - please use const whenever possible. (I think I said
so already on an earlier version.)

> +  gas_assert (i.disp_operands == 1);
> +
> +  /* A non-zero addend in jump target makes control-flow tracking difficult.
> +     Skip SCFI for now.  */
> +  if (i.op[0].disps->X_op == O_symbol && i.op[0].disps->X_add_number)
> +    {
> +      as_bad ("SCFI: jmp insn with non-zero addend to sym not supported");
> +      return ginsn;
> +    }
> +
> +  if (i.op[0].disps->X_op == O_symbol)

Why the redundant evaluation of ->X_op? In fact, if you moved the
earlier if() ...

> +    {

... here, this ...

> +      gas_assert (!i.op[0].disps->X_add_number);

... assertion would become entirely redundant.

> +      src_symbol = i.op[0].disps->X_add_symbol;
> +      ginsn = ginsn_new_jump (insn_end_sym, true,
> +			      GINSN_SRC_SYMBOL, 0, src_symbol);
> +
> +      ginsn_set_where (ginsn);
> +    }
> +
> +  return ginsn;
> +}
> +
> +static ginsnS *
> +x86_ginsn_jump_cond (const symbolS *insn_end_sym)
> +{
> +  ginsnS *ginsn = NULL;
> +  symbolS *src_symbol;
> +
> +  gas_assert (i.disp_operands == 1);
> +
> +  /* A non-zero addend in JCC target makes control-flow tracking difficult.
> +     Skip SCFI for now.  */
> +  if (i.op[0].disps->X_op == O_symbol && i.op[0].disps->X_add_number)
> +    {
> +      as_bad ("SCFI: jcc insn with non-zero addend to sym not supported");
> +      return ginsn;
> +    }
> +
> +  if (i.op[0].disps->X_op == O_symbol)
> +    {
> +      gas_assert (i.op[0].disps->X_add_number == 0);
> +      src_symbol = i.op[0].disps->X_add_symbol;
> +      ginsn = ginsn_new_jump_cond (insn_end_sym, true,
> +				   GINSN_SRC_SYMBOL, 0, src_symbol);
> +      ginsn_set_where (ginsn);
> +    }
> +
> +  return ginsn;
> +}

This looks almost identical to x86_ginsn_jump() - can't the two be
folded?
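
A rough sketch of what a folded helper might look like, using toy
stand-ins for expressionS and ginsnS (every toy_* name is hypothetical).
It also nests the addend check inside the symbol check, which makes the
assertion questioned above unnecessary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_op { TOY_O_CONSTANT, TOY_O_SYMBOL };

struct toy_expr
{
  enum toy_op op;
  const char *sym;	/* Target symbol when OP is TOY_O_SYMBOL.  */
  long addend;
};

enum toy_jump_kind { TOY_JUMP, TOY_JUMP_COND };

struct toy_ginsn
{
  enum toy_jump_kind kind;
  const char *target;
};

/* Fill OUT for a jump to DISP and return true.  Return false when no
   ginsn is generated; *UNSUPPORTED is then true if the reason is a
   non-zero addend (where the quoted code reports an error).  */
static bool
toy_ginsn_jump (const struct toy_expr *disp, bool cond_p,
		struct toy_ginsn *out, bool *unsupported)
{
  assert (disp != NULL && out != NULL && unsupported != NULL);
  *unsupported = false;

  if (disp->op != TOY_O_SYMBOL)
    return false;

  if (disp->addend != 0)
    {
      *unsupported = true;
      return false;
    }

  out->kind = cond_p ? TOY_JUMP_COND : TOY_JUMP;
  out->target = disp->sym;
  return true;
}
```

The only difference between the two callers is then the cond_p argument.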

> +static ginsnS *
> +x86_ginsn_enter (const symbolS *insn_end_sym)
> +{
> +  ginsnS *ginsn = NULL;
> +  ginsnS *ginsn_next = NULL;
> +  ginsnS *ginsn_last = NULL;
> +
> +  gas_assert (i.imm_operands == 2);
> +
> +  /* For non-zero size operands, bail out as untraceable for SCFI.  */
> +  if ((i.op[0].imms->X_op != O_constant || i.op[0].imms->X_add_symbol != 0)
> +      || (i.op[1].imms->X_op != O_constant || i.op[1].imms->X_add_symbol != 0))

While the comment makes sufficiently clear what's meant, the use of (inner)
parentheses here is still confusing as to whether indeed the || are meant.

> +    {
> +      as_bad ("SCFI: enter insn with non-zero operand not supported");
> +      return ginsn;
> +    }
> +
> +  /* If the nesting level is 0, the processor pushes the frame pointer from
> +     the BP/EBP/RBP register onto the stack, copies the current stack
> +     pointer from the SP/ESP/RSP register into the BP/EBP/RBP register, and
> +     loads the SP/ESP/RSP register with the current stack-pointer value
> +     minus the value in the size operand.  */
> +  ginsn = ginsn_new_sub (insn_end_sym, false,
> +			 GINSN_SRC_REG, REG_SP, 0,
> +			 GINSN_SRC_IMM, 0, 8,

I guess 8 is the operand size and you simply hope no-one's going to use
a 16-bit ENTER?
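
For illustration, the adjustment could be derived from the operand size
rather than hard-coded.  A minimal sketch with hypothetical helpers,
assuming 64-bit mode, where the default operand size for ENTER is 64
bits and an operand size override selects 16 bits:

```c
#include <assert.h>
#include <stdbool.h>

/* Operand size of ENTER in 64-bit mode, in bits.  */
static unsigned int
toy_enter_opsize_bits (bool opsize_override_p)
{
  return opsize_override_p ? 16 : 64;
}

/* Bytes by which the stack pointer moves for one implicit push or pop
   of the given operand size.  */
static long
toy_stack_slot_bytes (unsigned int opsize_bits)
{
  assert (opsize_bits == 16 || opsize_bits == 32 || opsize_bits == 64);
  return opsize_bits / 8;
}
```

The SUB ginsn would then use the computed slot size in place of the
literal 8.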

> +			 GINSN_DST_REG, REG_SP, 0);
> +  ginsn_set_where (ginsn);
> +  ginsn_next = ginsn_new_store (insn_end_sym, false,
> +				GINSN_SRC_REG, REG_FP,
> +				GINSN_DST_INDIRECT, REG_SP, 0);
> +  ginsn_set_where (ginsn_next);
> +  gas_assert (!ginsn_link_next (ginsn, ginsn_next));
> +  ginsn_last = ginsn_new_mov (insn_end_sym, false,
> +			      GINSN_SRC_REG, REG_SP, 0,
> +			      GINSN_DST_REG, REG_FP, 0);
> +  ginsn_set_where (ginsn_last);
> +  gas_assert (!ginsn_link_next (ginsn_next, ginsn_last));
> +
> +  return ginsn;
> +}
> +
> +static bool
> +x86_ginsn_safe_to_skip (void)
> +{
> +  bool skip_p = false;
> +  uint16_t opcode = i.tm.base_opcode;
> +
> +  switch (opcode)
> +    {
> +    case 0x39:

This isn't the only CMP encoding, and ...

> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      /* cmp reg, reg.  */
> +      skip_p = true;
> +      break;
> +    case 0x85:

... this isn't the only TEST one.

> +      /* test reg, reg/mem.  */
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      skip_p = true;
> +      break;
> +    default:
> +      break;
> +    }
> +
> +  return skip_p;
> +}
> +
> +#define X86_GINSN_UNHANDLED_NONE      0
> +#define X86_GINSN_UNHANDLED_DEST_REG  1
> +#define X86_GINSN_UNHANDLED_CFG       2
> +#define X86_GINSN_UNHANDLED_STACKOP   3
> +
> +/* Check the input insn for its impact on the correctness of the synthesized
> +   CFI.  Returns an error code to the caller.  */
> +
> +static int
> +x86_ginsn_unhandled (void)
> +{
> +  int err = X86_GINSN_UNHANDLED_NONE;
> +  const reg_entry *reg_op;
> +  unsigned int dw2_regnum;
> +
> +  /* Keep an eye out for instructions affecting control flow.  */
> +  if (i.tm.opcode_modifier.jump)
> +    err = X86_GINSN_UNHANDLED_CFG;
> +  /* Also, for any instructions involving an implicit update to the stack
> +     pointer.  */
> +  else if (i.tm.opcode_modifier.implicitstackop)
> +    err = X86_GINSN_UNHANDLED_STACKOP;
> +  /* Finally, also check if the missed instructions are affecting REG_SP or
> +     REG_FP.  The destination operand is the last at all stages of assembly
> +     (due to following AT&T syntax layout in the internal representation).  In
> +     case of Intel syntax input, this still remains true as swap_operands ()
> +     is done by now.
> +     PS: These checks do not involve index / base reg, as indirect memory
> +     accesses via REG_SP or REG_FP do not affect SCFI correctness.
> +     (Also note these instructions are candidates for other ginsn generation
> +     modes in future.  TBD_GINSN_GEN_NOT_SCFI.)  */
> +  else if (i.operands && i.reg_operands
> +	   && !(i.flags[i.operands - 1] & Operand_Mem))
> +    {
> +      reg_op = i.op[i.operands - 1].regs;
> +      if (reg_op)
> +	{
> +	  dw2_regnum = ginsn_dw2_regnum (reg_op);
> +	  if (dw2_regnum == REG_SP || dw2_regnum == REG_FP)
> +	    err = X86_GINSN_UNHANDLED_DEST_REG;
> +	}

else
  err = X86_GINSN_UNHANDLED_CONFUSED;

? You can't let this case go silently. An alternative would be to
assert instead of using if().

> +    }
> +
> +  return err;
> +}
> +
> +/* Generate one or more generic GAS instructions, a.k.a, ginsns for the current
> +   machine instruction.
> +
> +   Returns the head of linked list of ginsn(s) added, if success; Returns NULL
> +   if failure.
> +
> +   The input ginsn_gen_mode GMODE determines the set of minimal necessary
> +   ginsns necessary for correctness of any passes applicable for that mode.
> +   For supporting the GINSN_GEN_SCFI generation mode, following is the list of
> +   machine instructions that must be translated into the corresponding ginsns
> +   to ensure correctness of SCFI:
> +     - All instructions affecting the two registers that could potentially
> +       be used as the base register for CFA tracking.  For SCFI, the base
> +       register for CFA tracking is limited to REG_SP and REG_FP only for
> +       now.
> +     - All change of flow instructions: conditional and unconditional branches,
> +       call and return from functions.
> +     - All instructions that can potentially be a register save / restore
> +       operation.

This could do with being more fine grained, as "potentially" is pretty vague,
and (as per earlier version review comments) my take on this is a much wider
set than yours.

> +     - All instructions that perform stack manipulation implicitly: the CALL,
> +       RET, PUSH, POP, ENTER, and LEAVE instructions.
> +
> +   The function currently supports GINSN_GEN_SCFI ginsn generation mode only.
> +   To support other generation modes will require work on this target-specific
> +   process of creation of ginsns:
> +     - Some of such places are tagged with TBD_GINSN_GEN_NOT_SCFI to serve as
> +       possible starting points.

Oh, I see you're not meaning to have this annotation consistently. That's a
little sad ...

> +     - Also note that ginsn representation may need enhancements.  Specifically,
> +       note some TBD_GINSN_INFO_LOSS and TBD_GINSN_REPRESENTATION_LIMIT markers.
> +   */
> +
> +static ginsnS *
> +x86_ginsn_new (const symbolS *insn_end_sym, enum ginsn_gen_mode gmode)
> +{
> +  int err = 0;
> +  uint16_t opcode;
> +  unsigned int dw2_regnum;
> +  ginsnS *ginsn = NULL;
> +  ginsnS *ginsn_next = NULL;
> +  ginsnS *ginsn_last = NULL;
> +  /* In 64-bit mode, the default stack update size is 8 bytes.  */
> +  int stack_opnd_size = 8;
> +
> +  /* Currently supports generation of selected ginsns, sufficient for
> +     the use-case of SCFI only.  */
> +  if (gmode != GINSN_GEN_SCFI)
> +    return ginsn;
> +
> +  opcode = i.tm.base_opcode;
> +
> +  switch (opcode)
> +    {
> +    case 0x1:
> +      /* add reg, reg/mem.  */
> +    case 0x29:
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;

What about the new APX NDD encodings in EVEX map 4?

> +      /* sub reg, reg/mem.  */

Please be careful with placing such comments when there are multiple
case labels (or fall-through). I think these would better go on the
same lines as the case labels themselves.

> +      ginsn = x86_ginsn_addsub_reg_mem (insn_end_sym);
> +      break;
> +
> +    case 0x3:
> +      /* add reg/mem, reg.  */
> +    case 0x2b:
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      /* sub reg/mem, reg.  */
> +      ginsn = x86_ginsn_addsub_mem_reg (insn_end_sym);
> +      break;
> +
> +    case 0xa0:
> +    case 0xa8:
> +      /* push fs / push gs have opcode_space == SPACE_0F.  */
> +      if (i.tm.opcode_space != SPACE_0F)
> +	break;
> +      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
> +	stack_opnd_size = 2;

But only if there's not also REX.W / REX2.W.
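
A minimal sketch of the precedence rule, assuming the caller already knows whether a 66h prefix and REX.W (or REX2.W) are present. The helper and its two parameters are hypothetical, not existing gas code.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of gas: size in bytes of the stack slot
   used by PUSH / POP in 64-bit mode.  HAS_66H says whether an operand size
   prefix is present, REX_W whether REX.W (or REX2.W) is set.  */
static int
stack_opnd_size_64 (bool has_66h, bool rex_w)
{
  /* REX.W takes precedence over 66h, so the operation stays 64-bit.  */
  if (rex_w)
    return 8;
  /* 66h on its own selects a 16-bit operation.  */
  if (has_66h)
    return 2;
  /* Default in 64-bit mode.  */
  return 8;
}
```

With such a helper, each PUSH / POP case would only need a single call instead of repeating the flag_code / prefix check.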

> +      /* push fs / push gs.  */
> +      ginsn = ginsn_new_sub (insn_end_sym, false,
> +			     GINSN_SRC_REG, REG_SP, 0,
> +			     GINSN_SRC_IMM, 0, stack_opnd_size,
> +			     GINSN_DST_REG, REG_SP, 0);
> +      ginsn_set_where (ginsn);
> +      ginsn_next = ginsn_new_store (insn_end_sym, false,
> +				    GINSN_SRC_REG, dw2_regnum,
> +				    GINSN_DST_INDIRECT, REG_SP, 0);
> +      ginsn_set_where (ginsn_next);
> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
> +      break;
> +
> +    case 0xa1:
> +    case 0xa9:
> +      /* pop fs / pop gs have opcode_space == SPACE_0F.  */
> +      if (i.tm.opcode_space != SPACE_0F)
> +	break;
> +      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
> +	stack_opnd_size = 2;
> +      /* pop fs / pop gs.  */
> +      ginsn = ginsn_new_load (insn_end_sym, false,
> +			      GINSN_SRC_INDIRECT, REG_SP, 0,
> +			      GINSN_DST_REG, dw2_regnum);
> +      ginsn_set_where (ginsn);
> +      ginsn_next = ginsn_new_add (insn_end_sym, false,
> +				  GINSN_SRC_REG, REG_SP, 0,
> +				  GINSN_SRC_IMM, 0, stack_opnd_size,
> +				  GINSN_DST_REG, REG_SP, 0);
> +      ginsn_set_where (ginsn_next);
> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
> +      break;
> +
> +    case 0x50 ... 0x57:
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      /* push reg.  */
> +      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
> +	stack_opnd_size = 2;
> +      ginsn = ginsn_new_sub (insn_end_sym, false,
> +			     GINSN_SRC_REG, REG_SP, 0,
> +			     GINSN_SRC_IMM, 0, stack_opnd_size,
> +			     GINSN_DST_REG, REG_SP, 0);
> +      ginsn_set_where (ginsn);
> +      ginsn_next = ginsn_new_store (insn_end_sym, false,
> +				    GINSN_SRC_REG, dw2_regnum,
> +				    GINSN_DST_INDIRECT, REG_SP, 0);
> +      ginsn_set_where (ginsn_next);
> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
> +      break;
> +
> +    case 0x58 ... 0x5f:
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      /* pop reg.  */
> +      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
> +      ginsn = ginsn_new_load (insn_end_sym, false,
> +			      GINSN_SRC_INDIRECT, REG_SP, 0,
> +			      GINSN_DST_REG, dw2_regnum);
> +      ginsn_set_where (ginsn);
> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
> +	stack_opnd_size = 2;
> +      ginsn_next = ginsn_new_add (insn_end_sym, false,
> +				  GINSN_SRC_REG, REG_SP, 0,
> +				  GINSN_SRC_IMM, 0, stack_opnd_size,
> +				  GINSN_DST_REG, REG_SP, 0);
> +      ginsn_set_where (ginsn_next);
> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
> +      break;
> +
> +    case 0x6a:
> +    case 0x68:
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      /* push imm8/imm16/imm32.  */
> +      if (opcode == 0x6a)
> +	stack_opnd_size = 1;
> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.
> +	 However, this prefix may only be present when opcode is 0x68.  */

Why would this be limited to opcode 0x68?

> +      else if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
> +	stack_opnd_size = 2;
> +      else
> +	stack_opnd_size = 4;

In 64-bit mode stack operations are never 32-bit.
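
To make the distinction explicit: the width of the encoded immediate and the amount by which the stack pointer moves are separate quantities. The sketch below keeps them apart; the struct, the function and the OPSIZE16 parameter (meaning the effective operand size is 16 bits) are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types and helper, not part of gas.  */
struct push_imm_sizes
{
  int imm_bytes;   /* Width of the immediate as encoded.  */
  int stack_bytes; /* Amount by which the stack pointer is decremented.  */
};

/* Sizes for PUSH imm (opcodes 0x6a and 0x68) in 64-bit mode.  */
static struct push_imm_sizes
push_imm_sizes_64 (uint8_t opcode, bool opsize16)
{
  struct push_imm_sizes s;

  /* The stack slot is 8 bytes, or 2 with a 16-bit operand size.  It is
     never 1 or 4 bytes, whatever the width of the immediate.  */
  s.stack_bytes = opsize16 ? 2 : 8;

  if (opcode == 0x6a)
    /* imm8, sign-extended to the operand size before being pushed.  */
    s.imm_bytes = 1;
  else
    /* 0x68: imm16 with 16-bit operand size, else imm32 (sign-extended).  */
    s.imm_bytes = opsize16 ? 2 : 4;

  return s;
}
```

Only stack_bytes matters for the SUB ginsn on REG_SP; imm_bytes is shown merely to contrast with it.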

> +      /* Skip getting the value of imm from machine instruction
> +	 because this is not important for SCFI.  */
> +      ginsn = ginsn_new_sub (insn_end_sym, false,
> +			     GINSN_SRC_REG, REG_SP, 0,
> +			     GINSN_SRC_IMM, 0, stack_opnd_size,
> +			     GINSN_DST_REG, REG_SP, 0);
> +      ginsn_set_where (ginsn);
> +      ginsn_next = ginsn_new_store (insn_end_sym, false,
> +				    GINSN_SRC_IMM, 0,
> +				    GINSN_DST_INDIRECT, REG_SP, 0);
> +      ginsn_set_where (ginsn_next);
> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
> +      break;
> +
> +    case 0x70 ... 0x7f:
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      ginsn = x86_ginsn_jump_cond (insn_end_sym);
> +      break;

I think this wants a comment briefly explaining why SPACE_0F opcodes
0x8[0-f] don't need handling explicitly. Same for JMP (0xeb) below.

> +    case 0x80:
> +    case 0x81:
> +    case 0x83:
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      ginsn = x86_ginsn_alu_imm (insn_end_sym);
> +      break;
> +
> +    case 0x8a:
> +    case 0x8b:
> +      /* Move reg/mem, reg (8/16/32/64).  */
> +    case 0x88:
> +    case 0x89:
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      /* mov reg, reg/mem. (8/16/32/64).  */
> +      ginsn = x86_ginsn_move (insn_end_sym);
> +      break;
> +
> +    case 0x8d:
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      /* lea disp(%src), %dst */

disp(%src) doesn't really represent the full set of possibilities.
Why not use "mem" as you do elsewhere?

> +      ginsn = x86_ginsn_lea (insn_end_sym);
> +      break;
> +
> +    case 0x8f:
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      /* pop to mem.  */

Or to register. While this won't happen today, once there is a means
for the programmer to request that alternative encoding, updating the
code here would surely be missed. Hence this code would better be ready
to deal with the case right away.

> +      gas_assert (i.base_reg);

POP isn't different from other explicit memory accesses: All forms
are allowed, and hence a base register may not be in use.

Both remarks also apply to PUSH further down.

> +      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
> +      ginsn = ginsn_new_load (insn_end_sym, false,
> +			      GINSN_SRC_INDIRECT, REG_SP, 0,
> +			      GINSN_DST_INDIRECT, dw2_regnum);
> +      ginsn_set_where (ginsn);
> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
> +	stack_opnd_size = 2;
> +      ginsn_next = ginsn_new_add (insn_end_sym, false,
> +				  GINSN_SRC_REG, REG_SP, 0,
> +				  GINSN_SRC_IMM, 0, stack_opnd_size,
> +				  GINSN_DST_REG, REG_SP, 0);
> +      ginsn_set_where (ginsn_next);
> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
> +      break;
> +
> +    case 0x9c:
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      /* pushf / pushfq.  */
> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
> +	stack_opnd_size = 2;
> +      ginsn = ginsn_new_sub (insn_end_sym, false,
> +			     GINSN_SRC_REG, REG_SP, 0,
> +			     GINSN_SRC_IMM, 0, stack_opnd_size,
> +			     GINSN_DST_REG, REG_SP, 0);
> +      ginsn_set_where (ginsn);
> +      /* Tracking EFLAGS register by number is not necessary.  */

How does this fit with ...

> +      ginsn_next = ginsn_new_store (insn_end_sym, false,
> +				    GINSN_SRC_IMM, 0,
> +				    GINSN_DST_INDIRECT, REG_SP, 0);
> +      ginsn_set_where (ginsn_next);
> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
> +      break;
> +
> +    case 0x9d:
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      /* popf / popfq.  */
> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
> +	stack_opnd_size = 2;
> +      /* FIXME - hardcode the actual DWARF reg number value.  As for SCFI
> +	 correctness, although this behaves simply a placeholder value; its
> +	 just clearer if the value is correct.  */
> +      dw2_regnum = 49;

... this?

> +      ginsn = ginsn_new_load (insn_end_sym, false,
> +			      GINSN_SRC_INDIRECT, REG_SP, 0,
> +			      GINSN_DST_REG, dw2_regnum);
> +      ginsn_set_where (ginsn);
> +      ginsn_next = ginsn_new_add (insn_end_sym, false,
> +				  GINSN_SRC_REG, REG_SP, 0,
> +				  GINSN_SRC_IMM, 0, stack_opnd_size,
> +				  GINSN_DST_REG, REG_SP, 0);
> +      ginsn_set_where (ginsn_next);
> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
> +      break;
> +
> +    case 0xff:
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      /* push from mem.  */
> +      if (i.tm.extension_opcode == 6)
> +	{
> +	  /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
> +	  if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
> +	    stack_opnd_size = 2;
> +	  ginsn = ginsn_new_sub (insn_end_sym, false,
> +				 GINSN_SRC_REG, REG_SP, 0,
> +				 GINSN_SRC_IMM, 0, stack_opnd_size,
> +				 GINSN_DST_REG, REG_SP, 0);
> +	  ginsn_set_where (ginsn);
> +	  /* These instructions have no imm, only indirect access.  */
> +	  gas_assert (i.base_reg);
> +	  dw2_regnum = ginsn_dw2_regnum (i.base_reg);
> +	  ginsn_next = ginsn_new_store (insn_end_sym, false,
> +					GINSN_SRC_INDIRECT, dw2_regnum,
> +					GINSN_DST_INDIRECT, REG_SP, 0);
> +	  ginsn_set_where (ginsn_next);
> +	  gas_assert (!ginsn_link_next (ginsn, ginsn_next));
> +	}
> +      else if (i.tm.extension_opcode == 4)
> +	{
> +	  /* jmp r/m.  E.g., notrack jmp *%rax.  */
> +	  if (i.reg_operands)
> +	    {
> +	      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
> +				      GINSN_SRC_REG, dw2_regnum, NULL);
> +	      ginsn_set_where (ginsn);
> +	    }
> +	  else if (i.mem_operands && i.index_reg)
> +	    {
> +	      /* jmp    *0x0(,%rax,8).  */
> +	      dw2_regnum = ginsn_dw2_regnum (i.index_reg);
> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
> +				      GINSN_SRC_REG, dw2_regnum, NULL);
> +	      ginsn_set_where (ginsn);

What if both base and index are in use? Like for PUSH/POP, all addressing
forms are permitted here and ...

> +	    }
> +	  else if (i.mem_operands && i.base_reg)
> +	    {
> +	      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
> +				      GINSN_SRC_REG, dw2_regnum, NULL);
> +	      ginsn_set_where (ginsn);
> +	    }
> +	}
> +      else if (i.tm.extension_opcode == 2)
> +	{
> +	  /* 0xFF /2 (call).  */
> +	  if (i.reg_operands)
> +	    {
> +	      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
> +	      ginsn = ginsn_new_call (insn_end_sym, true,
> +				      GINSN_SRC_REG, dw2_regnum, NULL);
> +	      ginsn_set_where (ginsn);
> +	    }
> +	  else if (i.mem_operands && i.base_reg)
> +	    {
> +	      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
> +	      ginsn = ginsn_new_call (insn_end_sym, true,
> +				      GINSN_SRC_REG, dw2_regnum, NULL);
> +	      ginsn_set_where (ginsn);
> +	    }

... here.

> +	}
> +      break;
> +
> +    case 0xc2:
> +    case 0xc3:
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      /* Near ret.  */
> +      ginsn = ginsn_new_return (insn_end_sym, true);
> +      ginsn_set_where (ginsn);
> +      break;

No tracking of the stack pointer adjustment?
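
For reference, a sketch of the adjustment in question, assuming the default 64-bit operand size (i.e. no operand size prefix). The helper is hypothetical and only computes the number of bytes a near RET releases.

```c
#include <stdint.h>

/* Hypothetical helper, not part of gas: number of bytes by which a near
   RET increments the stack pointer in 64-bit mode.  IMM16 is used only
   for opcode 0xc2 (ret imm16).  */
static unsigned int
near_ret_sp_adjust_64 (uint8_t opcode, uint16_t imm16)
{
  /* Popping the return address always releases 8 bytes.  */
  unsigned int adjust = 8;

  /* ret imm16 releases IMM16 further bytes on top of that.  */
  if (opcode == 0xc2)
    adjust += imm16;

  return adjust;
}
```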

> +    case 0xc8:
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      /* enter.  */
> +      ginsn = x86_ginsn_enter (insn_end_sym);
> +      break;
> +
> +    case 0xc9:
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      /* The 'leave' instruction copies the contents of the RBP register
> +	 into the RSP register to release all stack space allocated to the
> +	 procedure.  */
> +      ginsn = ginsn_new_mov (insn_end_sym, false,
> +			     GINSN_SRC_REG, REG_FP, 0,
> +			     GINSN_DST_REG, REG_SP, 0);
> +      ginsn_set_where (ginsn);
> +      /* Then it restores the old value of the RBP register from the stack.  */
> +      ginsn_next = ginsn_new_load (insn_end_sym, false,
> +				   GINSN_SRC_INDIRECT, REG_SP, 0,
> +				   GINSN_DST_REG, REG_FP);
> +      ginsn_set_where (ginsn_next);
> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
> +      ginsn_last = ginsn_new_add (insn_end_sym, false,
> +				  GINSN_SRC_REG, REG_SP, 0,
> +				  GINSN_SRC_IMM, 0, 8,

Same comment as for ENTER wrt operand size.

> +				  GINSN_DST_REG, REG_SP, 0);
> +      ginsn_set_where (ginsn_next);
> +      gas_assert (!ginsn_link_next (ginsn_next, ginsn_last));
> +      break;
> +
> +    case 0xe0 ... 0xe2:
> +      /* loop / loope / loopne.  */
> +    case 0xe3:
> +      /* jecxz / jrcxz.  */
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      ginsn = x86_ginsn_jump_cond (insn_end_sym);
> +      ginsn_set_where (ginsn);
> +      break;
> +
> +    case 0xe8:
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      /* PS: SCFI machinery does not care about which func is being
> +	 called.  OK to skip that info.  */
> +      ginsn = ginsn_new_call (insn_end_sym, true,
> +			      GINSN_SRC_SYMBOL, 0, NULL);
> +      ginsn_set_where (ginsn);
> +      break;

Again - what about the stack pointer adjustment? Or wait - you only
care about the local function's view. Just that this will get you
in trouble for something like

	call	1f
1:
	pop	%rax

CALL with zero displacement really acts as if "push %rip".
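
A toy model of the problem, not related to the actual ginsn data structures: it walks an instruction sequence and tracks the stack pointer offset relative to function entry, with a switch for whether CALL is modelled as pushing the return address. All names here are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical instruction kinds for this toy model.  */
enum insn_kind
{
  INSN_CALL_NEXT, /* call 1f; 1:  (control continues at the next insn).  */
  INSN_POP        /* pop %reg.  */
};

/* Return the stack pointer offset, relative to entry, after executing
   the N instructions in INSNS.  Negative means the stack has grown.
   MODEL_CALL_PUSH selects whether CALL is treated as "push %rip".  */
static long
sp_offset_after (const enum insn_kind *insns, int n, bool model_call_push)
{
  long off = 0;

  for (int k = 0; k < n; k++)
    switch (insns[k])
      {
      case INSN_CALL_NEXT:
	if (model_call_push)
	  off -= 8;
	break;
      case INSN_POP:
	off += 8;
	break;
      }

  return off;
}
```

For the two-instruction sequence above, modelling the push yields a net offset of 0, whereas ignoring it leaves the tracked stack pointer off by 8 after the POP.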

> +    case 0xeb:
> +      /* If opcode_space != SPACE_BASE, this is not a jmp insn.  Skip it
> +	 for GINSN_GEN_SCFI.  */
> +      if (i.tm.opcode_space != SPACE_BASE)
> +	break;
> +      /* Unconditional jmp.  */
> +      ginsn = x86_ginsn_jump (insn_end_sym);
> +      ginsn_set_where (ginsn);
> +      break;
> +
> +    default:
> +      /* TBD_GINSN_GEN_NOT_SCFI: Skip all other opcodes uninteresting for
> +	 GINSN_GEN_SCFI mode.  */
> +      break;
> +    }
> +
> +  if (!ginsn && !x86_ginsn_safe_to_skip ())
> +    {
> +      /* For all unhandled insns that are not whitelisted, check that they do
> +	 not impact SCFI correctness.  */
> +      err = x86_ginsn_unhandled ();
> +      switch (err)
> +	{
> +	case X86_GINSN_UNHANDLED_NONE:
> +	  break;
> +	case X86_GINSN_UNHANDLED_DEST_REG:
> +	  /* Not all writes to REG_FP are harmful in context of SCFI.  Simply
> +	     generate a GINSN_TYPE_OTHER with destination set to the
> +	     appropriate register.  The SCFI machinery will bail out if this
> +	     ginsn affects SCFI correctness.  */
> +	  dw2_regnum = ginsn_dw2_regnum (i.op[i.operands - 1].regs);
> +	  ginsn = ginsn_new_other (insn_end_sym, true,
> +				   GINSN_SRC_IMM, 0,
> +				   GINSN_SRC_IMM, 0,
> +				   GINSN_DST_REG, dw2_regnum);
> +	  ginsn_set_where (ginsn);
> +	  break;
> +	case X86_GINSN_UNHANDLED_CFG:
> +	  /* Fall through.  */
> +	case X86_GINSN_UNHANDLED_STACKOP:

No fall-through comment please between immediately successive case labels.

> +	  as_bad (_("SCFI: unhandled op 0x%x may cause incorrect CFI"),
> +		  i.tm.base_opcode);

As a remark: %#x is a one byte shorter representation with largely the
same effect (plus, nicely imo, omitting the 0x when the value is zero).
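
A small standalone sketch of the difference between the two conversions, using plain snprintf rather than as_bad. The wrapper function is hypothetical.

```c
#include <stdio.h>

/* Hypothetical wrapper: format OPCODE into BUF, either with the '#' flag
   ("%#x") or with a literal prefix ("0x%x").  */
static void
format_opcode (char *buf, size_t len, unsigned int opcode, int use_alt_form)
{
  if (use_alt_form)
    snprintf (buf, len, "%#x", opcode);
  else
    snprintf (buf, len, "0x%x", opcode);
}
```

For a non-zero value such as 0xff both forms produce "0xff"; for zero, "%#x" produces "0" while "0x%x" produces "0x0".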

> +	  break;
> +	default:
> +	  abort ();
> +	  break;
> +	}
> +    }
> +
> +  return ginsn;
> +}
> +
>  /* This is the guts of the machine-dependent assembler.  LINE points to a
>     machine dependent instruction.  This function is supposed to emit
>     the frags/bytes it assembles to.  */
> @@ -5299,6 +6272,7 @@ md_assemble (char *line)
>    const char *end, *pass1_mnem = NULL;
>    enum i386_error pass1_err = 0;
>    const insn_template *t;
> +  ginsnS *ginsn;
>    struct last_insn *last_insn
>      = &seg_info(now_seg)->tc_segment_info_data.last_insn;
>  
> @@ -5830,6 +6804,14 @@ md_assemble (char *line)
>    /* We are ready to output the insn.  */
>    output_insn (last_insn);
>  
> +  /* PS: SCFI is enabled only for AMD64 ABI.  The ABI check has been
> +     performed in i386_target_format.  */

See my earlier comment - it's yet more restrictive (as in not covering
e.g. the Windows ABI, which importantly is also used in EFI).

> +  if (flag_synth_cfi)
> +    {
> +      ginsn = x86_ginsn_new (symbol_temp_new_now (), frch_ginsn_gen_mode ());
> +      frch_ginsn_data_append (ginsn);
> +    }
> +
>    insert_lfence_after ();
>  
>    if (i.tm.opcode_modifier.isprefix)
> @@ -11333,6 +12315,7 @@ s_insn (int dummy ATTRIBUTE_UNUSED)
>    const char *end;
>    unsigned int j;
>    valueT val;
> +  ginsnS *ginsn;
>    bool vex = false, xop = false, evex = false;
>    struct last_insn *last_insn;
>  
> @@ -12104,6 +13087,14 @@ s_insn (int dummy ATTRIBUTE_UNUSED)
>    last_insn->name = ".insn directive";
>    last_insn->file = as_where (&last_insn->line);
>  
> +  /* PS: SCFI is enabled only for AMD64 ABI.  The ABI check has been
> +     performed in i386_target_format.  */
> +  if (flag_synth_cfi)
> +    {
> +      ginsn = x86_ginsn_new (symbol_temp_new_now (), frch_ginsn_gen_mode ());

If you really want to use this function here, more cases will need
handling (perhaps even beyond what I've commented on above). However,
I'd strongly suggest splitting off the "unhandled" part of that
function, and using only that here. After all you hardly know what
exactly the programmer's intentions are. Because of that, you may
also want to consider simply forbidding use of .insn when SCFI is to
be generated.

> +      frch_ginsn_data_append (ginsn);
> +    }
> +
>   done:
>    *saved_ilp = saved_char;
>    input_line_pointer = line;
> @@ -15748,6 +16739,11 @@ i386_target_format (void)
>    else
>      as_fatal (_("unknown architecture"));
>  
> +#if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
> +  if (flag_synth_cfi && x86_elf_abi != X86_64_ABI)
> +    as_fatal (_("Synthesizing CFI is not supported for this ABI"));
> +#endif

Elsewhere I've raised the question of whether we should really check
OBJ_MAYBE_ELF anywhere in this file. However, as long as we do, you'll
need to accompany that with an IS_ELF check in the if(). If, to
address my unreachable code comment near the top, you elect to add
further OBJ_ELF / OBJ_MAYBE_ELF checks, there you then wouldn't need
that further check (as flag_synth_cfi set then implies ELF).

Jan
  
Indu Bhagat Jan. 8, 2024, 12:46 a.m. UTC | #2
Hi Jan, Nick,

I am working on addressing the review comments in V4 and will continue 
that review in a separate thread.

Meanwhile...

On 1/5/24 05:58, Jan Beulich wrote:
>> +      dwarf_reg = ginsn_dw2_regnum (temp);
>> +    }
>> +
>> +  gas_assert (dwarf_reg != Dw2Inval); /* Needs to be addressed.  */
> Without actually addressing this (and possible similar cases elsewhere), I
> don't think this can go in as other than experimental code (which the
> NEWS entry then should state, and where there then should be a plan for an
> easy approach of probing gas for no-longer-experimental SCFI support).

... in this specific case, the /* Needs to be addressed.  */ comment is 
somewhat stale and may have given the impression that this is an 
unhandled case (EIP is the pending unhandled case; I will deal with it 
in V5).

I would like to get consensus on the way forward for the SCFI series, 
especially on whether there is agreement on releasing what will be the 
V5 with binutils 2.42.

My take on "SCFI should go in as experimental code": my priority is to 
get this option to users and to continue development of SCFI 
incrementally with the help/reviews from the community.  If the 
reviewers/maintainers think that offering this first as 
--scfi=experimental, to be established as --scfi=all in a future 
release, is the best way to move forward, I can make the necessary 
changes now.

But I would appreciate consensus on this plan before I undertake those 
changes and send a V5 aiming for the upcoming 2.42.

Thanks
Indu
  
Jan Beulich Jan. 8, 2024, 8:16 a.m. UTC | #3
On 08.01.2024 01:46, Indu Bhagat wrote:
> Hi Jan, Nick,
> 
> I am working on addressing the review comments in V4 and will continue 
> that review in a separate thread.
> 
> Meanwhile...
> 
> On 1/5/24 05:58, Jan Beulich wrote:
>>> +      dwarf_reg = ginsn_dw2_regnum (temp);
>>> +    }
>>> +
>>> +  gas_assert (dwarf_reg != Dw2Inval); /* Needs to be addressed.  */
>> Without actually addressing this (and possible similar cases elsewhere), I
>> don't think this can go in as other than experimental code (which the
>> NEWS entry then should state, and where there then should be a plan for an
>> easy approach of probing gas for no-longer-experimental SCFI support).
> 
> ... in this specific case, the /* Needs to be addressed.  */ comment is 
> somewhat stale and may have lead the impression that this is an 
> unhandled case (EIP is the pending unhandled case, I will deal with it 
> in V5).
> 
> Moving forward, I would like to get consensus on whats the way forward 
> for SCFI series, especially whether there is agreement on releasing what 
> will be the V5 with binutils 2.42.
> 
> My take on "SCFI should go as experimental code" : my priority is to get 
> this option to users and to continue development of SCFI incrementally 
> with the help/reviews from community.  If the reviewers/maintainers 
> think, offering this first as --scfi=experimental, which later is 
> established as --scfi=all, in a future release is the best way to move 
> forward, I can make the necessary changes now.

I'd be okay with this as a plan. Before fully supported, I'd actually hope
to see at least one other architecture to also use this machinery. Yet
maybe that's asking for too much ...

Jan
  
Indu Bhagat Jan. 8, 2024, 8:33 a.m. UTC | #4
On 1/8/24 00:16, Jan Beulich wrote:
> On 08.01.2024 01:46, Indu Bhagat wrote:
>> Hi Jan, Nick,
>>
>> I am working on addressing the review comments in V4 and will continue
>> that review in a separate thread.
>>
>> Meanwhile...
>>
>> On 1/5/24 05:58, Jan Beulich wrote:
>>>> +      dwarf_reg = ginsn_dw2_regnum (temp);
>>>> +    }
>>>> +
>>>> +  gas_assert (dwarf_reg != Dw2Inval); /* Needs to be addressed.  */
>>> Without actually addressing this (and possible similar cases elsewhere), I
>>> don't think this can go in as other than experimental code (which the
>>> NEWS entry then should state, and where there then should be a plan for an
>>> easy approach of probing gas for no-longer-experimental SCFI support).
>>
>> ... in this specific case, the /* Needs to be addressed.  */ comment is
>> somewhat stale and may have lead the impression that this is an
>> unhandled case (EIP is the pending unhandled case, I will deal with it
>> in V5).
>>
>> Moving forward, I would like to get consensus on whats the way forward
>> for SCFI series, especially whether there is agreement on releasing what
>> will be the V5 with binutils 2.42.
>>
>> My take on "SCFI should go as experimental code" : my priority is to get
>> this option to users and to continue development of SCFI incrementally
>> with the help/reviews from community.  If the reviewers/maintainers
>> think, offering this first as --scfi=experimental, which later is
>> established as --scfi=all, in a future release is the best way to move
>> forward, I can make the necessary changes now.
> 
> I'd be okay with this as a plan. Before fully supported, I'd actually hope
> to see at least one other architecture to also use this machinery. Yet
> maybe that's asking for too much ...
> 

Thanks Jan.

BTW, I have an implementation of SCFI for aarch64 in a personal branch. 
FWIW, the current state of the ginsn/scfi machinery (as posted in V4) 
"works" on aarch64. That said, the aarch64 implementation is minimally 
tested at this time and needs more work.

I hope to upstream that at some point.

Indu
  
Indu Bhagat Jan. 8, 2024, 7:33 p.m. UTC | #5
Hi Jan,

On 1/5/24 05:58, Jan Beulich wrote:
> On 03.01.2024 08:15, Indu Bhagat wrote:
>> --- a/gas/config/obj-elf.c
>> +++ b/gas/config/obj-elf.c
>> @@ -24,6 +24,7 @@
>>   #include "subsegs.h"
>>   #include "obstack.h"
>>   #include "dwarf2dbg.h"
>> +#include "ginsn.h"
>>   
>>   #ifndef ECOFF_DEBUGGING
>>   #define ECOFF_DEBUGGING 0
>> @@ -2311,6 +2312,13 @@ obj_elf_size (int ignore ATTRIBUTE_UNUSED)
>>         symbol_get_obj (sym)->size = XNEW (expressionS);
>>         *symbol_get_obj (sym)->size = exp;
>>       }
>> +
>> +  /* If the symbol in the directive matches the current function being
>> +     processed, indicate end of the current stream of ginsns.  */
>> +  if (flag_synth_cfi
>> +      && S_IS_FUNCTION (sym) && sym == ginsn_data_func_symbol ())
>> +    ginsn_data_end (symbol_temp_new_now ());
>> +
>>     demand_empty_rest_of_line ();
>>   }
>>   
>> @@ -2499,6 +2507,14 @@ obj_elf_type (int ignore ATTRIBUTE_UNUSED)
>>   	elfsym->symbol.flags &= ~mask;
>>       }
>>   
>> +  if (S_IS_FUNCTION (sym) && flag_synth_cfi)
>> +    {
>> +      /* Wrap up processing the previous block of ginsns first.  */
>> +      if (frchain_now->frch_ginsn_data)
>> +	ginsn_data_end (symbol_temp_new_now ());
>> +      ginsn_data_begin (sym);
>> +    }
>> +
>>     demand_empty_rest_of_line ();
>>   }
> 
> Documentation about .type and .size use could be more precise. Generally
> it is entirely benign where exactly these directives live relative to
> the code they annotate.
> 

Added a comment for V5.

As stated in as.texi, usage of .type and .size will be bread and butter 
for SCFI: "The input asm must begin with the @code{.type} directive, and 
should ideally be closed off using a @code{.size} directive."

>> --- a/gas/config/tc-i386.c
>> +++ b/gas/config/tc-i386.c
>> @@ -30,6 +30,7 @@
>>   #include "subsegs.h"
>>   #include "dwarf2dbg.h"
>>   #include "dw2gencfi.h"
>> +#include "scfi.h"
>>   #include "gen-sframe.h"
>>   #include "sframe.h"
>>   #include "elf/x86-64.h"
>> @@ -5287,6 +5288,978 @@ static INLINE bool may_need_pass2 (const insn_template *t)
>>   	       && t->base_opcode == 0x63);
>>   }
>>   
>> +bool
>> +x86_scfi_callee_saved_p (unsigned int dw2reg_num)
> 
> Iirc SCFI is ELF-only. We're not in ELF-only code here, though. Non-ELF
> gas, as indicated before, would better not carry any unreachable code.
> 

Guarded these APIs with OBJ_ELF || OBJ_MAYBE_ELF for V5.

>> +{
>> +  if (dw2reg_num == 3 /* rbx.  */
>> +      || dw2reg_num == REG_FP /* rbp.  */
>> +      || dw2reg_num == REG_SP /* rsp.  */
>> +      || (dw2reg_num >= 12 && dw2reg_num <= 15) /* r12 - r15.  */)
>> +    return true;
>> +
>> +  return false;
>> +}
> 
> This entire function is SysV-ABI-specific without this being made clear
> by a comment.
> 

Added a comment.
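
To show why the ABI dependence matters, here is a sketch that contrasts the callee-saved GPR set of the System V AMD64 ABI with that of the Microsoft x64 calling convention, keyed by DWARF register number. The enum and the function are hypothetical; SCFI as posted handles the System V case only, and the second ABI is included purely for illustration.

```c
#include <stdbool.h>

/* Hypothetical ABI selector, for illustration only.  */
enum x86_64_abi_kind
{
  ABI_SYSV,
  ABI_MS_X64
};

/* Return true if the GPR with DWARF register number DW2REG is preserved
   across calls under ABI.  rsp is included, mirroring
   x86_scfi_callee_saved_p.  */
static bool
callee_saved_gpr_p (enum x86_64_abi_kind abi, unsigned int dw2reg)
{
  /* Common to both ABIs: rbx (3), rbp (6), rsp (7), r12 - r15.  */
  if (dw2reg == 3 || dw2reg == 6 || dw2reg == 7
      || (dw2reg >= 12 && dw2reg <= 15))
    return true;

  /* The Microsoft x64 convention additionally preserves rsi (4) and
     rdi (5), which are argument registers under System V.  */
  if (abi == ABI_MS_X64 && (dw2reg == 4 || dw2reg == 5))
    return true;

  return false;
}
```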

>> +/* Check whether a '66H' prefix accompanies the instruction.
> 
> With APX 16-bit operand size isn't necessarily represented by a 66h
> prefix, but perhaps with an "embedded prefix" inside the EVEX one.
> Therefore both the comment and even more so ...
> 
>> +   The current users of this API are in the handlers for PUSH, POP
>> +   instructions.  These instructions affect the stack pointer implicitly:  the
>> +   operand size (16, 32, or 64 bits) determines the amount by which the stack
>> +   pointer is decremented (2, 4 or 8).  When '66H' prefix is present, the
>> +   instruction has a 16-bit operand.  */
>> +
>> +static bool
>> +ginsn_prefix_66H_p (i386_insn insn)
> 
> ... the function name would better not allude to just the legacy
> encoding. Maybe ginsn_opsize_prefix_p()?
> 

Isn't 66H_p more readable and easier to follow, because that's what the 
function is currently checking?  If more scenarios were being handled, 
ginsn_opsize_prefix_p () would fit better.

>> +{
>> +  return (insn.prefix[DATA_PREFIX] == 0x66);
>> +}
> 
> I take it that all ginsn / scfi interaction is late enough in the
> process for this array element already having been reliably set?
> 

Yes, I think so. All of ginsn creation is after output_insn ().

>> +/* Get the DWARF register number for the given register entry.
>> +   For specific byte/word register accesses like al, cl, ah, ch, r8dyte etc.,
> 
> What's "r8dyte"? I expect it's a typo, but I can't derive what was
> intended to be written.
> 

Typo it is.  I meant to write r8d. I have updated this to " like al, cl, 
ah, ch, r8d, r20w etc."

>> +   there is no valid DWARF register number.  This function is a hack - it
>> +   relies on relative ordering of reg entries in the i386_regtab.  FIXME - it
>> +   will be good to allow a more direct way to get this information.  */
> 
> Saying it's a hack is a helpful sign, but it still would be useful to
> also briefly describe what the intentions here are. It's hard to spot
> whether the code is correct without knowing what's intended (i.e. how
> 8- and 16-bit registers, or non-GPRs, are meant to be treated).
> 

I have updated the function-level description to include:

"For specific byte/word register accesses like al, cl, ah, ch, r8d, r20w 
   etc., we need to identify the DWARF register number for the 
corresponding 8-byte GPR."

Also included a test for the new GPR r31w in ginsn-dw2-regnum-1 in V5.  I 
realized that the data type for dw2_regnum in reg_entry needs to be 
bumped up to:
      signed short dw2_regnum[2];
instead of the current 'signed char' as the new GPRs have DWARF reg 
numbers > 130.  Will include a patch for this.
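
To illustrate why the wider type is needed, here is a minimal standalone
sketch.  It is not gas code: the helper names are made up for
illustration, and it only checks whether a given register number is
representable in the old and in the new element type.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helpers, not part of gas: report whether VALUE is
   representable in the old element type (signed char) and in the
   proposed one (signed short).  */
bool
fits_signed_char (int value)
{
  return value >= SCHAR_MIN && value <= SCHAR_MAX;
}

bool
fits_signed_short (int value)
{
  return value >= SHRT_MIN && value <= SHRT_MAX;
}
```

With 8-bit chars, fits_signed_char (-1) holds for Dw2Inval, while a
register number such as 130 only passes fits_signed_short ().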

>> +static unsigned int
>> +ginsn_dw2_regnum (const reg_entry *ireg)
>> +{
>> +  /* PS: Note the data type here as int32_t, because of Dw2Inval (-1).  */
>> +  int32_t dwarf_reg = Dw2Inval;
>> +  const reg_entry *temp = ireg;
>> +
>> +  /* ginsn creation is available for AMD64 abi only ATM.  Other flag_code
>> +     are not expected.  */
>> +  gas_assert (flag_code == CODE_64BIT);
> 
> With this assertion it is kind of odd to see a further use of flag_code
> below.
> 

I think you are referring to the checks like:

   /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
   if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
     stack_opnd_size = 2;

Although the checks on flag_code are redundant now, I chose to keep them 
here to stay aligned with how the prefix is meant to be used.

>> +  if (ireg <= &i386_regtab[3])
>> +    /* For al, cl, dl, bl, bump over to axl, cxl, dxl, bxl respectively by
>> +       adding 8.  */
>> +    temp = ireg + 8;
>> +  else if (ireg <= &i386_regtab[7])
>> +    /* For ah, ch, dh, bh, bump over to axl, cxl, dxl, bxl respectively by
>> +       adding 4.  */
>> +    temp = ireg + 4;
> 
> Assuming this is a frequently executed path, why do these not live ...
> 
>> +  dwarf_reg = temp->dw2_regnum[flag_code >> 1];
>> +  if (dwarf_reg == Dw2Inval)
>> +    {
> 
> ... here, thus at least not affecting the code path taken for 64-bit GPRs.
> 

Thanks. I have moved it inside the if () block.

>> +      /* The code relies on the relative ordering of the reg entries in
>> +	 i386_regtab.  The assertion here ensures the code does not recurse
>> +	 indefinitely.  */
>> +      gas_assert (temp + 16 < &i386_regtab[i386_regtab_size - 1]);
> 
> Afaict this is (still) undefined behavior. You may not add to a pointer
> without knowing whether the result still points into or exactly past
> the underlying array.
> 

Ah. I have now updated this to:

unsigned int idx;

gas_assert ((temp - &i386_regtab[0]) >= 0);
idx = temp - &i386_regtab[0];
gas_assert (idx + 32 < i386_regtab_size - 1);
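
The idea behind the index-based check can be sketched outside of gas as
follows.  This is an illustration only: entry_in_bounds () and its
parameters are hypothetical, the element type is a plain int, and the
bound used is "idx + stride < nelems" rather than the slightly stricter
one in the snippet above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if the element STRIDE entries after ELEM still lies inside
   an array of NELEMS elements starting at BASE.  ELEM is assumed to point
   into that array.  The check is done on indices, so no out-of-range
   pointer is ever formed.  */
bool
entry_in_bounds (const int *base, size_t nelems, const int *elem,
                 size_t stride)
{
  ptrdiff_t off = elem - base;

  assert (off >= 0 && (size_t) off < nelems);
  /* Equivalent to off + stride < nelems, written so it cannot wrap.  */
  return stride < nelems - (size_t) off;
}
```

Only when this returns true is it safe to compute elem + stride.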


>> +      temp = temp + 16;
> 
> Also - where's the 16 coming from? Was this not updated when rebasing over
> APX?
> 

Interestingly, the testcase ginsn-dw2-regnum-1 was meant to catch such 
cases, where the function goes out of sync with i386-opc.tbl, but this 
one went undetected after the APX patch added new GPRs.  The 
correct magic number after the new GPRs were added should be 32. The V4 
implementation, however, still reached the correct register entry in two 
iterations. Anyway, fixed now for V5.
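
To make the "two iterations" remark concrete, here is a toy model.  It
is not the real i386_regtab: the table layout, the group size of four
and all names are invented for illustration.  It shows that a stride
which evenly divides the group size still lands on the right entry,
only with more hops, while a stride that does not can land on the wrong
one.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_INVAL (-1)

/* Toy stand-in for the register table: four groups of four entries
   (byte, word, dword and qword views of four registers).  Only the last
   group carries valid register numbers.  */
static const int toy_regnum[16] = {
  TOY_INVAL, TOY_INVAL, TOY_INVAL, TOY_INVAL,
  TOY_INVAL, TOY_INVAL, TOY_INVAL, TOY_INVAL,
  TOY_INVAL, TOY_INVAL, TOY_INVAL, TOY_INVAL,
  0, 1, 2, 3
};

/* Starting at IDX, hop forward STRIDE entries at a time until an entry
   with a valid number is found.  Returns TOY_INVAL if the end of the
   table is reached first.  */
int
toy_lookup (size_t idx, size_t stride)
{
  const size_t nelems = sizeof toy_regnum / sizeof toy_regnum[0];

  assert (stride > 0);
  while (idx < nelems && toy_regnum[idx] == TOY_INVAL)
    idx += stride;
  return idx < nelems ? toy_regnum[idx] : TOY_INVAL;
}
```

Here toy_lookup (1, 4) and toy_lookup (1, 2) both reach the entry for
register 1, but toy_lookup (3, 3) ends up on the entry for register 0.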

>> +      dwarf_reg = ginsn_dw2_regnum (temp);
>> +    }
>> +
>> +  gas_assert (dwarf_reg != Dw2Inval); /* Needs to be addressed.  */
> 
> Without actually addressing this (and possible similar cases elsewhere), I
> don't think this can go in as other than experimental code (which the
> NEWS entry then should state, and where there then should be a plan for an
> easy approach of probing gas for no-longer-experimental SCFI support).
> 

The /* Needs to be addressed.  */ comment is imprecise and may have given 
the impression that this is an unhandled case.

The intention was to catch most bad cases during dev/testing, so that in 
production this serves merely as a sanity check: a failure triggered in 
the rare case of state corruption, a bad ginsn, etc.  I have updated the 
code comment.  I have also updated this assert to now check that 
(dwarf_reg >= 0).

As I comment below, I have included handling for RegIP, RegIZ.

>> +  return (unsigned int) dwarf_reg;
>> +}
>> +
>> +static ginsnS *
>> +x86_ginsn_addsub_reg_mem (const symbolS *insn_end_sym)
>> +{
>> +  unsigned int dw2_regnum;
>> +  unsigned int src2_dw2_regnum;
>> +  ginsnS *ginsn = NULL;
>> +  ginsnS * (*ginsn_func) (const symbolS *, bool,
>> +			  enum ginsn_src_type, unsigned int, offsetT,
>> +			  enum ginsn_src_type, unsigned int, offsetT,
>> +			  enum ginsn_dst_type, unsigned int, offsetT);
>> +  uint16_t opcode = i.tm.base_opcode;
>> +
>> +  gas_assert (opcode == 0x1 || opcode == 0x29);
>> +  ginsn_func = (opcode == 0x1) ? ginsn_new_add : ginsn_new_sub;
> 
> As mentioned before - checking opcode without also checking opcode
> space is pretty meaningless.
> 

The caller of this API already checks i.tm.opcode_space, so this is OK 
in theory.  But I have now also added the opcode space check to the 
assert above.

>> +  /* op %reg, symbol.  */
>> +  if (i.mem_operands == 1 && !i.base_reg && !i.index_reg)
>> +    return ginsn;
> 
> Why does this need special treatment, and why is returning NULL here
> okay?
> 

Instructions like "addq    %rax, symbol" are uninteresting for SCFI.  One 
piece of feedback in a previous iteration was to "consider not 
generating ginsns for cases that are known to be uninteresting".

>> +  /* op reg, reg/mem.  */
>> +  dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>> +  if (i.reg_operands == 2)
>> +    {
>> +      src2_dw2_regnum = ginsn_dw2_regnum (i.op[1].regs);
>> +      ginsn = ginsn_func (insn_end_sym, true,
>> +			  GINSN_SRC_REG, dw2_regnum, 0,
>> +			  GINSN_SRC_REG, src2_dw2_regnum, 0,
>> +			  GINSN_DST_REG, src2_dw2_regnum, 0);
>> +      ginsn_set_where (ginsn);
>> +    }
>> +  /* Other cases where destination involves indirect access are unnecessary
>> +     for SCFI correctness.  TBD_GINSN_GEN_NOT_SCFI.  */
>> +
>> +  return ginsn;
>> +}
>> +
>> +static ginsnS *
>> +x86_ginsn_addsub_mem_reg (const symbolS *insn_end_sym)
>> +{
>> +  unsigned int dw2_regnum;
>> +  unsigned int src2_dw2_regnum;
>> +  const reg_entry *mem_reg;
>> +  int32_t gdisp = 0;
>> +  ginsnS *ginsn = NULL;
>> +  ginsnS * (*ginsn_func) (const symbolS *, bool,
>> +			  enum ginsn_src_type, unsigned int, offsetT,
>> +			  enum ginsn_src_type, unsigned int, offsetT,
>> +			  enum ginsn_dst_type, unsigned int, offsetT);
>> +  uint16_t opcode = i.tm.base_opcode;
>> +
>> +  gas_assert (opcode == 0x3 || opcode == 0x2b);
>> +  ginsn_func = (opcode == 0x3) ? ginsn_new_add : ginsn_new_sub;
>> +
>> +  /* op symbol, %reg.  */
>> +  if (i.mem_operands && !i.base_reg && !i.index_reg)
>> +    return ginsn;
>> +  /* op mem, %reg.  */
> 
> /* op reg/mem, reg.  */ you mean? Which then raises the question ...
> 

Yes (updated the comment for V5).

>> +  dw2_regnum = ginsn_dw2_regnum (i.op[1].regs);
>> +
>> +  if (i.mem_operands)
>> +    {
>> +      mem_reg = (i.base_reg) ? i.base_reg : i.index_reg;
>> +      src2_dw2_regnum = ginsn_dw2_regnum (mem_reg);
>> +      if (i.disp_operands == 1)
>> +	gdisp = i.op[0].disps->X_add_number;
>> +      ginsn = ginsn_func (insn_end_sym, true,
>> +			  GINSN_SRC_INDIRECT, src2_dw2_regnum, gdisp,
>> +			  GINSN_SRC_REG, dw2_regnum, 0,
>> +			  GINSN_DST_REG, dw2_regnum, 0);
>> +      ginsn_set_where (ginsn);
>> +    }
>> +
>> +  return ginsn;
>> +}
> 
> ... why a register source isn't dealt with here.
> 

I saw that the opcode used when the source is reg is either 0x1 or 0x29, 
hence concluded that the handling in x86_ginsn_addsub_reg_mem should 
suffice.  Perhaps this is a GAS implementation artifact, and I should 
have handling for register source here in x86_ginsn_addsub_mem_reg (for 
opcodes 0x3 and 0x2b) ?

>> +static ginsnS *
>> +x86_ginsn_alu_imm (const symbolS *insn_end_sym)
>> +{
>> +  offsetT src_imm;
>> +  unsigned int dw2_regnum;
>> +  ginsnS *ginsn = NULL;
>> +  enum ginsn_src_type src_type = GINSN_SRC_REG;
>> +  enum ginsn_dst_type dst_type = GINSN_DST_REG;
>> +
>> +  ginsnS * (*ginsn_func) (const symbolS *, bool,
>> +			  enum ginsn_src_type, unsigned int, offsetT,
>> +			  enum ginsn_src_type, unsigned int, offsetT,
>> +			  enum ginsn_dst_type, unsigned int, offsetT);
>> +
>> +  /* FIXME - create ginsn where dest is REG_SP / REG_FP only ? */
>> +  /* Map for insn.tm.extension_opcode
>> +     000 ADD    100 AND
>> +     001 OR     101 SUB
>> +     010 ADC    110 XOR
>> +     011 SBB    111 CMP  */
>> +
>> +  /* add/sub/and imm, %reg only at this time for SCFI.
>> +     Although all three (and, or , xor) make the destination reg untraceable,
> 
> Why would this also be done for CMP? And neither ADC nor SBB are
> mentioned at all in ...
> 
>> +     and op is handled but not or/xor because we will look into supporting
>> +     the DRAP pattern at some point.  */
> 
> ... this entire comment, justifying the choice made.
> 

I will add a comment about CMP, ADC, SBB.  The choice made here is based 
on the assumption that ADC, SBB with REG_SP / REG_FP are not commonly 
used for stack manipulation purposes.  If they do occur, they will be 
handled in the x86_ginsn_unhandled () track, and correctness is not 
sacrificed. Note that ADC, SBB with other registers as destination are 
not interesting for SCFI. I have added an adc op in ginsn-add-1 to keep 
this tested.
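
The selection described above can be summarized in a small standalone
sketch.  The enum and function names are hypothetical and exist only
for this illustration; the mapping itself follows the extension opcode
table in the comment of x86_ginsn_alu_imm ().

```c
#include <assert.h>

enum toy_alu_kind
{
  TOY_ALU_NONE,  /* No ginsn here: left to the unhandled-insn check.  */
  TOY_ALU_ADD,
  TOY_ALU_AND,
  TOY_ALU_SUB
};

/* Map the 3-bit opcode extension of opcodes 0x80, 0x81 and 0x83 to the
   kind of ginsn that gets generated.  */
enum toy_alu_kind
toy_alu_imm_kind (unsigned int extension_opcode)
{
  assert (extension_opcode < 8);
  switch (extension_opcode)
    {
    case 0:
      return TOY_ALU_ADD;
    case 4:
      return TOY_ALU_AND;
    case 5:
      return TOY_ALU_SUB;
    default:
      /* OR (1), ADC (2), SBB (3), XOR (6), CMP (7).  */
      return TOY_ALU_NONE;
    }
}
```

For the TOY_ALU_NONE cases nothing is generated at this point; an insn
whose destination is REG_SP / REG_FP is then caught by the
x86_ginsn_unhandled () track.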

>> +  if (i.tm.extension_opcode == 5)
>> +    ginsn_func = ginsn_new_sub;
>> +  else if (i.tm.extension_opcode == 4)
>> +    ginsn_func = ginsn_new_and;
>> +  else if (i.tm.extension_opcode == 0)
>> +    ginsn_func = ginsn_new_add;
>> +  else
>> +    return ginsn;
>> +
>> +  /* TBD_GINSN_REPRESENTATION_LIMIT: There is no representation for when a
>> +     symbol is used as an operand, like so:
>> +	  addq    $simd_cmp_op+8, %rdx
>> +     Skip generating any ginsn for this.  */
>> +  if (i.imm_operands == 1
>> +      && i.op[0].imms->X_op != O_constant)
>> +    return ginsn;
>> +
>> +  /* addq    $1, symbol
>> +     addq    $1, -16(%rbp)
>> +     Such instructions are not of interest for SCFI.  */
>> +  if (i.mem_operands == 1)
>> +    return ginsn;
> 
> Perhaps not just here: TBD_GINSN_GEN_NOT_SCFI?
> 

OK. Added a comment TBD_GINSN_GEN_NOT_SCFI.

>> +  gas_assert (i.imm_operands == 1);
>> +  src_imm = i.op[0].imms->X_add_number;
>> +  /* The second operand may be a register or indirect access.  For SCFI, only
>> +     the case when the second opnd is a register is interesting.  Revisit this
>> +     if generating ginsns for a different gen mode TBD_GINSN_GEN_NOT_SCFI. */
>> +  if (i.reg_operands == 1)
>> +    {
>> +      dw2_regnum = ginsn_dw2_regnum (i.op[1].regs);
>> +      /* For ginsn, keep the imm as second src operand.  */
>> +      ginsn = ginsn_func (insn_end_sym, true,
>> +			  src_type, dw2_regnum, 0,
>> +			  GINSN_SRC_IMM, 0, src_imm,
>> +			  dst_type, dw2_regnum, 0);
>> +
>> +      ginsn_set_where (ginsn);
>> +    }
>> +
>> +  return ginsn;
>> +}
>> +
>> +static ginsnS *
>> +x86_ginsn_move (const symbolS *insn_end_sym)
>> +{
>> +  ginsnS *ginsn;
>> +  unsigned int dst_reg;
>> +  unsigned int src_reg;
>> +  offsetT src_disp = 0;
>> +  offsetT dst_disp = 0;
>> +  const reg_entry *dst = NULL;
>> +  const reg_entry *src = NULL;
>> +  uint16_t opcode = i.tm.base_opcode;
>> +  enum ginsn_src_type src_type = GINSN_SRC_REG;
>> +  enum ginsn_dst_type dst_type = GINSN_DST_REG;
>> +
>> +  if (opcode == 0x8b || opcode == 0x8a)
> 
> Above when handling ALU insns you didn't care about byte ops - why do
> you do so here (by checking for 0x8a, and 0x88 below)?
> 

Because there may be weird code that saves and restores 8-bit registers 
on the stack.  For ALU insns, if the destination is REG_SP/REG_FP, we 
will detect them in the unhandled track.


>> +    {
>> +      /* mov  disp(%reg), %reg.  */
>> +      if (i.mem_operands && i.base_reg)
>> +	{
>> +	  src = i.base_reg;
>> +	  if (i.disp_operands == 1)
>> +	    src_disp = i.op[0].disps->X_add_number;
>> +	  src_type = GINSN_SRC_INDIRECT;
>> +	}
>> +      else
>> +	src = i.op[0].regs;
> 
> Even when there's no base, the source isn't necessarily a register.
> And in such a case using i.op[0].regs isn't valid.
> 
>> +      dst = i.op[1].regs;
>> +    }
>> +  else if (opcode == 0x89 || opcode == 0x88)
>> +    {
>> +      /* mov %reg, disp(%reg).  */
>> +      src = i.op[0].regs;
>> +      if (i.mem_operands && i.base_reg)
>> +	{
>> +	  dst = i.base_reg;
>> +	  if (i.disp_operands == 1)
>> +	    dst_disp = i.op[1].disps->X_add_number;
>> +	  dst_type = GINSN_DST_INDIRECT;
>> +	}
>> +      else
>> +	dst = i.op[1].regs;
> 
> Similarly here then.
> 

Right. Fixed this and above.

>> +    }
>> +
>> +  src_reg = ginsn_dw2_regnum (src);
>> +  dst_reg = ginsn_dw2_regnum (dst);
>> +
>> +  ginsn = ginsn_new_mov (insn_end_sym, true,
>> +			 src_type, src_reg, src_disp,
>> +			 dst_type, dst_reg, dst_disp);
>> +  ginsn_set_where (ginsn);
>> +
>> +  return ginsn;
>> +}
>> +
>> +/* Generate appropriate ginsn for lea.
>> +   Sub-cases marked with TBD_GINSN_INFO_LOSS indicate some loss of information
>> +   in the ginsn.  But these are fine for now for GINSN_GEN_SCFI generation
>> +   mode.  */
>> +
>> +static ginsnS *
>> +x86_ginsn_lea (const symbolS *insn_end_sym)
>> +{
>> +  offsetT src_disp = 0;
>> +  ginsnS *ginsn = NULL;
>> +  unsigned int base_reg;
>> +  unsigned int index_reg;
>> +  offsetT index_scale;
>> +  unsigned int dst_reg;
>> +
>> +  if (!i.index_reg && !i.base_reg)
>> +    {
>> +      /* lea symbol, %rN.  */
>> +      dst_reg = ginsn_dw2_regnum (i.op[1].regs);
>> +      /* TBD_GINSN_INFO_LOSS - Skip encoding information about the symbol.  */
>> +      ginsn = ginsn_new_mov (insn_end_sym, false,
>> +			     GINSN_SRC_IMM, 0xf /* arbitrary const.  */, 0,
>> +			     GINSN_DST_REG, dst_reg, 0);
>> +    }
>> +  else if (i.base_reg && !i.index_reg)
>> +    {
>> +      /* lea    -0x2(%base),%dst.  */
>> +      base_reg = ginsn_dw2_regnum (i.base_reg);
> 
> What if base is %eip? Aiui ginsn_dw2_regnum() will hit an assertion
> then.
> 

Yes it will. Handled RegIP, RegIZ now for V5. Thanks.

> And what about e.g. "lea symbol(%rbx), %rbp"? The ...
> 
>> +      dst_reg = ginsn_dw2_regnum (i.op[1].regs);
>> +
>> +      if (i.disp_operands)
>> +	src_disp = i.op[0].disps->X_add_number;
> 
> ... constant retrieved here won't properly represent the displacement
> then.
> 
>> +      if (src_disp)
>> +	/* Generate an ADD ginsn.  */
>> +	ginsn = ginsn_new_add (insn_end_sym, true,
>> +			       GINSN_SRC_REG, base_reg, 0,
>> +			       GINSN_SRC_IMM, 0, src_disp,
>> +			       GINSN_DST_REG, dst_reg, 0);
>> +      else
>> +	/* Generate a MOV ginsn.  */
>> +	ginsn = ginsn_new_mov (insn_end_sym, true,
>> +			       GINSN_SRC_REG, base_reg, 0,
>> +			       GINSN_DST_REG, dst_reg, 0);
>> +    }
>> +  else if (!i.base_reg && i.index_reg)
>> +    {
>> +      /* lea (,%index,imm), %dst.  */
>> +      /* TBD_GINSN_INFO_LOSS - There is no explicit ginsn multiply operation,
>> +	 instead use GINSN_TYPE_OTHER.  */
> 
> You're also losing the displacement here.
> 

Added a comment.

>> +      index_scale = i.log2_scale_factor;
>> +      index_reg = ginsn_dw2_regnum (i.index_reg);
>> +      dst_reg = ginsn_dw2_regnum (i.op[1].regs);
>> +      ginsn = ginsn_new_other (insn_end_sym, true,
>> +			       GINSN_SRC_REG, index_reg,
>> +			       GINSN_SRC_IMM, index_scale,
>> +			       GINSN_DST_REG, dst_reg);
> 
> Wouldn't it make sense to represent a scale factor of 1 correctly
> here (i.e. not as "other", but rather similar to the base-without-
> index case above)?
> 

Hmm... your suggestion does make sense.  For now, I have added this to 
my TODO items.

>> +    }
>> +  else
>> +    {
>> +      /* lea disp(%base,%index,imm) %dst.  */
>> +      /* TBD_GINSN_INFO_LOSS - Skip adding information about the disp and imm
>> +	 for index reg. */
>> +      base_reg = ginsn_dw2_regnum (i.base_reg);
>> +      index_reg = ginsn_dw2_regnum (i.index_reg);
>> +      dst_reg = ginsn_dw2_regnum (i.op[1].regs);
>> +      /* Generate an ADD ginsn.  */
>> +      ginsn = ginsn_new_add (insn_end_sym, true,
>> +			     GINSN_SRC_REG, base_reg, 0,
>> +			     GINSN_SRC_REG, index_reg, 0,
>> +			     GINSN_DST_REG, dst_reg, 0);
> 
> Seeing the use of "other" above, why is this (wrongly) represented
> as "add"?
> 

You have a point.  There is no good reason why this shouldn't be 
GINSN_TYPE_OTHER.

Using GINSN_TYPE_ADD is not wrong for SCFI purposes, though: if such an 
add is later deemed "interesting for SCFI", it will cause SCFI 
generation to bail out with an "unsupported stack manipulation 
pattern".  So will a GINSN_TYPE_OTHER.

I have now changed this to use ginsn_new_other though.  GINSN_TYPE_OTHER 
is a better fit.
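
With that change, the way lea forms map to ginsn kinds can be sketched
as below.  This is a simplified model, not the gas implementation: the
enum and function are hypothetical, the displacement is assumed to be a
plain constant (so "lea symbol(%rbx), %rbp" is not modelled), and the
scale factor is ignored.

```c
#include <stdbool.h>

enum toy_lea_kind
{
  TOY_LEA_MOV_IMM,  /* lea symbol, %dst: symbol information is dropped.  */
  TOY_LEA_MOV,      /* lea (%base), %dst.  */
  TOY_LEA_ADD,      /* lea disp(%base), %dst with a non-zero disp.  */
  TOY_LEA_OTHER     /* Any form involving an index register.  */
};

/* Classify a lea by its addressing form.  */
enum toy_lea_kind
toy_lea_classify (bool has_base, bool has_index, long disp)
{
  if (has_index)
    return TOY_LEA_OTHER;
  if (!has_base)
    return TOY_LEA_MOV_IMM;
  return disp != 0 ? TOY_LEA_ADD : TOY_LEA_MOV;
}
```

Representing an index-only lea with a scale factor of 1 more precisely
would add one more case here; that is the TODO item mentioned earlier.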

>> +    }
>> +
>> +  ginsn_set_where (ginsn);
>> +
>> +  return ginsn;
>> +}
> 
> Throughout this function (and perhaps others as well), how come you
> don't consider operand size at all? It matters whether results are
> 64-bit, 32-bit, or 16-bit, after all.
> 

Does operand size matter?  Not for all instructions in the context of 
SCFI.  For instructions using the stack (save / restore), the size is 
important.  But for other instructions, operand size will not affect 
SCFI correctness.

>> +static ginsnS *
>> +x86_ginsn_jump (const symbolS *insn_end_sym)
>> +{
>> +  ginsnS *ginsn = NULL;
>> +  symbolS *src_symbol;
> 
> Here and elsewhere - please use const whenever possible. (I think I said
> so already on an earlier version.)
> 

Yes.  Somehow this was missed. I also noted now that some more symbolS * 
usages in ginsn_new_jump () / ginsn_new_jump_cond () could use const. I 
have updated those too.

>> +  gas_assert (i.disp_operands == 1);
>> +
>> +  /* A non-zero addend in jump target makes control-flow tracking difficult.
>> +     Skip SCFI for now.  */
>> +  if (i.op[0].disps->X_op == O_symbol && i.op[0].disps->X_add_number)
>> +    {
>> +      as_bad ("SCFI: jmp insn with non-zero addend to sym not supported");
>> +      return ginsn;
>> +    }
>> +
>> +  if (i.op[0].disps->X_op == O_symbol)
> 
> Why the redundant evaluation of ->X_op? In fact, if you moved the
> earlier if() ...
> 
>> +    {
> 
> ... here, this ...
> 
>> +      gas_assert (!i.op[0].disps->X_add_number);
> 
> ... assertion would become entirely redundant.
> 

OK. Rearranged the checks.

>> +      src_symbol = i.op[0].disps->X_add_symbol;
>> +      ginsn = ginsn_new_jump (insn_end_sym, true,
>> +			      GINSN_SRC_SYMBOL, 0, src_symbol);
>> +
>> +      ginsn_set_where (ginsn);
>> +    }
>> +
>> +  return ginsn;
>> +}
>> +
>> +static ginsnS *
>> +x86_ginsn_jump_cond (const symbolS *insn_end_sym)
>> +{
>> +  ginsnS *ginsn = NULL;
>> +  symbolS *src_symbol;
>> +
>> +  gas_assert (i.disp_operands == 1);
>> +
>> +  /* A non-zero addend in JCC target makes control-flow tracking difficult.
>> +     Skip SCFI for now.  */
>> +  if (i.op[0].disps->X_op == O_symbol && i.op[0].disps->X_add_number)
>> +    {
>> +      as_bad ("SCFI: jcc insn with non-zero addend to sym not supported");
>> +      return ginsn;
>> +    }
>> +
>> +  if (i.op[0].disps->X_op == O_symbol)
>> +    {
>> +      gas_assert (i.op[0].disps->X_add_number == 0);
>> +      src_symbol = i.op[0].disps->X_add_symbol;
>> +      ginsn = ginsn_new_jump_cond (insn_end_sym, true,
>> +				   GINSN_SRC_SYMBOL, 0, src_symbol);
>> +      ginsn_set_where (ginsn);
>> +    }
>> +
>> +  return ginsn;
>> +}
> 
> This looks almost identical to x86_ginsn_jump() - can't the two be
> folded?
> 

This was in my TODO items.  I have merged the two functions now for V5.
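
Roughly, the merged handler with the rearranged checks has the shape
sketched below.  This is a standalone model, not the V5 code: the
names are hypothetical, and the boolean and integer parameters stand in
for X_op == O_symbol and X_add_number of the displacement expression.

```c
#include <stdbool.h>

enum toy_jump_result
{
  TOY_JUMP_NONE,    /* Target is not a plain symbol: no ginsn.  */
  TOY_JUMP_ERROR,   /* Symbol plus non-zero addend: not supported.  */
  TOY_JUMP_UNCOND,
  TOY_JUMP_COND
};

/* One handler for both flavors.  The symbol check is done once, and the
   addend check sits inside it, so no separate assertion on the addend
   is needed.  CONDITIONAL selects the kind of ginsn.  */
enum toy_jump_result
toy_jump (bool target_is_symbol, long addend, bool conditional)
{
  if (!target_is_symbol)
    return TOY_JUMP_NONE;
  if (addend != 0)
    return TOY_JUMP_ERROR;
  return conditional ? TOY_JUMP_COND : TOY_JUMP_UNCOND;
}
```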

>> +static ginsnS *
>> +x86_ginsn_enter (const symbolS *insn_end_sym)
>> +{
>> +  ginsnS *ginsn = NULL;
>> +  ginsnS *ginsn_next = NULL;
>> +  ginsnS *ginsn_last = NULL;
>> +
>> +  gas_assert (i.imm_operands == 2);
>> +
>> +  /* For non-zero size operands, bail out as untraceable for SCFI.  */
>> +  if ((i.op[0].imms->X_op != O_constant || i.op[0].imms->X_add_symbol != 0)
>> +      || (i.op[1].imms->X_op != O_constant || i.op[1].imms->X_add_symbol != 0))
> 
> While the comment makes sufficiently clear what's meant, the use of (inner)
> parentheses here is still confusing as to whether indeed the || are meant.
> 
>> +    {
>> +      as_bad ("SCFI: enter insn with non-zero operand not supported");
>> +      return ginsn;
>> +    }
>> +
>> +  /* If the nesting level is 0, the processor pushes the frame pointer from
>> +     the BP/EBP/RBP register onto the stack, copies the current stack
>> +     pointer from the SP/ESP/RSP register into the BP/EBP/RBP register, and
>> +     loads the SP/ESP/RSP register with the current stack-pointer value
>> +     minus the value in the size operand.  */
>> +  ginsn = ginsn_new_sub (insn_end_sym, false,
>> +			 GINSN_SRC_REG, REG_SP, 0,
>> +			 GINSN_SRC_IMM, 0, 8,
> 
> I guess 8 is the operand size and you simply hope no-one's going to use
> a 16-bit ENTER?
> 

Fixed for V5.
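
As an illustration of what the operand size changes, here is a toy
model of the three ginsns emitted for "enter $0, $0".  Everything in it
is hypothetical (struct, function, the restriction to sizes 2 and 8),
and it ignores that a 16-bit operation only updates the low 16 bits of
the registers.

```c
#include <assert.h>
#include <stdint.h>

/* Toy register state: only the two registers SCFI tracks.  */
struct toy_regs
{
  uint64_t sp;
  uint64_t fp;
};

/* Model "enter $0, $0" in 64-bit mode.  OPND_SIZE is the number of bytes
   pushed: 8 by default, 2 with a 16-bit operand size.  */
struct toy_regs
toy_enter (struct toy_regs regs, unsigned int opnd_size)
{
  assert (opnd_size == 2 || opnd_size == 8);
  /* sub: make room for the saved frame pointer.  */
  regs.sp -= opnd_size;
  /* store: the old frame pointer would be written at regs.sp here.  */
  /* mov: the frame pointer now tracks the stack pointer.  */
  regs.fp = regs.sp;
  return regs;
}
```

Starting from sp == 0x1000, a size of 8 leaves sp and fp at 0xff8,
while a size of 2 leaves them at 0xffe, so a hard-coded 8 would give a
wrong CFA offset for the 16-bit form.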

>> +			 GINSN_DST_REG, REG_SP, 0);
>> +  ginsn_set_where (ginsn);
>> +  ginsn_next = ginsn_new_store (insn_end_sym, false,
>> +				GINSN_SRC_REG, REG_FP,
>> +				GINSN_DST_INDIRECT, REG_SP, 0);
>> +  ginsn_set_where (ginsn_next);
>> +  gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>> +  ginsn_last = ginsn_new_mov (insn_end_sym, false,
>> +			      GINSN_SRC_REG, REG_SP, 0,
>> +			      GINSN_DST_REG, REG_FP, 0);
>> +  ginsn_set_where (ginsn_last);
>> +  gas_assert (!ginsn_link_next (ginsn_next, ginsn_last));
>> +
>> +  return ginsn;
>> +}
>> +
>> +static bool
>> +x86_ginsn_safe_to_skip (void)
>> +{
>> +  bool skip_p = false;
>> +  uint16_t opcode = i.tm.base_opcode;
>> +
>> +  switch (opcode)
>> +    {
>> +    case 0x39:
> 
> This isn't the only CMP encoding, and ...
> 
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      /* cmp reg, reg.  */
>> +      skip_p = true;
>> +      break;
>> +    case 0x85:
> 
> ... this isn't the only TEST one.
> 

These were added as I ran into them. Will add other opcodes and a FIXME 
around the outstanding problem:  There may also be yet more opcodes 
which are safe to skip...
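
For reference, the CMP and TEST encodings in the legacy one-byte opcode
map can be listed as in the sketch below.  It is written against the
raw encodings, not against gas templates: the function name is made up,
and whether i.tm.base_opcode shows exactly these values for every
template has not been checked here.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if OPCODE, with ModRM.reg extension EXT where relevant, is
   a CMP or TEST encoding in the legacy one-byte opcode map.  */
bool
toy_is_cmp_or_test (unsigned int opcode, unsigned int ext)
{
  assert (ext < 8);
  switch (opcode)
    {
    case 0x38: case 0x39: case 0x3a: case 0x3b:  /* cmp, reg and reg/mem.  */
    case 0x3c: case 0x3d:                        /* cmp imm, accumulator.  */
    case 0x84: case 0x85:                        /* test reg, reg/mem.  */
    case 0xa8: case 0xa9:                        /* test imm, accumulator.  */
      return true;
    case 0x80: case 0x81: case 0x83:             /* Group 1: /7 is cmp.  */
      return ext == 7;
    case 0xf6: case 0xf7:                        /* Group 3: /0 is test.  */
      return ext == 0;
    default:
      return false;
    }
}
```

Note that 0xa8 / 0xa9 are TEST only in the base opcode space; in
SPACE_0F the same values are push gs / pop gs, which is why the opcode
space check has to accompany each of these.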

>> +      /* test reg, reg/mem.  */
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      skip_p = true;
>> +      break;
>> +    default:
>> +      break;
>> +    }
>> +
>> +  return skip_p;
>> +}
>> +
>> +#define X86_GINSN_UNHANDLED_NONE      0
>> +#define X86_GINSN_UNHANDLED_DEST_REG  1
>> +#define X86_GINSN_UNHANDLED_CFG       2
>> +#define X86_GINSN_UNHANDLED_STACKOP   3
>> +
>> +/* Check the input insn for its impact on the correctness of the synthesized
>> +   CFI.  Returns an error code to the caller.  */
>> +
>> +static int
>> +x86_ginsn_unhandled (void)
>> +{
>> +  int err = X86_GINSN_UNHANDLED_NONE;
>> +  const reg_entry *reg_op;
>> +  unsigned int dw2_regnum;
>> +
>> +  /* Keep an eye out for instructions affecting control flow.  */
>> +  if (i.tm.opcode_modifier.jump)
>> +    err = X86_GINSN_UNHANDLED_CFG;
>> +  /* Also, for any instructions involving an implicit update to the stack
>> +     pointer.  */
>> +  else if (i.tm.opcode_modifier.implicitstackop)
>> +    err = X86_GINSN_UNHANDLED_STACKOP;
>> +  /* Finally, also check if the missed instructions are affecting REG_SP or
>> +     REG_FP.  The destination operand is the last at all stages of assembly
>> +     (due to following AT&T syntax layout in the internal representation).  In
>> +     case of Intel syntax input, this still remains true as swap_operands ()
>> +     is done by now.
>> +     PS: These checks do not involve index / base reg, as indirect memory
>> +     accesses via REG_SP or REG_FP do not affect SCFI correctness.
>> +     (Also note these instructions are candidates for other ginsn generation
>> +     modes in future.  TBD_GINSN_GEN_NOT_SCFI.)  */
>> +  else if (i.operands && i.reg_operands
>> +	   && !(i.flags[i.operands - 1] & Operand_Mem))
>> +    {
>> +      reg_op = i.op[i.operands - 1].regs;
>> +      if (reg_op)
>> +	{
>> +	  dw2_regnum = ginsn_dw2_regnum (reg_op);
>> +	  if (dw2_regnum == REG_SP || dw2_regnum == REG_FP)
>> +	    err = X86_GINSN_UNHANDLED_DEST_REG;
>> +	}
> 
> else
>    err = X86_GINSN_UNHANDLED_CONFUSED;
> 
> ? You can't let this case go silently. An alternative would be to
> assert instead of using if().
> 

No, the other cases are not important for SCFI correctness.  The case of 
potential register save/restore operation cannot be detected 
automatically.  Keeping an eye on ISA additions will be the only way for 
that category.
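
The order of the checks in x86_ginsn_unhandled () can be modelled in
isolation as below.  The function and its parameters are hypothetical;
the register numbers are the x86-64 DWARF numbers for %rbp and %rsp.
A destination that is not a register is passed as -1 and, as discussed,
falls through without an error.

```c
#include <stdbool.h>

#define TOY_UNHANDLED_NONE      0
#define TOY_UNHANDLED_DEST_REG  1
#define TOY_UNHANDLED_CFG       2
#define TOY_UNHANDLED_STACKOP   3

#define TOY_REG_FP 6  /* DWARF number of %rbp on x86-64.  */
#define TOY_REG_SP 7  /* DWARF number of %rsp on x86-64.  */

/* DEST_REGNUM is the DWARF number of a register destination, or -1 when
   the last operand is not a register.  */
int
toy_unhandled (bool is_jump, bool implicit_stack_op, int dest_regnum)
{
  if (is_jump)
    return TOY_UNHANDLED_CFG;
  if (implicit_stack_op)
    return TOY_UNHANDLED_STACKOP;
  if (dest_regnum == TOY_REG_SP || dest_regnum == TOY_REG_FP)
    return TOY_UNHANDLED_DEST_REG;
  return TOY_UNHANDLED_NONE;
}
```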

>> +    }
>> +
>> +  return err;
>> +}
>> +
>> +/* Generate one or more generic GAS instructions, a.k.a, ginsns for the current
>> +   machine instruction.
>> +
>> +   Returns the head of linked list of ginsn(s) added, if success; Returns NULL
>> +   if failure.
>> +
>> +   The input ginsn_gen_mode GMODE determines the set of minimal necessary
>> +   ginsns necessary for correctness of any passes applicable for that mode.
>> +   For supporting the GINSN_GEN_SCFI generation mode, following is the list of
>> +   machine instructions that must be translated into the corresponding ginsns
>> +   to ensure correctness of SCFI:
>> +     - All instructions affecting the two registers that could potentially
>> +       be used as the base register for CFA tracking.  For SCFI, the base
>> +       register for CFA tracking is limited to REG_SP and REG_FP only for
>> +       now.
>> +     - All change of flow instructions: conditional and unconditional branches,
>> +       call and return from functions.
>> +     - All instructions that can potentially be a register save / restore
>> +       operation.
> 
> This could do with being more fine grained, as "potentially" is pretty vague,
> and (as per earlier version review comments) my take on this is a much wider
> set than yours.
> 

I would like to understand this comment better, especially the "my take 
on this is a much wider set than yours" part.  I see it being hinted at 
in different flavors in the current review.

I see some issues pointed out in this review (addressing modes of mov, 
safe to skip opcodes for TEST, CMP, etc.), but it seems that your 
concerns are wider than this.

>> +     - All instructions that perform stack manipulation implicitly: the CALL,
>> +       RET, PUSH, POP, ENTER, and LEAVE instructions.
>> +
>> +   The function currently supports GINSN_GEN_SCFI ginsn generation mode only.
>> +   To support other generation modes will require work on this target-specific
>> +   process of creation of ginsns:
>> +     - Some of such places are tagged with TBD_GINSN_GEN_NOT_SCFI to serve as
>> +       possible starting points.
> 
> Oh, I see you're not meaning to have this annotation consistently. That's a
> little sad ...
> 
>> +     - Also note that ginsn representation may need enhancements.  Specifically,
>> +       note some TBD_GINSN_INFO_LOSS and TBD_GINSN_REPRESENTATION_LIMIT markers.
>> +   */
>> +
>> +static ginsnS *
>> +x86_ginsn_new (const symbolS *insn_end_sym, enum ginsn_gen_mode gmode)
>> +{
>> +  int err = 0;
>> +  uint16_t opcode;
>> +  unsigned int dw2_regnum;
>> +  ginsnS *ginsn = NULL;
>> +  ginsnS *ginsn_next = NULL;
>> +  ginsnS *ginsn_last = NULL;
>> +  /* In 64-bit mode, the default stack update size is 8 bytes.  */
>> +  int stack_opnd_size = 8;
>> +
>> +  /* Currently supports generation of selected ginsns, sufficient for
>> +     the use-case of SCFI only.  */
>> +  if (gmode != GINSN_GEN_SCFI)
>> +    return ginsn;
>> +
>> +  opcode = i.tm.base_opcode;
>> +
>> +  switch (opcode)
>> +    {
>> +    case 0x1:
>> +      /* add reg, reg/mem.  */
>> +    case 0x29:
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
> 
> What about the new APX NDD encodings in EVEX map 4?
> 

I need to disallow them until I have proper handling for them.

I don't know why I didn't include the is_apx_rex2_encoding () and 
is_apx_evex_encoding () checks to bar them in V4.  I will add those 
checks in x86_ginsn_new to disallow those encodings with SCFI for now.

>> +      /* sub reg, reg/mem.  */
> 
> Please be careful with placing such comments when there are multiple
> case labels (or fall-through). I think these would better go on the
> same lines as the case labels themselves.
> 
>> +      ginsn = x86_ginsn_addsub_reg_mem (insn_end_sym);
>> +      break;
>> +
>> +    case 0x3:
>> +      /* add reg/mem, reg.  */
>> +    case 0x2b:
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      /* sub reg/mem, reg.  */
>> +      ginsn = x86_ginsn_addsub_mem_reg (insn_end_sym);
>> +      break;
>> +
>> +    case 0xa0:
>> +    case 0xa8:
>> +      /* push fs / push gs have opcode_space == SPACE_0F.  */
>> +      if (i.tm.opcode_space != SPACE_0F)
>> +	break;
>> +      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>> +	stack_opnd_size = 2;
> 
> But only if there's not also REX.W / REX2.W.
> 

This needs to be done for both push and pop then.  I am not sure about 
the REX2.W check though.

Something like
!is_apx_rex2_encoding () && !(i.prefix[REX_PREFIX] & REX_W)
covers it ?
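
Leaving REX2 aside, the rule being discussed can be written down as a
small standalone sketch.  The function is hypothetical and models only
the legacy 66H prefix and REX.W in 64-bit mode, under the assumption
that REX.W takes precedence over 66H.

```c
#include <stdbool.h>

/* Number of bytes by which a push / pop adjusts the stack pointer in
   64-bit mode.  HAS_66H: an operand size prefix is present.  REX_W:
   REX.W is set, which is assumed to override the 66H prefix.  */
unsigned int
toy_stack_opnd_size (bool has_66h, bool rex_w)
{
  if (has_66h && !rex_w)
    return 2;
  return 8;
}
```

The same helper would then be used for both the push and the pop
handlers, instead of repeating the prefix check in each case.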

>> +      /* push fs / push gs.  */
>> +      ginsn = ginsn_new_sub (insn_end_sym, false,
>> +			     GINSN_SRC_REG, REG_SP, 0,
>> +			     GINSN_SRC_IMM, 0, stack_opnd_size,
>> +			     GINSN_DST_REG, REG_SP, 0);
>> +      ginsn_set_where (ginsn);
>> +      ginsn_next = ginsn_new_store (insn_end_sym, false,
>> +				    GINSN_SRC_REG, dw2_regnum,
>> +				    GINSN_DST_INDIRECT, REG_SP, 0);
>> +      ginsn_set_where (ginsn_next);
>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>> +      break;
>> +
>> +    case 0xa1:
>> +    case 0xa9:
>> +      /* pop fs / pop gs have opcode_space == SPACE_0F.  */
>> +      if (i.tm.opcode_space != SPACE_0F)
>> +	break;
>> +      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>> +	stack_opnd_size = 2;
>> +      /* pop fs / pop gs.  */
>> +      ginsn = ginsn_new_load (insn_end_sym, false,
>> +			      GINSN_SRC_INDIRECT, REG_SP, 0,
>> +			      GINSN_DST_REG, dw2_regnum);
>> +      ginsn_set_where (ginsn);
>> +      ginsn_next = ginsn_new_add (insn_end_sym, false,
>> +				  GINSN_SRC_REG, REG_SP, 0,
>> +				  GINSN_SRC_IMM, 0, stack_opnd_size,
>> +				  GINSN_DST_REG, REG_SP, 0);
>> +      ginsn_set_where (ginsn_next);
>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>> +      break;
>> +
>> +    case 0x50 ... 0x57:
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      /* push reg.  */
>> +      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>> +	stack_opnd_size = 2;
>> +      ginsn = ginsn_new_sub (insn_end_sym, false,
>> +			     GINSN_SRC_REG, REG_SP, 0,
>> +			     GINSN_SRC_IMM, 0, stack_opnd_size,
>> +			     GINSN_DST_REG, REG_SP, 0);
>> +      ginsn_set_where (ginsn);
>> +      ginsn_next = ginsn_new_store (insn_end_sym, false,
>> +				    GINSN_SRC_REG, dw2_regnum,
>> +				    GINSN_DST_INDIRECT, REG_SP, 0);
>> +      ginsn_set_where (ginsn_next);
>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>> +      break;
>> +
>> +    case 0x58 ... 0x5f:
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      /* pop reg.  */
>> +      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>> +      ginsn = ginsn_new_load (insn_end_sym, false,
>> +			      GINSN_SRC_INDIRECT, REG_SP, 0,
>> +			      GINSN_DST_REG, dw2_regnum);
>> +      ginsn_set_where (ginsn);
>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>> +	stack_opnd_size = 2;
>> +      ginsn_next = ginsn_new_add (insn_end_sym, false,
>> +				  GINSN_SRC_REG, REG_SP, 0,
>> +				  GINSN_SRC_IMM, 0, stack_opnd_size,
>> +				  GINSN_DST_REG, REG_SP, 0);
>> +      ginsn_set_where (ginsn_next);
>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>> +      break;
>> +
>> +    case 0x6a:
>> +    case 0x68:
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      /* push imm8/imm16/imm32.  */
>> +      if (opcode == 0x6a)
>> +	stack_opnd_size = 1;
>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.
>> +	 However, this prefix may only be present when opcode is 0x68.  */
> 
> Why would this be limited to opcode 0x68?
> 

My understanding from the manual is that 0x6a will always be push imm8.
Hence, 66H is expected only with 0x68. Is this incorrect?

>> +      else if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>> +	stack_opnd_size = 2;
>> +      else
>> +	stack_opnd_size = 4;
> 
> In 64-bit mode stack operations are never 32-bit.
> 

Ah right, so I need to use 8 in the fallback case, i.e., stack_opnd_size 
is 8 in case of push imm32.
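
A minimal standalone sketch of the sizing discussed here, not taken from the patch (both helper names are hypothetical). It assumes, going by the ISA manuals, that opcode 0x6a only shortens the encoded immediate: the value is sign-extended to the operand size, so the stack adjustment is the same as for 0x68, and 66H may accompany either opcode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: number of immediate bytes encoded in the
   instruction for "push imm" in 64-bit mode.  */
static unsigned int
push_imm_encoded_bytes (uint8_t opcode, bool prefix_66h)
{
  assert (opcode == 0x6a || opcode == 0x68);
  if (opcode == 0x6a)
    return 1;                   /* imm8, regardless of operand size.  */
  return prefix_66h ? 2 : 4;    /* imm16 with 66H, else imm32.  */
}

/* Hypothetical helper: bytes by which the same instruction moves the
   stack pointer.  The immediate is sign-extended to the operand size,
   which is 16-bit with 66H and 64-bit otherwise; there is no 32-bit
   push in 64-bit mode.  */
static unsigned int
push_imm_stack_bytes (uint8_t opcode, bool prefix_66h)
{
  assert (opcode == 0x6a || opcode == 0x68);
  return prefix_66h ? 2 : 8;
}
```

Keeping the encoded immediate width separate from the stack adjustment avoids conflating "imm8" with a 1-byte push.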

>> +      /* Skip getting the value of imm from machine instruction
>> +	 because this is not important for SCFI.  */
>> +      ginsn = ginsn_new_sub (insn_end_sym, false,
>> +			     GINSN_SRC_REG, REG_SP, 0,
>> +			     GINSN_SRC_IMM, 0, stack_opnd_size,
>> +			     GINSN_DST_REG, REG_SP, 0);
>> +      ginsn_set_where (ginsn);
>> +      ginsn_next = ginsn_new_store (insn_end_sym, false,
>> +				    GINSN_SRC_IMM, 0,
>> +				    GINSN_DST_INDIRECT, REG_SP, 0);
>> +      ginsn_set_where (ginsn_next);
>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>> +      break;
>> +
>> +    case 0x70 ... 0x7f:
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      ginsn = x86_ginsn_jump_cond (insn_end_sym);
>> +      break;
> 
> I think this wants a comment briefly explaining why SPACE_0F opcodes
> 0x8[0-f] don't need handling explicitly. Same for JMP (0xeb) below.
> 

OK.

>> +    case 0x80:
>> +    case 0x81:
>> +    case 0x83:
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      ginsn = x86_ginsn_alu_imm (insn_end_sym);
>> +      break;
>> +
>> +    case 0x8a:
>> +    case 0x8b:
>> +      /* Move reg/mem, reg (8/16/32/64).  */
>> +    case 0x88:
>> +    case 0x89:
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      /* mov reg, reg/mem. (8/16/32/64).  */
>> +      ginsn = x86_ginsn_move (insn_end_sym);
>> +      break;
>> +
>> +    case 0x8d:
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      /* lea disp(%src), %dst */
> 
> disp(%src) doesn't really represent the full set of possibilities.
> Why not use "mem" as you do elsewhere?
> 

I can update this to "lea disp(%base,%index,imm) %dst".  If I use mem 
now, it may confuse me later.

>> +      ginsn = x86_ginsn_lea (insn_end_sym);
>> +      break;
>> +
>> +    case 0x8f:
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      /* pop to mem.  */
> 
> Or to register. While this won't happen today, allowing a means to
> have the programmer request that alternative encoding would surely
> miss to update the code here. Hence this code would better be ready
> to deal with the case right away.
> 
>> +      gas_assert (i.base_reg);
> 
> POP isn't different from other explicit memory accesses: All forms
> are allowed, and hence a base register may not be in use.
> 
> Both remarks also apply to PUSH further down.
> 

OK. I will take a look.

>> +      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>> +      ginsn = ginsn_new_load (insn_end_sym, false,
>> +			      GINSN_SRC_INDIRECT, REG_SP, 0,
>> +			      GINSN_DST_INDIRECT, dw2_regnum);
>> +      ginsn_set_where (ginsn);
>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>> +	stack_opnd_size = 2;
>> +      ginsn_next = ginsn_new_add (insn_end_sym, false,
>> +				  GINSN_SRC_REG, REG_SP, 0,
>> +				  GINSN_SRC_IMM, 0, stack_opnd_size,
>> +				  GINSN_DST_REG, REG_SP, 0);
>> +      ginsn_set_where (ginsn_next);
>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>> +      break;
>> +
>> +    case 0x9c:
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      /* pushf / pushfq.  */
>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>> +	stack_opnd_size = 2;
>> +      ginsn = ginsn_new_sub (insn_end_sym, false,
>> +			     GINSN_SRC_REG, REG_SP, 0,
>> +			     GINSN_SRC_IMM, 0, stack_opnd_size,
>> +			     GINSN_DST_REG, REG_SP, 0);
>> +      ginsn_set_where (ginsn);
>> +      /* Tracking EFLAGS register by number is not necessary.  */
> 
> How does this fit with ...
> 
>> +      ginsn_next = ginsn_new_store (insn_end_sym, false,
>> +				    GINSN_SRC_IMM, 0,
>> +				    GINSN_DST_INDIRECT, REG_SP, 0);
>> +      ginsn_set_where (ginsn_next);
>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>> +      break;
>> +
>> +    case 0x9d:
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      /* popf / popfq.  */
>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>> +	stack_opnd_size = 2;
>> +      /* FIXME - hardcode the actual DWARF reg number value.  As for SCFI
>> +	 correctness, although this behaves simply a placeholder value; its
>> +	 just clearer if the value is correct.  */
>> +      dw2_regnum = 49;
> 
> ... this?
> 

Inconsistency on my part. I have fixed the above case to use the actual 
DWARF reg number like done here.

>> +      ginsn = ginsn_new_load (insn_end_sym, false,
>> +			      GINSN_SRC_INDIRECT, REG_SP, 0,
>> +			      GINSN_DST_REG, dw2_regnum);
>> +      ginsn_set_where (ginsn);
>> +      ginsn_next = ginsn_new_add (insn_end_sym, false,
>> +				  GINSN_SRC_REG, REG_SP, 0,
>> +				  GINSN_SRC_IMM, 0, stack_opnd_size,
>> +				  GINSN_DST_REG, REG_SP, 0);
>> +      ginsn_set_where (ginsn_next);
>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>> +      break;
>> +
>> +    case 0xff:
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      /* push from mem.  */
>> +      if (i.tm.extension_opcode == 6)
>> +	{
>> +	  /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>> +	  if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>> +	    stack_opnd_size = 2;
>> +	  ginsn = ginsn_new_sub (insn_end_sym, false,
>> +				 GINSN_SRC_REG, REG_SP, 0,
>> +				 GINSN_SRC_IMM, 0, stack_opnd_size,
>> +				 GINSN_DST_REG, REG_SP, 0);
>> +	  ginsn_set_where (ginsn);
>> +	  /* These instructions have no imm, only indirect access.  */
>> +	  gas_assert (i.base_reg);
>> +	  dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>> +	  ginsn_next = ginsn_new_store (insn_end_sym, false,
>> +					GINSN_SRC_INDIRECT, dw2_regnum,
>> +					GINSN_DST_INDIRECT, REG_SP, 0);
>> +	  ginsn_set_where (ginsn_next);
>> +	  gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>> +	}
>> +      else if (i.tm.extension_opcode == 4)
>> +	{
>> +	  /* jmp r/m.  E.g., notrack jmp *%rax.  */
>> +	  if (i.reg_operands)
>> +	    {
>> +	      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>> +	      ginsn_set_where (ginsn);
>> +	    }
>> +	  else if (i.mem_operands && i.index_reg)
>> +	    {
>> +	      /* jmp    *0x0(,%rax,8).  */
>> +	      dw2_regnum = ginsn_dw2_regnum (i.index_reg);
>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>> +	      ginsn_set_where (ginsn);
> 
> What if both base and index are in use? Like for PUSH/POP, all addressing
> forms are permitted here and ...
> 
>> +	    }
>> +	  else if (i.mem_operands && i.base_reg)
>> +	    {
>> +	      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>> +	      ginsn_set_where (ginsn);
>> +	    }
>> +	}
>> +      else if (i.tm.extension_opcode == 2)
>> +	{
>> +	  /* 0xFF /2 (call).  */
>> +	  if (i.reg_operands)
>> +	    {
>> +	      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>> +	      ginsn = ginsn_new_call (insn_end_sym, true,
>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>> +	      ginsn_set_where (ginsn);
>> +	    }
>> +	  else if (i.mem_operands && i.base_reg)
>> +	    {
>> +	      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>> +	      ginsn = ginsn_new_call (insn_end_sym, true,
>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>> +	      ginsn_set_where (ginsn);
>> +	    }
> 
> ... here.
> 

For the indirect jump and call instructions, the target destination need
not be known.  Indirect jumps will cause SCFI to error out, as the
control flow cannot be constructed.

Call instructions are assumed to transfer control out of the function.

In other words, more information in these cases will not be of use to SCFI.

>> +	}
>> +      break;
>> +
>> +    case 0xc2:
>> +    case 0xc3:
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      /* Near ret.  */
>> +      ginsn = ginsn_new_return (insn_end_sym, true);
>> +      ginsn_set_where (ginsn);
>> +      break;
> 
> No tracking of the stack pointer adjustment?
> 

No stack unwind information for a function is relevant after the 
function has returned.  So, tracking of stack pointer adjustment by 
return is not necessary.

>> +    case 0xc8:
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      /* enter.  */
>> +      ginsn = x86_ginsn_enter (insn_end_sym);
>> +      break;
>> +
>> +    case 0xc9:
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      /* The 'leave' instruction copies the contents of the RBP register
>> +	 into the RSP register to release all stack space allocated to the
>> +	 procedure.  */
>> +      ginsn = ginsn_new_mov (insn_end_sym, false,
>> +			     GINSN_SRC_REG, REG_FP, 0,
>> +			     GINSN_DST_REG, REG_SP, 0);
>> +      ginsn_set_where (ginsn);
>> +      /* Then it restores the old value of the RBP register from the stack.  */
>> +      ginsn_next = ginsn_new_load (insn_end_sym, false,
>> +				   GINSN_SRC_INDIRECT, REG_SP, 0,
>> +				   GINSN_DST_REG, REG_FP);
>> +      ginsn_set_where (ginsn_next);
>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>> +      ginsn_last = ginsn_new_add (insn_end_sym, false,
>> +				  GINSN_SRC_REG, REG_SP, 0,
>> +				  GINSN_SRC_IMM, 0, 8,
> 
> Same comment as for ENTER wrt operand size.
> 

I have added handling for 16-bit leave in V5.

>> +				  GINSN_DST_REG, REG_SP, 0);
>> +      ginsn_set_where (ginsn_next);
>> +      gas_assert (!ginsn_link_next (ginsn_next, ginsn_last));
>> +      break;
>> +
>> +    case 0xe0 ... 0xe2:
>> +      /* loop / loope / loopne.  */
>> +    case 0xe3:
>> +      /* jecxz / jrcxz.  */
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      ginsn = x86_ginsn_jump_cond (insn_end_sym);
>> +      ginsn_set_where (ginsn);
>> +      break;
>> +
>> +    case 0xe8:
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      /* PS: SCFI machinery does not care about which func is being
>> +	 called.  OK to skip that info.  */
>> +      ginsn = ginsn_new_call (insn_end_sym, true,
>> +			      GINSN_SRC_SYMBOL, 0, NULL);
>> +      ginsn_set_where (ginsn);
>> +      break;
> 
> Again - what about the stack pointer adjustment? Or wait - you only
> care about the local function's view. Just that this will get you
> in trouble for something like
> 
> 	call	1f
> 1:
> 	pop	%rax
> 
> CALL with zero displacement really acts as if "push %rip".
> 

Trouble for SCFI, yes.

Some of these cases may get help once the feature to issue a diagnostic
for an unbalanced stack at the end of a function is implemented.

I'm afraid there is no other respite for the user here otherwise.  I'd
expect this to be a corner case though.  I will keep this in my notes
until I have a resolution to this.
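
To illustrate why the "call 1f; 1: pop %rax" idiom is troublesome, here is a toy model (not gas code; every name in it is hypothetical) that tracks only the net stack pointer change of a short instruction sequence. It assumes 8-byte pushes and pops, and lets the caller choose whether a call to the very next instruction is modelled as "push %rip".

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_op
{
  TOY_PUSH,       /* push %reg: sp -= 8.  */
  TOY_POP,        /* pop %reg: sp += 8.  */
  TOY_CALL_NEXT,  /* call 1f; 1: (the target is the next insn).  */
  TOY_CALL_OTHER  /* call foo: the callee's ret pops the return address.  */
};

/* The sequence from the review comment: call 1f; 1: pop %rax.  */
static const enum toy_op toy_call_then_pop[] = { TOY_CALL_NEXT, TOY_POP };

/* Net stack pointer change, in bytes, after executing OPS[0..N-1].
   If CALL_NEXT_IS_PUSH, TOY_CALL_NEXT is treated like a push.  */
static long
toy_sp_delta (const enum toy_op *ops, size_t n, bool call_next_is_push)
{
  long delta = 0;

  for (size_t k = 0; k < n; k++)
    switch (ops[k])
      {
      case TOY_PUSH:
        delta -= 8;
        break;
      case TOY_POP:
        delta += 8;
        break;
      case TOY_CALL_NEXT:
        if (call_next_is_push)
          delta -= 8;
        break;
      case TOY_CALL_OTHER:
        break;
      }
  return delta;
}
```

With the call modelled as a push the sequence is balanced; with the call treated as leaving the stack pointer untouched, the model ends up 8 bytes off, which is the kind of imbalance an end-of-function diagnostic could flag.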

>> +    case 0xeb:
>> +      /* If opcode_space != SPACE_BASE, this is not a jmp insn.  Skip it
>> +	 for GINSN_GEN_SCFI.  */
>> +      if (i.tm.opcode_space != SPACE_BASE)
>> +	break;
>> +      /* Unconditional jmp.  */
>> +      ginsn = x86_ginsn_jump (insn_end_sym);
>> +      ginsn_set_where (ginsn);
>> +      break;
>> +
>> +    default:
>> +      /* TBD_GINSN_GEN_NOT_SCFI: Skip all other opcodes uninteresting for
>> +	 GINSN_GEN_SCFI mode.  */
>> +      break;
>> +    }
>> +
>> +  if (!ginsn && !x86_ginsn_safe_to_skip ())
>> +    {
>> +      /* For all unhandled insns that are not whitelisted, check that they do
>> +	 not impact SCFI correctness.  */
>> +      err = x86_ginsn_unhandled ();
>> +      switch (err)
>> +	{
>> +	case X86_GINSN_UNHANDLED_NONE:
>> +	  break;
>> +	case X86_GINSN_UNHANDLED_DEST_REG:
>> +	  /* Not all writes to REG_FP are harmful in context of SCFI.  Simply
>> +	     generate a GINSN_TYPE_OTHER with destination set to the
>> +	     appropriate register.  The SCFI machinery will bail out if this
>> +	     ginsn affects SCFI correctness.  */
>> +	  dw2_regnum = ginsn_dw2_regnum (i.op[i.operands - 1].regs);
>> +	  ginsn = ginsn_new_other (insn_end_sym, true,
>> +				   GINSN_SRC_IMM, 0,
>> +				   GINSN_SRC_IMM, 0,
>> +				   GINSN_DST_REG, dw2_regnum);
>> +	  ginsn_set_where (ginsn);
>> +	  break;
>> +	case X86_GINSN_UNHANDLED_CFG:
>> +	  /* Fall through.  */
>> +	case X86_GINSN_UNHANDLED_STACKOP:
> 
> No fall-through comment please between immediately successive case labels.
> 
>> +	  as_bad (_("SCFI: unhandled op 0x%x may cause incorrect CFI"),
>> +		  i.tm.base_opcode);
> 
> As a remark: %#x is a one byte shorter representation with largely the
> same effect (plus, nicely imo, omitting the 0x when the value is zero).
> 

OK to both of the above comments.

>> +	  break;
>> +	default:
>> +	  abort ();
>> +	  break;
>> +	}
>> +    }
>> +
>> +  return ginsn;
>> +}
>> +
>>   /* This is the guts of the machine-dependent assembler.  LINE points to a
>>      machine dependent instruction.  This function is supposed to emit
>>      the frags/bytes it assembles to.  */
>> @@ -5299,6 +6272,7 @@ md_assemble (char *line)
>>     const char *end, *pass1_mnem = NULL;
>>     enum i386_error pass1_err = 0;
>>     const insn_template *t;
>> +  ginsnS *ginsn;
>>     struct last_insn *last_insn
>>       = &seg_info(now_seg)->tc_segment_info_data.last_insn;
>>   
>> @@ -5830,6 +6804,14 @@ md_assemble (char *line)
>>     /* We are ready to output the insn.  */
>>     output_insn (last_insn);
>>   
>> +  /* PS: SCFI is enabled only for AMD64 ABI.  The ABI check has been
>> +     performed in i386_target_format.  */
> 
> See my earlier comment - it's yet more restrictive (as in not covering
> e.g. the Windows ABI, which importantly is also used in EFI).
> 

This is not clear to me; can you please elaborate?

>> +  if (flag_synth_cfi)
>> +    {
>> +      ginsn = x86_ginsn_new (symbol_temp_new_now (), frch_ginsn_gen_mode ());
>> +      frch_ginsn_data_append (ginsn);
>> +    }
>> +
>>     insert_lfence_after ();
>>   
>>     if (i.tm.opcode_modifier.isprefix)
>> @@ -11333,6 +12315,7 @@ s_insn (int dummy ATTRIBUTE_UNUSED)
>>     const char *end;
>>     unsigned int j;
>>     valueT val;
>> +  ginsnS *ginsn;
>>     bool vex = false, xop = false, evex = false;
>>     struct last_insn *last_insn;
>>   
>> @@ -12104,6 +13087,14 @@ s_insn (int dummy ATTRIBUTE_UNUSED)
>>     last_insn->name = ".insn directive";
>>     last_insn->file = as_where (&last_insn->line);
>>   
>> +  /* PS: SCFI is enabled only for AMD64 ABI.  The ABI check has been
>> +     performed in i386_target_format.  */
>> +  if (flag_synth_cfi)
>> +    {
>> +      ginsn = x86_ginsn_new (symbol_temp_new_now (), frch_ginsn_gen_mode ());
> 
> If you really want to use this function here, more cases will need
> handling (perhaps even beyond what I've commented on above). However,
> I'd strongly suggest splitting off the "unhandled" part of that
> function, and using only that here. After all you hardly know what
> exactly the programmer's intentions are. Because of that, you may
> also want to consider simply forbidding use of .insn when SCFI is to
> be generated.
> 

Sorry, not clear to me. I am not sure how simply moving the "unhandled"
part of x86_ginsn_new will work better.  If we don't handle the
instructions, more instructions will potentially trigger the "unhandled"
case.

I could consider forbidding use of .insn as you suggest, for the first cut.

>> +      frch_ginsn_data_append (ginsn);
>> +    }
>> +
>>    done:
>>     *saved_ilp = saved_char;
>>     input_line_pointer = line;
>> @@ -15748,6 +16739,11 @@ i386_target_format (void)
>>     else
>>       as_fatal (_("unknown architecture"));
>>   
>> +#if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
>> +  if (flag_synth_cfi && x86_elf_abi != X86_64_ABI)
>> +    as_fatal (_("Synthesizing CFI is not supported for this ABI"));
>> +#endif
> 
> Elsewhere I've raised the question of whether we should really check
> OBJ_MAYBE_ELF anywhere in this file. However, as long as we do, you'll
> need to accompany that with an IS_ELF check in the if(). If, to
> address my unreachable code comment near the top, you elect to add
> further OBJ_ELF / OBJ_MAYBE_ELF checks, there you then wouldn't need
> that further check (as flag_synth_cfi set then implies ELF).
> 

I took the suggestion to not compile unnecessarily for non-ELF targets.

On using IS_ELF in the if (): flag_synth_cfi does not imply IS_ELF; it
implies OBJ_ELF || OBJ_MAYBE_ELF.  It looks to me, then, that I should
include an IS_ELF check in each of the 3 blocks.

Thanks again
  
Jan Beulich Jan. 9, 2024, 9:30 a.m. UTC | #6
On 08.01.2024 20:33, Indu Bhagat wrote:
> On 1/5/24 05:58, Jan Beulich wrote:
>> On 03.01.2024 08:15, Indu Bhagat wrote:
>>> @@ -2311,6 +2312,13 @@ obj_elf_size (int ignore ATTRIBUTE_UNUSED)
>>>         symbol_get_obj (sym)->size = XNEW (expressionS);
>>>         *symbol_get_obj (sym)->size = exp;
>>>       }
>>> +
>>> +  /* If the symbol in the directive matches the current function being
>>> +     processed, indicate end of the current stream of ginsns.  */
>>> +  if (flag_synth_cfi
>>> +      && S_IS_FUNCTION (sym) && sym == ginsn_data_func_symbol ())
>>> +    ginsn_data_end (symbol_temp_new_now ());
>>> +
>>>     demand_empty_rest_of_line ();
>>>   }
>>>   
>>> @@ -2499,6 +2507,14 @@ obj_elf_type (int ignore ATTRIBUTE_UNUSED)
>>>   	elfsym->symbol.flags &= ~mask;
>>>       }
>>>   
>>> +  if (S_IS_FUNCTION (sym) && flag_synth_cfi)
>>> +    {
>>> +      /* Wrap up processing the previous block of ginsns first.  */
>>> +      if (frchain_now->frch_ginsn_data)
>>> +	ginsn_data_end (symbol_temp_new_now ());
>>> +      ginsn_data_begin (sym);
>>> +    }
>>> +
>>>     demand_empty_rest_of_line ();
>>>   }
>>
>> Documentation about .type and .size use could be more precise. Generally
>> it is entirely benign where exactly these directives live relative to
>> the code they annotate.
> 
> Added a comment for V5.
> 
> As stated in as.texi, usage of .type and .size will be bread and butter 
> for SCFI: "The input asm must begin with the @code{.type} directive, and 
> should ideally be closed off using a @code{.size} directive."

Except that to me "must begin" talks about a source file, not individual
functions. Hence that wording (which I saw) is at best ambiguous, which
led me to ask for it to be "more precise".

>>> +/* Check whether a '66H' prefix accompanies the instruction.
>>
>> With APX 16-bit operand size isn't necessarily represented by a 66h
>> prefix, but perhaps with an "embedded prefix" inside the EVEX one.
>> Therefore both the comment and even more so ...
>>
>>> +   The current users of this API are in the handlers for PUSH, POP
>>> +   instructions.  These instructions affect the stack pointer implicitly:  the
>>> +   operand size (16, 32, or 64 bits) determines the amount by which the stack
>>> +   pointer is decremented (2, 4 or 8).  When '66H' prefix is present, the
>>> +   instruction has a 16-bit operand.  */
>>> +
>>> +static bool
>>> +ginsn_prefix_66H_p (i386_insn insn)
>>
>> ... the function name would better not allude to just the legacy
>> encoding. Maybe ginsn_opsize_prefix_p()?
>>
> 
> Isn't 66H_p more readable and easier to follow because that's what the 
> function is currently checking?  If more scenarios were being handled, 
> ginsn_opsize_prefix_p () would fit better.

Well, as said - with APX you can't get away with just 0x66 prefix checking.
That prefix is simply illegal to use with EVEX-encoded insns.
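
As a rough illustration of an encoding-neutral predicate, here is a toy sketch. The struct, enum and function names are hypothetical and do not correspond to gas data structures. It assumes that for EVEX-encoded insns the operand-size override is carried in the 2-bit EVEX.pp field, with the value 1 standing in for a 66H prefix.

```c
#include <stdbool.h>
#include <stdint.h>

enum toy_encoding { TOY_ENC_LEGACY, TOY_ENC_EVEX };

struct toy_insn
{
  enum toy_encoding enc;
  bool prefix_66h;  /* A legacy 0x66 prefix byte is present.  */
  uint8_t evex_pp;  /* EVEX.pp field; 1 is the embedded 66H.  */
};

/* Return true if INSN carries an operand-size override, whether as a
   legacy prefix byte or embedded in the EVEX prefix.  */
static bool
toy_opsize_prefix_p (const struct toy_insn *insn)
{
  if (insn->enc == TOY_ENC_EVEX)
    return (insn->evex_pp & 3) == 1;
  return insn->prefix_66h;
}
```

Callers then ask "is there an operand-size override?" rather than "is there a 0x66 byte?", which is what the suggested rename is after.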

>>> +/* Get the DWARF register number for the given register entry.
>>> +   For specific byte/word register accesses like al, cl, ah, ch, r8dyte etc.,
>>
>> What's "r8dyte"? I expect it's a typo, but I can't derive what was
>> intended to be written.
> 
> Typo it is.  I meant to write r8d. I have updated this to " like al, cl, 
> ah, ch, r8d, r20w etc."

Then perhaps further s;byte/word;byte/word/dword; ?

>>> +static unsigned int
>>> +ginsn_dw2_regnum (const reg_entry *ireg)
>>> +{
>>> +  /* PS: Note the data type here as int32_t, because of Dw2Inval (-1).  */
>>> +  int32_t dwarf_reg = Dw2Inval;
>>> +  const reg_entry *temp = ireg;
>>> +
>>> +  /* ginsn creation is available for AMD64 abi only ATM.  Other flag_code
>>> +     are not expected.  */
>>> +  gas_assert (flag_code == CODE_64BIT);
>>
>> With this assertion it is kind of odd to see a further use of flag_code
>> below.
> 
> I think you are referring to the checks like:
> 
>    /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>    if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>      stack_opnd_size = 2;
> 
> Although the check on flag_code is redundant now, I chose to have them 
> here to keep it aligned to how the prefix is meant to be used.

Yet the same is true in 32-bit mode (minus, of course, the REX.W aspect
mentioned elsewhere, but that could be ignored here as 32-bit code has
no way of setting that flag). IOW - either you drop the use of flag_code,
or you make it actually correct (as in: not misleading).

>>> +  /* op %reg, symbol.  */
>>> +  if (i.mem_operands == 1 && !i.base_reg && !i.index_reg)
>>> +    return ginsn;
>>
>> Why does this need special treatment, and why is returning NULL here
>> okay?
> 
> Instructions like "addq    %rax, symbol" are uninteresting for 
> SCFI.  One piece of feedback in a previous iteration was to "consider not 
> generating ginsns for cases that are known to be uninteresting".

But then why not simply

  if (i.mem_operands)
    return ginsn;

? What memory is added to doesn't matter for SCFI, does it? Aiui you
only really want to notice adds to registers.

>>> +static ginsnS *
>>> +x86_ginsn_addsub_mem_reg (const symbolS *insn_end_sym)
>>> +{
>>> +  unsigned int dw2_regnum;
>>> +  unsigned int src2_dw2_regnum;
>>> +  const reg_entry *mem_reg;
>>> +  int32_t gdisp = 0;
>>> +  ginsnS *ginsn = NULL;
>>> +  ginsnS * (*ginsn_func) (const symbolS *, bool,
>>> +			  enum ginsn_src_type, unsigned int, offsetT,
>>> +			  enum ginsn_src_type, unsigned int, offsetT,
>>> +			  enum ginsn_dst_type, unsigned int, offsetT);
>>> +  uint16_t opcode = i.tm.base_opcode;
>>> +
>>> +  gas_assert (opcode == 0x3 || opcode == 0x2b);
>>> +  ginsn_func = (opcode == 0x3) ? ginsn_new_add : ginsn_new_sub;
>>> +
>>> +  /* op symbol, %reg.  */
>>> +  if (i.mem_operands && !i.base_reg && !i.index_reg)
>>> +    return ginsn;
>>> +  /* op mem, %reg.  */
>>
>> /* op reg/mem, reg.  */ you mean? Which then raises the question ...
>>
> 
> Yes (updated the comment for V5).
> 
>>> +  dw2_regnum = ginsn_dw2_regnum (i.op[1].regs);
>>> +
>>> +  if (i.mem_operands)
>>> +    {
>>> +      mem_reg = (i.base_reg) ? i.base_reg : i.index_reg;
>>> +      src2_dw2_regnum = ginsn_dw2_regnum (mem_reg);
>>> +      if (i.disp_operands == 1)
>>> +	gdisp = i.op[0].disps->X_add_number;
>>> +      ginsn = ginsn_func (insn_end_sym, true,
>>> +			  GINSN_SRC_INDIRECT, src2_dw2_regnum, gdisp,
>>> +			  GINSN_SRC_REG, dw2_regnum, 0,
>>> +			  GINSN_DST_REG, dw2_regnum, 0);
>>> +      ginsn_set_where (ginsn);
>>> +    }
>>> +
>>> +  return ginsn;
>>> +}
>>
>> ... why a register source isn't dealt with here.
> 
> I saw that the opcode used when the source is reg is either 0x1 or 0x29, 
> hence concluded that the handling in x86_ginsn_addsub_reg_mem should 
> suffice.  Perhaps this is a GAS implementation artifact, and I should 
> have handling for register source here in x86_ginsn_addsub_mem_reg (for 
> opcodes 0x3 and 0x2b) ?

I think you should, yes. Try using the {load} / {store} pseudo-prefixes,
and I think you'll see these opcodes used.
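
To make the two encodings concrete, here is a small standalone sketch (the function name is hypothetical) that builds the bytes for a 64-bit register-to-register ADD both ways. It covers only the low eight registers, so no REX.R/REX.B bits are needed, and packs the three bytes into one integer for easy comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ModRM register numbers for a few legacy registers.  */
enum { TOY_RAX = 0, TOY_RCX = 1, TOY_RDX = 2, TOY_RBX = 3 };

/* Encode "add %src, %dst" (AT&T operand order) as REX.W, opcode and
   ModRM, returned as 0xRRPPMM.  LOAD_FORM selects opcode 0x03
   (ADD r64, r/m64); otherwise opcode 0x01 (ADD r/m64, r64) is used.  */
static uint32_t
toy_encode_add_rr (unsigned int src, unsigned int dst, bool load_form)
{
  uint32_t opcode, modrm;

  assert (src < 8 && dst < 8);
  if (load_form)
    {
      /* Destination in ModRM.reg, source in ModRM.rm.  */
      opcode = 0x03;
      modrm = 0xc0 | (dst << 3) | src;
    }
  else
    {
      /* Source in ModRM.reg, destination in ModRM.rm.  */
      opcode = 0x01;
      modrm = 0xc0 | (src << 3) | dst;
    }
  return (0x48u << 16) | (opcode << 8) | modrm;
}
```

For "add %rax, %rbx" this gives 48 01 c3 in the store form and 48 03 d8 in the load form: same operation, different base opcode, hence the need to handle a register source under 0x3 and 0x2b as well.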

>>> +static ginsnS *
>>> +x86_ginsn_move (const symbolS *insn_end_sym)
>>> +{
>>> +  ginsnS *ginsn;
>>> +  unsigned int dst_reg;
>>> +  unsigned int src_reg;
>>> +  offsetT src_disp = 0;
>>> +  offsetT dst_disp = 0;
>>> +  const reg_entry *dst = NULL;
>>> +  const reg_entry *src = NULL;
>>> +  uint16_t opcode = i.tm.base_opcode;
>>> +  enum ginsn_src_type src_type = GINSN_SRC_REG;
>>> +  enum ginsn_dst_type dst_type = GINSN_DST_REG;
>>> +
>>> +  if (opcode == 0x8b || opcode == 0x8a)
>>
>> Above when handling ALU insns you didn't care about byte ops - why do
>> you do so here (by checking for 0x8a, and 0x88 below)?
> 
> Because there may be weird code that saves and restores 8-byte registers 
> on stack.  For ALU insns, if the destination is REG_SP/REG_FP, we will 
> detect them in the unhandled track.

You talk about 8-byte registers when I asked about byte reg moves. If
there's a MOV targeting %spl or %bpl, is that really any different from,
say, an ADD doing so?

>>> +    }
>>> +
>>> +  ginsn_set_where (ginsn);
>>> +
>>> +  return ginsn;
>>> +}
>>
>> Throughout this function (and perhaps others as well), how come you
>> don't consider operand size at all? It matters whether results are
>> 64-bit, 32-bit, or 16-bit, after all.
> 
> Operation size matters: No, not for all instructions in context of SCFI. 
> For instructions using stack (save / restore), the size is important. 
> But for other instructions, operation size will not affect SCFI correctness.

But aren't you wrongly treating an update of %rbp and an update of,
say, %bp the same then? The latter should end up as untraceable, aiui.
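
One possible policy along these lines, as a toy sketch only (the names are hypothetical and nothing here comes from the patch): treat a write to REG_SP or REG_FP as trackable only when it is a full 64-bit write. Narrower writes either leave the upper bits unchanged (8-bit and 16-bit) or zero them (32-bit), so the resulting register value cannot be described by the same ginsn as the 64-bit form.

```c
enum toy_reg_write { TOY_WRITE_TRACKABLE, TOY_WRITE_UNTRACEABLE };

/* Classify a write of OPND_BYTES bytes to the stack or frame pointer.
   Only a full-width (8-byte) write is considered trackable here.  */
static enum toy_reg_write
toy_classify_sp_fp_write (unsigned int opnd_bytes)
{
  return opnd_bytes == 8 ? TOY_WRITE_TRACKABLE : TOY_WRITE_UNTRACEABLE;
}
```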

>>> +static int
>>> +x86_ginsn_unhandled (void)
>>> +{
>>> +  int err = X86_GINSN_UNHANDLED_NONE;
>>> +  const reg_entry *reg_op;
>>> +  unsigned int dw2_regnum;
>>> +
>>> +  /* Keep an eye out for instructions affecting control flow.  */
>>> +  if (i.tm.opcode_modifier.jump)
>>> +    err = X86_GINSN_UNHANDLED_CFG;
>>> +  /* Also, for any instructions involving an implicit update to the stack
>>> +     pointer.  */
>>> +  else if (i.tm.opcode_modifier.implicitstackop)
>>> +    err = X86_GINSN_UNHANDLED_STACKOP;
>>> +  /* Finally, also check if the missed instructions are affecting REG_SP or
>>> +     REG_FP.  The destination operand is the last at all stages of assembly
>>> +     (due to following AT&T syntax layout in the internal representation).  In
>>> +     case of Intel syntax input, this still remains true as swap_operands ()
>>> +     is done by now.
>>> +     PS: These checks do not involve index / base reg, as indirect memory
>>> +     accesses via REG_SP or REG_FP do not affect SCFI correctness.
>>> +     (Also note these instructions are candidates for other ginsn generation
>>> +     modes in future.  TBD_GINSN_GEN_NOT_SCFI.)  */
>>> +  else if (i.operands && i.reg_operands
>>> +	   && !(i.flags[i.operands - 1] & Operand_Mem))
>>> +    {
>>> +      reg_op = i.op[i.operands - 1].regs;
>>> +      if (reg_op)
>>> +	{
>>> +	  dw2_regnum = ginsn_dw2_regnum (reg_op);
>>> +	  if (dw2_regnum == REG_SP || dw2_regnum == REG_FP)
>>> +	    err = X86_GINSN_UNHANDLED_DEST_REG;
>>> +	}
>>
>> else
>>    err = X86_GINSN_UNHANDLED_CONFUSED;
>>
>> ? You can't let this case go silently. An alternative would be to
>> assert instead of using if().
> 
> No, the other cases are not important for SCFI correctness.  The case of 
> potential register save/restore operation cannot be detected 
> automatically.  Keeping an eye on ISA additions will be the only way for 
> that category.

That wasn't my point. reg_op turning out to be NULL is bogus considering
the earlier checks. Hence why an assertion may make sense, and hence why
otherwise I suggested an error indicator towards "internal error".

>>> +/* Generate one or more generic GAS instructions, a.k.a, ginsns for the current
>>> +   machine instruction.
>>> +
>>> +   Returns the head of linked list of ginsn(s) added, if success; Returns NULL
>>> +   if failure.
>>> +
>>> +   The input ginsn_gen_mode GMODE determines the set of minimal necessary
>>> +   ginsns necessary for correctness of any passes applicable for that mode.
>>> +   For supporting the GINSN_GEN_SCFI generation mode, following is the list of
>>> +   machine instructions that must be translated into the corresponding ginsns
>>> +   to ensure correctness of SCFI:
>>> +     - All instructions affecting the two registers that could potentially
>>> +       be used as the base register for CFA tracking.  For SCFI, the base
>>> +       register for CFA tracking is limited to REG_SP and REG_FP only for
>>> +       now.
>>> +     - All change of flow instructions: conditional and unconditional branches,
>>> +       call and return from functions.
>>> +     - All instructions that can potentially be a register save / restore
>>> +       operation.
>>
>> This could do with being more fine grained, as "potentially" is pretty vague,
>> and (as per earlier version review comments) my take on this is a much wider
>> set than yours.
> 
> I would like to understand more on this comment, especially the "my take 
> on this is a much wider set than yours".  I see it's being hinted at in 
> different flavors in the current review.
> 
> I see some issues pointed out in this review (addressing modes of mov 
> etc, safe to skip opcodes for TEST, CMP) etc., but it seems that your 
> concerns are wider than this.

In an earlier version review I mentioned that even vector or mask registers
could in principle be used to hold preserved GPR values. I seem to recall
that you said you wouldn't want to deal with such. Hence my use of
"wider set": Just to give an example, "kmovq %rbp, %k0" plus later
"kmovq %k0, %rbp" is a pair of "instructions that can potentially be a
register save / restore operation".

>>> +    case 0xa0:
>>> +    case 0xa8:
>>> +      /* push fs / push gs have opcode_space == SPACE_0F.  */
>>> +      if (i.tm.opcode_space != SPACE_0F)
>>> +	break;
>>> +      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>>> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>>> +	stack_opnd_size = 2;
>>
>> But only if there's not also REX.W / REX2.W.
> 
> This needs to be done for both push and pop then.  I am not sure about 
> the REX2.W check though.

It's not just PUSH/POP, but all stack operations.

> Something like
> !is_apx_rex2_encoding () && !(i.prefix[REX_PREFIX] & REX_W)
> covers it ?

I don't see why you'd have is_apx_rex2_encoding() here. I don't recall
where/when exactly you invoke your code. It may therefore be either, as
you have it, i.prefix[REX_PREFIX], or (before establish_rex()) it could
(additionally) be i.rex.
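
The precedence rule can be sketched in isolation as follows. This is a toy helper with hypothetical names; it takes the raw REX prefix byte (0 when absent) rather than any gas state, and assumes 64-bit mode, where REX.W overrides 66H.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_REX_W 0x08  /* W bit within a REX prefix byte (0x40 | WRXB).  */

/* Bytes by which a 64-bit mode stack operation adjusts the stack
   pointer.  66H requests a 16-bit operand, but only when REX.W is not
   set as well; in every other case the operand is 64-bit.  */
static unsigned int
toy_stack_op_size (bool prefix_66h, uint8_t rex)
{
  if (prefix_66h && !(rex & TOY_REX_W))
    return 2;
  return 8;
}
```

The same check would apply to every implicit stack operation, not just PUSH and POP.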

>>> +      /* push fs / push gs.  */
>>> +      ginsn = ginsn_new_sub (insn_end_sym, false,
>>> +			     GINSN_SRC_REG, REG_SP, 0,
>>> +			     GINSN_SRC_IMM, 0, stack_opnd_size,
>>> +			     GINSN_DST_REG, REG_SP, 0);
>>> +      ginsn_set_where (ginsn);
>>> +      ginsn_next = ginsn_new_store (insn_end_sym, false,
>>> +				    GINSN_SRC_REG, dw2_regnum,
>>> +				    GINSN_DST_INDIRECT, REG_SP, 0);
>>> +      ginsn_set_where (ginsn_next);
>>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>> +      break;
>>> +
>>> +    case 0xa1:
>>> +    case 0xa9:
>>> +      /* pop fs / pop gs have opcode_space == SPACE_0F.  */
>>> +      if (i.tm.opcode_space != SPACE_0F)
>>> +	break;
>>> +      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>>> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>>> +	stack_opnd_size = 2;
>>> +      /* pop fs / pop gs.  */
>>> +      ginsn = ginsn_new_load (insn_end_sym, false,
>>> +			      GINSN_SRC_INDIRECT, REG_SP, 0,
>>> +			      GINSN_DST_REG, dw2_regnum);
>>> +      ginsn_set_where (ginsn);
>>> +      ginsn_next = ginsn_new_add (insn_end_sym, false,
>>> +				  GINSN_SRC_REG, REG_SP, 0,
>>> +				  GINSN_SRC_IMM, 0, stack_opnd_size,
>>> +				  GINSN_DST_REG, REG_SP, 0);
>>> +      ginsn_set_where (ginsn_next);
>>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>> +      break;
>>> +
>>> +    case 0x50 ... 0x57:
>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>> +	break;
>>> +      /* push reg.  */
>>> +      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>>> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>>> +	stack_opnd_size = 2;
>>> +      ginsn = ginsn_new_sub (insn_end_sym, false,
>>> +			     GINSN_SRC_REG, REG_SP, 0,
>>> +			     GINSN_SRC_IMM, 0, stack_opnd_size,
>>> +			     GINSN_DST_REG, REG_SP, 0);
>>> +      ginsn_set_where (ginsn);
>>> +      ginsn_next = ginsn_new_store (insn_end_sym, false,
>>> +				    GINSN_SRC_REG, dw2_regnum,
>>> +				    GINSN_DST_INDIRECT, REG_SP, 0);
>>> +      ginsn_set_where (ginsn_next);
>>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>> +      break;
>>> +
>>> +    case 0x58 ... 0x5f:
>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>> +	break;
>>> +      /* pop reg.  */
>>> +      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>> +      ginsn = ginsn_new_load (insn_end_sym, false,
>>> +			      GINSN_SRC_INDIRECT, REG_SP, 0,
>>> +			      GINSN_DST_REG, dw2_regnum);
>>> +      ginsn_set_where (ginsn);
>>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>>> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>>> +	stack_opnd_size = 2;
>>> +      ginsn_next = ginsn_new_add (insn_end_sym, false,
>>> +				  GINSN_SRC_REG, REG_SP, 0,
>>> +				  GINSN_SRC_IMM, 0, stack_opnd_size,
>>> +				  GINSN_DST_REG, REG_SP, 0);
>>> +      ginsn_set_where (ginsn_next);
>>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>> +      break;
>>> +
>>> +    case 0x6a:
>>> +    case 0x68:
>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>> +	break;
>>> +      /* push imm8/imm16/imm32.  */
>>> +      if (opcode == 0x6a)
>>> +	stack_opnd_size = 1;
>>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.
>>> +	 However, this prefix may only be present when opcode is 0x68.  */
>>
>> Why would this be limited to opcode 0x68?
> 
> My understanding from the manual is that 0x6a will always be push imm8. 
> Hence, 66H is expected only with 0x68. Isnt this incorrect ?

The opcode byte determines merely the size of the immediate field.
Operand size determines to which size the signed 8-bit field would be
extended and then pushed onto the stack. There's never a single byte
that would be pushed.

>>> +      else if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>>> +	stack_opnd_size = 2;
>>> +      else
>>> +	stack_opnd_size = 4;
>>
>> In 64-bit mode stack operations are never 32-bit.
> 
> Ah right, so I need to use 8 in the fallback case, i.e., stack_opnd_size 
> is 8 in case of push imm32.

Correct.
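
To summarise the two points above for 64-bit mode, here is a minimal sketch. The helper names are hypothetical and are not part of the patch; they only illustrate that the opcode byte selects the width of the immediate field, while the prefixes select how many bytes are actually pushed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of the patch: number of bytes by which a
   PUSH adjusts the stack pointer in 64-bit mode.  */
static unsigned int
push_stack_opnd_size_64 (bool has_66h_prefix, bool has_rex_w)
{
  /* The 66H prefix selects a 16-bit operand only when REX.W is absent.  */
  if (has_66h_prefix && !has_rex_w)
    return 2;
  /* Otherwise the stack operation is 64-bit; it is never 32-bit.  */
  return 8;
}

/* Hypothetical helper: width in bytes of the immediate field of PUSH imm.
   The opcode determines only this encoding width, not the pushed size.  */
static unsigned int
push_imm_field_size (unsigned char opcode, unsigned int stack_opnd_size)
{
  assert (opcode == 0x6a || opcode == 0x68);
  if (opcode == 0x6a)
    return 1;
  return stack_opnd_size == 2 ? 2 : 4;
}
```

With this split, opcode 0x6a without prefixes has a 1-byte immediate field but still moves the stack pointer by 8.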

>>> +      ginsn = ginsn_new_load (insn_end_sym, false,
>>> +			      GINSN_SRC_INDIRECT, REG_SP, 0,
>>> +			      GINSN_DST_REG, dw2_regnum);
>>> +      ginsn_set_where (ginsn);
>>> +      ginsn_next = ginsn_new_add (insn_end_sym, false,
>>> +				  GINSN_SRC_REG, REG_SP, 0,
>>> +				  GINSN_SRC_IMM, 0, stack_opnd_size,
>>> +				  GINSN_DST_REG, REG_SP, 0);
>>> +      ginsn_set_where (ginsn_next);
>>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>> +      break;
>>> +
>>> +    case 0xff:
>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>> +	break;
>>> +      /* push from mem.  */
>>> +      if (i.tm.extension_opcode == 6)
>>> +	{
>>> +	  /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>>> +	  if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>>> +	    stack_opnd_size = 2;
>>> +	  ginsn = ginsn_new_sub (insn_end_sym, false,
>>> +				 GINSN_SRC_REG, REG_SP, 0,
>>> +				 GINSN_SRC_IMM, 0, stack_opnd_size,
>>> +				 GINSN_DST_REG, REG_SP, 0);
>>> +	  ginsn_set_where (ginsn);
>>> +	  /* These instructions have no imm, only indirect access.  */
>>> +	  gas_assert (i.base_reg);
>>> +	  dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>>> +	  ginsn_next = ginsn_new_store (insn_end_sym, false,
>>> +					GINSN_SRC_INDIRECT, dw2_regnum,
>>> +					GINSN_DST_INDIRECT, REG_SP, 0);
>>> +	  ginsn_set_where (ginsn_next);
>>> +	  gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>> +	}
>>> +      else if (i.tm.extension_opcode == 4)
>>> +	{
>>> +	  /* jmp r/m.  E.g., notrack jmp *%rax.  */
>>> +	  if (i.reg_operands)
>>> +	    {
>>> +	      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>> +	      ginsn_set_where (ginsn);
>>> +	    }
>>> +	  else if (i.mem_operands && i.index_reg)
>>> +	    {
>>> +	      /* jmp    *0x0(,%rax,8).  */
>>> +	      dw2_regnum = ginsn_dw2_regnum (i.index_reg);
>>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>> +	      ginsn_set_where (ginsn);
>>
>> What if both base and index are in use? Like for PUSH/POP, all addressing
>> forms are permitted here and ...
>>
>>> +	    }
>>> +	  else if (i.mem_operands && i.base_reg)
>>> +	    {
>>> +	      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>> +	      ginsn_set_where (ginsn);
>>> +	    }
>>> +	}
>>> +      else if (i.tm.extension_opcode == 2)
>>> +	{
>>> +	  /* 0xFF /2 (call).  */
>>> +	  if (i.reg_operands)
>>> +	    {
>>> +	      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>> +	      ginsn = ginsn_new_call (insn_end_sym, true,
>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>> +	      ginsn_set_where (ginsn);
>>> +	    }
>>> +	  else if (i.mem_operands && i.base_reg)
>>> +	    {
>>> +	      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>>> +	      ginsn = ginsn_new_call (insn_end_sym, true,
>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>> +	      ginsn_set_where (ginsn);
>>> +	    }
>>
>> ... here.
> 
> For the indirect jump and call instructions, the target destination
> need not be known.  Indirect jumps will cause SCFI to error out
> as control flow cannot be constructed.
> 
> Call instructions are assumed to transfer control out of function.
> 
> In other words, more information in these cases will not be of use to SCFI.

But then aren't you already doing too much work here? With what you say,
you don't care about the kind of operand at all, merely the fact it's a
CALL might be interesting. Albeit even there I'd then wonder why - if
the function called isn't of interest, how is CALL different from just
NOP? The only CALL you'd need to be concerned about would be the direct
form with a displacement of 0, as indicated elsewhere. But of course
tricky code might also use variants of that; see e.g. the retpoline code
that was introduced into kernels a couple of years back. Recognizing
and bailing on such tricky code may be desirable, if not necessary.

>>> +	}
>>> +      break;
>>> +
>>> +    case 0xc2:
>>> +    case 0xc3:
>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>> +	break;
>>> +      /* Near ret.  */
>>> +      ginsn = ginsn_new_return (insn_end_sym, true);
>>> +      ginsn_set_where (ginsn);
>>> +      break;
>>
>> No tracking of the stack pointer adjustment?
> 
> No stack unwind information for a function is relevant after the 
> function has returned.  So, tracking of stack pointer adjustment by 
> return is not necessary.

What information does the "return" insn then carry, beyond it being
an unconditional branch (which you have a different insn for)?

>>> @@ -5830,6 +6804,14 @@ md_assemble (char *line)
>>>     /* We are ready to output the insn.  */
>>>     output_insn (last_insn);
>>>   
>>> +  /* PS: SCFI is enabled only for AMD64 ABI.  The ABI check has been
>>> +     performed in i386_target_format.  */
>>
>> See my earlier comment - it's yet more restrictive (as in not covering
>> e.g. the Windows ABI, which importantly is also used in EFI).
> 
> Not clear, can you please elaborate ?

Hmm, it's not clear to me what's not clear in my earlier comment. The
set of preserved registers, for example, differs between the SysV and
the Windows ABIs (see x86_scfi_callee_saved_p()). Then again, thinking
of it, talking of anything ABI-ish in assembly code is questionable.
An assembly function calling another assembly function may use an
entirely custom "ABI". You just cannot guess upon preserved registers.

>>> @@ -12104,6 +13087,14 @@ s_insn (int dummy ATTRIBUTE_UNUSED)
>>>     last_insn->name = ".insn directive";
>>>     last_insn->file = as_where (&last_insn->line);
>>>   
>>> +  /* PS: SCFI is enabled only for AMD64 ABI.  The ABI check has been
>>> +     performed in i386_target_format.  */
>>> +  if (flag_synth_cfi)
>>> +    {
>>> +      ginsn = x86_ginsn_new (symbol_temp_new_now (), frch_ginsn_gen_mode ());
>>
>> If you really want to use this function here, more cases will need
>> handling (perhaps even beyond what I've commented on above). However,
>> I'd strongly suggest splitting off the "unhandled" part of that
>> function, and using only that here. After all you hardly know what
>> exactly the programmer's intentions are. Because of that, you may
>> also want to consider simply forbidding use of .insn when SCFI is to
>> be generated.
> 
> Sorry, not clear to me. I am not sure how moving simply the "unhandled" 
> part of x86_ginsn_new will work better.  If we dont handle the 
> instructions, more instructions will potentially trigger the "unhandled" 
> case.

Correct. You simply shouldn't try to interpret what the user encoded
via .insn. Much like gcc doesn't try to interpret what the string
literal of an asm() contains.

>>> @@ -15748,6 +16739,11 @@ i386_target_format (void)
>>>     else
>>>       as_fatal (_("unknown architecture"));
>>>   
>>> +#if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
>>> +  if (flag_synth_cfi && x86_elf_abi != X86_64_ABI)
>>> +    as_fatal (_("Synthesizing CFI is not supported for this ABI"));
>>> +#endif
>>
>> Elsewhere I've raised the question of whether we should really check
>> OBJ_MAYBE_ELF anywhere in this file. However, as long as we do, you'll
>> need to accompany that with an IS_ELF check in the if(). If, to
>> address my unreachable code comment near the top, you elect to add
>> further OBJ_ELF / OBJ_MAYBE_ELF checks, there you then wouldn't need
>> that further check (as flag_synth_cfi set then implies ELF).
> 
> I took the suggestion to not compile unnecessarily for non-ELF targets.
> 
> Use IS_ELF in if (): But flag_synth_cfi does not imply IS_ELF, it 
> implies OBJ_ELF || OBJ_MAYBE_ELF. Looks to me then, that I should 
> include IS_ELF check in each of the 3 blocks.

Why? If you use IS_ELF here, you won't make it there when IS_ELF
returned "false", simply because you use as_fatal() here. It is
once you pass the check here that flag_synth_cfi implies IS_ELF
(as much as it then also implies x86_elf_abi == X86_64_ABI).

Jan
  
Indu Bhagat Jan. 10, 2024, 6:10 a.m. UTC | #7
On 1/9/24 01:30, Jan Beulich wrote:
> On 08.01.2024 20:33, Indu Bhagat wrote:
>> On 1/5/24 05:58, Jan Beulich wrote:
>>> On 03.01.2024 08:15, Indu Bhagat wrote:
>>>> @@ -2311,6 +2312,13 @@ obj_elf_size (int ignore ATTRIBUTE_UNUSED)
>>>>          symbol_get_obj (sym)->size = XNEW (expressionS);
>>>>          *symbol_get_obj (sym)->size = exp;
>>>>        }
>>>> +
>>>> +  /* If the symbol in the directive matches the current function being
>>>> +     processed, indicate end of the current stream of ginsns.  */
>>>> +  if (flag_synth_cfi
>>>> +      && S_IS_FUNCTION (sym) && sym == ginsn_data_func_symbol ())
>>>> +    ginsn_data_end (symbol_temp_new_now ());
>>>> +
>>>>      demand_empty_rest_of_line ();
>>>>    }
>>>>    
>>>> @@ -2499,6 +2507,14 @@ obj_elf_type (int ignore ATTRIBUTE_UNUSED)
>>>>    	elfsym->symbol.flags &= ~mask;
>>>>        }
>>>>    
>>>> +  if (S_IS_FUNCTION (sym) && flag_synth_cfi)
>>>> +    {
>>>> +      /* Wrap up processing the previous block of ginsns first.  */
>>>> +      if (frchain_now->frch_ginsn_data)
>>>> +	ginsn_data_end (symbol_temp_new_now ());
>>>> +      ginsn_data_begin (sym);
>>>> +    }
>>>> +
>>>>      demand_empty_rest_of_line ();
>>>>    }
>>>
>>> Documentation about .type and .size use could be more precise. Generally
>>> it is entirely benign where exactly these directives live relative to
>>> the code they annotate.
>>
>> Added a comment for V5.
>>
>> As stated in as.texi, usage of .type and .size will be bread and butter
>> for SCFI: "The input asm must begin with the @code{.type} directive, and
>> should ideally be closed off using a @code{.size} directive."
> 
> Except that to me "must begin" talks about a source file, not individual
> functions. Hence that wording (which I saw) is at best ambiguous, which
> led me to ask for it to be "more precise".
> 

I have updated both code comments and gas/doc/as.texi.

Code comments:

"When using SCFI, .type directive indicates start of a new FDE for 
SCFI processing.  So, we must first demarcate the previous block of 
ginsns, if any, to mark the end of a previous FDE."

gas/doc/as.texi:
" Each input function in assembly must begin with the @code{.type} 
directive, and should ideally be closed off using a @code{.size} 
directive.  When using SCFI, each @code{.type} directive prompts GAS to 
start a new FDE (a.k.a., Frame Description Entry).  This implies that
with each @code{.type} directive, a previous block of instructions, if 
any, is finalised as a distinct FDE."
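
The directive handling described above can be modelled roughly as follows. This is only a sketch under assumptions: the struct and function names are hypothetical and do not exist in gas, and a "block" here is simply counted rather than turned into a real FDE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the directive handling; none of these names
   exist in gas.  */
struct scfi_model
{
  const char *cur_func;    /* Function whose block is open, or NULL.  */
  unsigned int fde_count;  /* Number of blocks finalised so far.  */
};

/* Finalise the currently open block, if any.  */
static void
model_end_block (struct scfi_model *m)
{
  assert (m != NULL);
  if (m->cur_func != NULL)
    {
      m->fde_count++;
      m->cur_func = NULL;
    }
}

/* .type NAME, @function: close any open block, then open one for NAME.  */
static void
model_type_directive (struct scfi_model *m, const char *name)
{
  model_end_block (m);
  m->cur_func = name;
}

/* .size NAME, ...: close the open block only if it belongs to NAME.  */
static void
model_size_directive (struct scfi_model *m, const char *name)
{
  assert (m != NULL);
  if (m->cur_func != NULL && strcmp (m->cur_func, name) == 0)
    model_end_block (m);
}
```

In this model a second .type without an intervening .size still finalises the first block, which is the behaviour the as.texi wording is meant to convey.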


>>>> +/* Check whether a '66H' prefix accompanies the instruction.
>>>
>>> With APX 16-bit operand size isn't necessarily represented by a 66h
>>> prefix, but perhaps with an "embedded prefix" inside the EVEX one.
>>> Therefore both the comment and even more so ...
>>>
>>>> +   The current users of this API are in the handlers for PUSH, POP
>>>> +   instructions.  These instructions affect the stack pointer implicitly:  the
>>>> +   operand size (16, 32, or 64 bits) determines the amount by which the stack
>>>> +   pointer is decremented (2, 4 or 8).  When '66H' prefix is present, the
>>>> +   instruction has a 16-bit operand.  */
>>>> +
>>>> +static bool
>>>> +ginsn_prefix_66H_p (i386_insn insn)
>>>
>>> ... the function name would better not allude to just the legacy
>>> encoding. Maybe ginsn_opsize_prefix_p()?
>>>
>>
>> Isnt 66H_p more readable and easier to follow because that's what the
>> function is currently checking ?  If more scenarios were being handled,
>> ginsn_opsize_prefix_p () would fit better.
> 
> Well, as said - with APX you can't get away with just 0x66 prefix checking.
> That prefix is simply illegal to use with EVEX-encoded insns.
> 

I am using the following in ginsn_opsize_prefix_p ():

!(i.prefix[REX_PREFIX] & REX_W) && i.prefix[DATA_PREFIX] == 0x66
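
As a standalone illustration of that predicate, here is a sketch with stand-in definitions. The enum, the macro values and the struct are assumptions made only so that the snippet compiles on its own; they are not the gas definitions. It covers the legacy 66H / REX.W case only and does not model the APX "embedded prefix" case raised above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the gas definitions; the values and the array layout are
   assumptions for this sketch only.  */
enum { DATA_PREFIX, REX_PREFIX, MAX_PREFIXES };
#define REX_W 0x8
#define DATA_PREFIX_OPCODE 0x66

struct mock_insn
{
  unsigned char prefix[MAX_PREFIXES];
};

/* True if the insn has a 16-bit operand size: a 66H prefix is present
   and REX.W is not set.  */
static bool
mock_opsize_prefix_p (const struct mock_insn *insn)
{
  assert (insn != NULL);
  return !(insn->prefix[REX_PREFIX] & REX_W)
         && insn->prefix[DATA_PREFIX] == DATA_PREFIX_OPCODE;
}
```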

>>>> +/* Get the DWARF register number for the given register entry.
>>>> +   For specific byte/word register accesses like al, cl, ah, ch, r8dyte etc.,
>>>
>>> What's "r8dyte"? I expect it's a typo, but I can't derive what was
>>> intended to be written.
>>
>> Typo it is.  I meant to write r8d. I have updated this to " like al, cl,
>> ah, ch, r8d, r20w etc."
> 
> Then perhaps further s;byte/word;byte/word/dword; ?
> 

Yes. Done.

>>>> +static unsigned int
>>>> +ginsn_dw2_regnum (const reg_entry *ireg)
>>>> +{
>>>> +  /* PS: Note the data type here as int32_t, because of Dw2Inval (-1).  */
>>>> +  int32_t dwarf_reg = Dw2Inval;
>>>> +  const reg_entry *temp = ireg;
>>>> +
>>>> +  /* ginsn creation is available for AMD64 abi only ATM.  Other flag_code
>>>> +     are not expected.  */
>>>> +  gas_assert (flag_code == CODE_64BIT);
>>>
>>> With this assertion it is kind of odd to see a further use of flag_code
>>> below.
>>
>> I think you are referring to the checks like:
>>
>>     /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>>     if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>>       stack_opnd_size = 2;
>>
>> Although the check on flag_code is redundant now, I chose to have them
>> here to keep it aligned to how the prefix is meant to be used.
> 
> Yet the same is true in 32-bit mode (minus, of course, the REX.W aspect
> mentioned elsewhere, but that could be ignored here as 32-bit code has
> no way of setting that flag). IOW - either you drop the use of flag_code,
> or you make it actually correct (as in: not misleading).
> 

Dropped the use of flag_code  in these instances for V5.

>>>> +  /* op %reg, symbol.  */
>>>> +  if (i.mem_operands == 1 && !i.base_reg && !i.index_reg)
>>>> +    return ginsn;
>>>
>>> Why does this need special treatment, and why is returning NULL here
>>> okay?
>>
>> An instruction like "addq    %rax, symbol" etc are uninteresting for
>> SCFI.  One of feedback in a previous iteration was to "consider not
>> generating ginsns for cases that are known to be uninteresting".
> 
> But then why not simply
> 
>    if (i.mem_operands)
>      return ginsn;
> 
> ? What memory is added to doesn't matter for SCFI, does it? Aiui you
> only really want to notice adds to registers.
> 

Correct. Updated the code with additional comments.

>>>> +static ginsnS *
>>>> +x86_ginsn_addsub_mem_reg (const symbolS *insn_end_sym)
>>>> +{
>>>> +  unsigned int dw2_regnum;
>>>> +  unsigned int src2_dw2_regnum;
>>>> +  const reg_entry *mem_reg;
>>>> +  int32_t gdisp = 0;
>>>> +  ginsnS *ginsn = NULL;
>>>> +  ginsnS * (*ginsn_func) (const symbolS *, bool,
>>>> +			  enum ginsn_src_type, unsigned int, offsetT,
>>>> +			  enum ginsn_src_type, unsigned int, offsetT,
>>>> +			  enum ginsn_dst_type, unsigned int, offsetT);
>>>> +  uint16_t opcode = i.tm.base_opcode;
>>>> +
>>>> +  gas_assert (opcode == 0x3 || opcode == 0x2b);
>>>> +  ginsn_func = (opcode == 0x3) ? ginsn_new_add : ginsn_new_sub;
>>>> +
>>>> +  /* op symbol, %reg.  */
>>>> +  if (i.mem_operands && !i.base_reg && !i.index_reg)
>>>> +    return ginsn;
>>>> +  /* op mem, %reg.  */
>>>
>>> /* op reg/mem, reg.  */ you mean? Which then raises the question ...
>>>
>>
>> Yes (updated the comment for V5).
>>
>>>> +  dw2_regnum = ginsn_dw2_regnum (i.op[1].regs);
>>>> +
>>>> +  if (i.mem_operands)
>>>> +    {
>>>> +      mem_reg = (i.base_reg) ? i.base_reg : i.index_reg;
>>>> +      src2_dw2_regnum = ginsn_dw2_regnum (mem_reg);
>>>> +      if (i.disp_operands == 1)
>>>> +	gdisp = i.op[0].disps->X_add_number;
>>>> +      ginsn = ginsn_func (insn_end_sym, true,
>>>> +			  GINSN_SRC_INDIRECT, src2_dw2_regnum, gdisp,
>>>> +			  GINSN_SRC_REG, dw2_regnum, 0,
>>>> +			  GINSN_DST_REG, dw2_regnum, 0);
>>>> +      ginsn_set_where (ginsn);
>>>> +    }
>>>> +
>>>> +  return ginsn;
>>>> +}
>>>
>>> ... why a register source isn't dealt with here.
>>
>> I saw that the opcode used when the source is reg is either 0x1 or 0x29,
>> hence concluded that the handling in x86_ginsn_addsub_reg_mem should
>> suffice.  Perhaps this is a GAS implementation artifact, and I should
>> have handling for register source here in x86_ginsn_addsub_mem_reg (for
>> opcodes 0x3 and 0x2b) ?
> 
> I think you should, yes. Try using the {load} / {store} pseudo-prefixes,
> and I think you'll see these opcodes used.
> 

I see them now with the suggested pseudo prefixes. Added handling in the 
appropriate function and an additional instruction in ginsn-add-1 testcase.
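
For reference, the two register-to-register encodings of "add %rax, %rbx" can be sketched as below. The helper names are hypothetical. The assumption here is that opcode 0x01 is the form gas emits by default, and that 0x03 is the form the {load} pseudo-prefix is expected to select.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build a register-direct ModRM byte (mod == 3).  */
static uint8_t
modrm_reg_direct (unsigned int reg, unsigned int rm)
{
  return (uint8_t) (0xc0 | ((reg & 7) << 3) | (rm & 7));
}

/* Encode "add %src, %dst" (AT&T order, 64-bit GPRs numbered 0..7) into
   OUT[3].  USE_LOAD_FORM selects opcode 0x03 (ADD r64, r/m64) instead of
   opcode 0x01 (ADD r/m64, r64).  */
static void
encode_add_rr (unsigned int src, unsigned int dst, int use_load_form,
               uint8_t out[3])
{
  assert (out != NULL);
  out[0] = 0x48;  /* REX.W  */
  if (use_load_form)
    {
      /* Destination lives in ModRM.reg, source in ModRM.rm.  */
      out[1] = 0x03;
      out[2] = modrm_reg_direct (dst, src);
    }
  else
    {
      /* Source lives in ModRM.reg, destination in ModRM.rm.  */
      out[1] = 0x01;
      out[2] = modrm_reg_direct (src, dst);
    }
}
```

Both byte sequences perform the same operation, so a handler keyed only on opcodes 0x01 / 0x29 misses the register-source case encoded with 0x03 / 0x2b.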

>>>> +static ginsnS *
>>>> +x86_ginsn_move (const symbolS *insn_end_sym)
>>>> +{
>>>> +  ginsnS *ginsn;
>>>> +  unsigned int dst_reg;
>>>> +  unsigned int src_reg;
>>>> +  offsetT src_disp = 0;
>>>> +  offsetT dst_disp = 0;
>>>> +  const reg_entry *dst = NULL;
>>>> +  const reg_entry *src = NULL;
>>>> +  uint16_t opcode = i.tm.base_opcode;
>>>> +  enum ginsn_src_type src_type = GINSN_SRC_REG;
>>>> +  enum ginsn_dst_type dst_type = GINSN_DST_REG;
>>>> +
>>>> +  if (opcode == 0x8b || opcode == 0x8a)
>>>
>>> Above when handling ALU insns you didn't care about byte ops - why do
>>> you do so here (by checking for 0x8a, and 0x88 below)?
>>
>> Because there may be weird code that saves and restores 8-byte registers
>> on stack.  For ALU insns, if the destination is REG_SP/REG_FP, we will
>> detect them in the unhandled track.
> 
> You talk about 8-byte registers when I asked about byte reg moves. If
> there's a MOV targeting %spl or %bpl, is that really any different from,
> say, an ADD doing so?
> 

ATM, yes.  SCFI has heuristics implemented, _some_ of which (they can be
evolved) are opcode specific.  E.g.,
   - MOV %rsp, %rbp will make the SCFI machinery check whether this makes
the CFA switch to %rbp.  If the target were %bpl, it would still trigger
the same code path, since we track 8-byte registers.  A bug, as I ack below.
   - ADD rsp + 0 = rbp will not trigger a CFA switch to RBP.  Should it?
Perhaps yes?  It's on my TODO list (with low priority) to evolve this bit.

>>>> +    }
>>>> +
>>>> +  ginsn_set_where (ginsn);
>>>> +
>>>> +  return ginsn;
>>>> +}
>>>
>>> Throughout this function (and perhaps others as well), how come you
>>> don't consider operand size at all? It matters whether results are
>>> 64-bit, 32-bit, or 16-bit, after all.
>>
>> Operation size matters: No, not for all instructions in context of SCFI.
>> For instructions using stack (save / restore), the size is important.
>> But for other instructions, operation size will not affect SCFI correctness.
> 
> But aren't you wrongly treating an update of %rbp and an update of,
> say, %bp the same then? The latter should end up as untraceable, aiui.
> 

For ALU ops:
   - ADD/SUB reg1, reg2
     If reg2 is the same as REG_CFA, then even in 64-bit mode, this
causes untraceability.  So there is untraceability irrespective of
operation size.  On the other hand, if the operation is uninteresting, it
remains uninteresting irrespective of operation size.
   - ADD/SUB imm, reg will never trigger untraceability, irrespective of
size.  There is the element of ignoring the carry bit, but I think the
current diagnostics around "asymmetrical save/restore" and the planned
"unbalanced stack at return" should provide the user with some safety net.
   - Other ALU ops should all trigger untraceability alike, irrespective
of operation size.
Hence, my statement that ignoring operation size is fine here for SCFI.

For MOV ops:
   - 8-bit/16-bit MOV should trigger untraceability.  I ack this as a bug
to be fixed.  Thanks for your careful review.  ATM, I plan to deal with
this after the series goes in, unless you have a strong opinion about this.

>>>> +static int
>>>> +x86_ginsn_unhandled (void)
>>>> +{
>>>> +  int err = X86_GINSN_UNHANDLED_NONE;
>>>> +  const reg_entry *reg_op;
>>>> +  unsigned int dw2_regnum;
>>>> +
>>>> +  /* Keep an eye out for instructions affecting control flow.  */
>>>> +  if (i.tm.opcode_modifier.jump)
>>>> +    err = X86_GINSN_UNHANDLED_CFG;
>>>> +  /* Also, for any instructions involving an implicit update to the stack
>>>> +     pointer.  */
>>>> +  else if (i.tm.opcode_modifier.implicitstackop)
>>>> +    err = X86_GINSN_UNHANDLED_STACKOP;
>>>> +  /* Finally, also check if the missed instructions are affecting REG_SP or
>>>> +     REG_FP.  The destination operand is the last at all stages of assembly
>>>> +     (due to following AT&T syntax layout in the internal representation).  In
>>>> +     case of Intel syntax input, this still remains true as swap_operands ()
>>>> +     is done by now.
>>>> +     PS: These checks do not involve index / base reg, as indirect memory
>>>> +     accesses via REG_SP or REG_FP do not affect SCFI correctness.
>>>> +     (Also note these instructions are candidates for other ginsn generation
>>>> +     modes in future.  TBD_GINSN_GEN_NOT_SCFI.)  */
>>>> +  else if (i.operands && i.reg_operands
>>>> +	   && !(i.flags[i.operands - 1] & Operand_Mem))
>>>> +    {
>>>> +      reg_op = i.op[i.operands - 1].regs;
>>>> +      if (reg_op)
>>>> +	{
>>>> +	  dw2_regnum = ginsn_dw2_regnum (reg_op);
>>>> +	  if (dw2_regnum == REG_SP || dw2_regnum == REG_FP)
>>>> +	    err = X86_GINSN_UNHANDLED_DEST_REG;
>>>> +	}
>>>
>>> else
>>>     err = X86_GINSN_UNHANDLED_CONFUSED;
>>>
>>> ? You can't let this case go silently. An alternative would be to
>>> assert instead of using if().
>>
>> No, the other cases are not important for SCFI correctness.  The case of
>> potential register save/restore operation cannot be detected
>> automatically.  Keeping an eye on ISA additions will be the only way for
>> that category.
> 
> That wasn't my point. reg_op turning out to be NULL is bogus considering
> the earlier checks. Hence why an assertion may make sense, and hence why
> otherwise I suggested an error indicator towards "internal error".
> 

Sorry, I misunderstood.  I added a new X86_GINSN_UNHANDLED_UNEXPECTED 
now for V5.
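
Roughly, the classification then looks like the sketch below. All names are stand-ins for illustration (the MOCK_ prefix marks them as such), the struct is a simplified assumption about the information consulted, and only the order of the checks is meant to mirror x86_ginsn_unhandled ().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the X86_GINSN_UNHANDLED_* codes discussed here.  */
enum mock_unhandled
{
  MOCK_UNHANDLED_NONE,
  MOCK_UNHANDLED_CFG,
  MOCK_UNHANDLED_STACKOP,
  MOCK_UNHANDLED_DEST_REG,
  MOCK_UNHANDLED_UNEXPECTED
};

/* DWARF register numbers of %rbp and %rsp in the x86-64 psABI.  */
#define MOCK_REG_FP 6
#define MOCK_REG_SP 7

/* Simplified, hypothetical view of the current instruction.  */
struct mock_insn_info
{
  bool is_jump;
  bool implicit_stack_op;
  bool last_operand_is_reg;
  const unsigned int *dest_regnum;  /* NULL models a missing reg entry.  */
};

static enum mock_unhandled
mock_ginsn_unhandled (const struct mock_insn_info *info)
{
  assert (info != NULL);
  if (info->is_jump)
    return MOCK_UNHANDLED_CFG;
  if (info->implicit_stack_op)
    return MOCK_UNHANDLED_STACKOP;
  if (info->last_operand_is_reg)
    {
      /* The earlier checks say a register must be there; report an
         internal inconsistency instead of ignoring it silently.  */
      if (info->dest_regnum == NULL)
        return MOCK_UNHANDLED_UNEXPECTED;
      if (*info->dest_regnum == MOCK_REG_SP
          || *info->dest_regnum == MOCK_REG_FP)
        return MOCK_UNHANDLED_DEST_REG;
    }
  return MOCK_UNHANDLED_NONE;
}
```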

>>>> +/* Generate one or more generic GAS instructions, a.k.a, ginsns for the current
>>>> +   machine instruction.
>>>> +
>>>> +   Returns the head of linked list of ginsn(s) added, if success; Returns NULL
>>>> +   if failure.
>>>> +
>>>> +   The input ginsn_gen_mode GMODE determines the set of minimal necessary
>>>> +   ginsns necessary for correctness of any passes applicable for that mode.
>>>> +   For supporting the GINSN_GEN_SCFI generation mode, following is the list of
>>>> +   machine instructions that must be translated into the corresponding ginsns
>>>> +   to ensure correctness of SCFI:
>>>> +     - All instructions affecting the two registers that could potentially
>>>> +       be used as the base register for CFA tracking.  For SCFI, the base
>>>> +       register for CFA tracking is limited to REG_SP and REG_FP only for
>>>> +       now.
>>>> +     - All change of flow instructions: conditional and unconditional branches,
>>>> +       call and return from functions.
>>>> +     - All instructions that can potentially be a register save / restore
>>>> +       operation.
>>>
>>> This could do with being more fine grained, as "potentially" is pretty vague,
>>> and (as per earlier version review comments) my take on this is a much wider
>>> set than yours.
>>
>> I would like to understand more on this comment, especially the "my take
>> on this is a much wider set than yours".  I see its being hinted at in
>> different flavors in the current review.
>>
>> I see some issues pointed out in this review (addressing modes of mov
>> etc, safe to skip opcodes for TEST, CMP) etc., but it seems that your
>> concerns are wider than this.
> 
> In an earlier version review I mentioned that even vector or mask registers
> could in principle be used to hold preserved GPR values. I seem to recall
> that you said you wouldn't want to deal with such. Hence my use of
> "wider set": Just to give an example, "kmovq %rbp, %k0" plus later
> "kmovq %k0, %rbp" is a pair of "instructions that can potentially be a
> register save / restore operation".
> 

Hmm.  I will need to understand them on a case-by-case basis.  For the
case of "kmovq %rbp, %k0" / "kmovq %k0, %rbp", how can this be used as
save/restore to/from stack?

>>>> +    case 0xa0:
>>>> +    case 0xa8:
>>>> +      /* push fs / push gs have opcode_space == SPACE_0F.  */
>>>> +      if (i.tm.opcode_space != SPACE_0F)
>>>> +	break;
>>>> +      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>>>> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>>>> +	stack_opnd_size = 2;
>>>
>>> But only if there's not also REX.W / REX2.W.
>>
>> This needs to be done for both push and pop then.  I am not sure about
>> the REX2.W check though.
> 
> It's not just PUSH/POP, but all stack operations.
> 

Sorry, I was imprecise.  I am now checking REX.W in ginsn_opsize_prefix_p,
for all stack-related instructions.

>> Something like
>> !is_apx_rex2_encoding () && !(i.prefix[REX_PREFIX] & REX_W)
>> covers it ?
> 
> I don't see why you'd have is_apx_rex2_encoding() here. I don't recall
> where/when exactly you invoke your code. It may therefore be either, as
> you have it, i.prefix[REX_PREFIX], or (before establish_rex()) it could
> (additionally) be i.rex.
> 

This is done after output_insn () (hence after establish_rex ()).  I am 
now using the following in ginsn_opsize_prefix_p ():

!(i.prefix[REX_PREFIX] & REX_W) && i.prefix[DATA_PREFIX] == 0x66

>>>> +      /* push fs / push gs.  */
>>>> +      ginsn = ginsn_new_sub (insn_end_sym, false,
>>>> +			     GINSN_SRC_REG, REG_SP, 0,
>>>> +			     GINSN_SRC_IMM, 0, stack_opnd_size,
>>>> +			     GINSN_DST_REG, REG_SP, 0);
>>>> +      ginsn_set_where (ginsn);
>>>> +      ginsn_next = ginsn_new_store (insn_end_sym, false,
>>>> +				    GINSN_SRC_REG, dw2_regnum,
>>>> +				    GINSN_DST_INDIRECT, REG_SP, 0);
>>>> +      ginsn_set_where (ginsn_next);
>>>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>>> +      break;
>>>> +
>>>> +    case 0xa1:
>>>> +    case 0xa9:
>>>> +      /* pop fs / pop gs have opcode_space == SPACE_0F.  */
>>>> +      if (i.tm.opcode_space != SPACE_0F)
>>>> +	break;
>>>> +      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>>>> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>>>> +	stack_opnd_size = 2;
>>>> +      /* pop fs / pop gs.  */
>>>> +      ginsn = ginsn_new_load (insn_end_sym, false,
>>>> +			      GINSN_SRC_INDIRECT, REG_SP, 0,
>>>> +			      GINSN_DST_REG, dw2_regnum);
>>>> +      ginsn_set_where (ginsn);
>>>> +      ginsn_next = ginsn_new_add (insn_end_sym, false,
>>>> +				  GINSN_SRC_REG, REG_SP, 0,
>>>> +				  GINSN_SRC_IMM, 0, stack_opnd_size,
>>>> +				  GINSN_DST_REG, REG_SP, 0);
>>>> +      ginsn_set_where (ginsn_next);
>>>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>>> +      break;
>>>> +
>>>> +    case 0x50 ... 0x57:
>>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>>> +	break;
>>>> +      /* push reg.  */
>>>> +      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>>>> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>>>> +	stack_opnd_size = 2;
>>>> +      ginsn = ginsn_new_sub (insn_end_sym, false,
>>>> +			     GINSN_SRC_REG, REG_SP, 0,
>>>> +			     GINSN_SRC_IMM, 0, stack_opnd_size,
>>>> +			     GINSN_DST_REG, REG_SP, 0);
>>>> +      ginsn_set_where (ginsn);
>>>> +      ginsn_next = ginsn_new_store (insn_end_sym, false,
>>>> +				    GINSN_SRC_REG, dw2_regnum,
>>>> +				    GINSN_DST_INDIRECT, REG_SP, 0);
>>>> +      ginsn_set_where (ginsn_next);
>>>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>>> +      break;
>>>> +
>>>> +    case 0x58 ... 0x5f:
>>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>>> +	break;
>>>> +      /* pop reg.  */
>>>> +      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>>> +      ginsn = ginsn_new_load (insn_end_sym, false,
>>>> +			      GINSN_SRC_INDIRECT, REG_SP, 0,
>>>> +			      GINSN_DST_REG, dw2_regnum);
>>>> +      ginsn_set_where (ginsn);
>>>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>>>> +      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>>>> +	stack_opnd_size = 2;
>>>> +      ginsn_next = ginsn_new_add (insn_end_sym, false,
>>>> +				  GINSN_SRC_REG, REG_SP, 0,
>>>> +				  GINSN_SRC_IMM, 0, stack_opnd_size,
>>>> +				  GINSN_DST_REG, REG_SP, 0);
>>>> +      ginsn_set_where (ginsn_next);
>>>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>>> +      break;
>>>> +
>>>> +    case 0x6a:
>>>> +    case 0x68:
>>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>>> +	break;
>>>> +      /* push imm8/imm16/imm32.  */
>>>> +      if (opcode == 0x6a)
>>>> +	stack_opnd_size = 1;
>>>> +      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.
>>>> +	 However, this prefix may only be present when opcode is 0x68.  */
>>>
>>> Why would this be limited to opcode 0x68?
>>
>> My understanding from the manual is that 0x6a will always be push imm8.
>> Hence, 66H is expected only with 0x68. Isnt this incorrect ?
> 
> The opcode byte determines merely the size of the immediate field.
> Operand size determines to which size the signed 8-bit field would be
> extended and then pushed onto the stack. There's never a single byte
> that would be pushed.
> 

This is now fixed for V5.
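
To make the point above concrete, here is a minimal sketch (not the V5 code; the helper names are made up for illustration) of how the stack adjustment for PUSH imm in 64-bit mode follows from the operand size rather than from the opcode byte.  It assumes only the two cases discussed here: default 64-bit operand size, or 16-bit when an operand size override is present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: number of bytes by which a
   64-bit-mode PUSH imm (opcode 0x6a or 0x68) moves the stack pointer.  */
static unsigned int
push_imm_stack_adjust (uint8_t opcode, bool has_opsize_override)
{
  /* Only the two PUSH imm opcodes are modelled here.  */
  assert (opcode == 0x6a || opcode == 0x68);
  /* The opcode byte selects only the width of the immediate field.  The
     width of what gets pushed comes from the operand size alone.  */
  return has_opsize_override ? 2 : 8;
}

/* Hypothetical helper: the value stored on the stack by opcode 0x6a.  The
   imm8 is sign-extended to the operand size; a single byte is never
   pushed.  */
static uint64_t
push_imm8_value (int8_t imm8, bool has_opsize_override)
{
  int64_t extended = imm8;	/* Sign-extend to 64 bits.  */
  if (has_opsize_override)
    return (uint16_t) extended;	/* Keep the low 16 bits only.  */
  return (uint64_t) extended;
}
```

With this model, opcode 0x6a and opcode 0x68 give the same stack adjustment for the same operand size, which is why the 66H check cannot be limited to 0x68.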

>>>> +      ginsn = ginsn_new_load (insn_end_sym, false,
>>>> +			      GINSN_SRC_INDIRECT, REG_SP, 0,
>>>> +			      GINSN_DST_REG, dw2_regnum);
>>>> +      ginsn_set_where (ginsn);
>>>> +      ginsn_next = ginsn_new_add (insn_end_sym, false,
>>>> +				  GINSN_SRC_REG, REG_SP, 0,
>>>> +				  GINSN_SRC_IMM, 0, stack_opnd_size,
>>>> +				  GINSN_DST_REG, REG_SP, 0);
>>>> +      ginsn_set_where (ginsn_next);
>>>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>>> +      break;
>>>> +
>>>> +    case 0xff:
>>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>>> +	break;
>>>> +      /* push from mem.  */
>>>> +      if (i.tm.extension_opcode == 6)
>>>> +	{
>>>> +	  /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>>>> +	  if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>>>> +	    stack_opnd_size = 2;
>>>> +	  ginsn = ginsn_new_sub (insn_end_sym, false,
>>>> +				 GINSN_SRC_REG, REG_SP, 0,
>>>> +				 GINSN_SRC_IMM, 0, stack_opnd_size,
>>>> +				 GINSN_DST_REG, REG_SP, 0);
>>>> +	  ginsn_set_where (ginsn);
>>>> +	  /* These instructions have no imm, only indirect access.  */
>>>> +	  gas_assert (i.base_reg);
>>>> +	  dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>>>> +	  ginsn_next = ginsn_new_store (insn_end_sym, false,
>>>> +					GINSN_SRC_INDIRECT, dw2_regnum,
>>>> +					GINSN_DST_INDIRECT, REG_SP, 0);
>>>> +	  ginsn_set_where (ginsn_next);
>>>> +	  gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>>> +	}
>>>> +      else if (i.tm.extension_opcode == 4)
>>>> +	{
>>>> +	  /* jmp r/m.  E.g., notrack jmp *%rax.  */
>>>> +	  if (i.reg_operands)
>>>> +	    {
>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>> +	      ginsn_set_where (ginsn);
>>>> +	    }
>>>> +	  else if (i.mem_operands && i.index_reg)
>>>> +	    {
>>>> +	      /* jmp    *0x0(,%rax,8).  */
>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.index_reg);
>>>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>> +	      ginsn_set_where (ginsn);
>>>
>>> What if both base and index are in use? Like for PUSH/POP, all addressing
>>> forms are permitted here and ...
>>>
>>>> +	    }
>>>> +	  else if (i.mem_operands && i.base_reg)
>>>> +	    {
>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>>>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>> +	      ginsn_set_where (ginsn);
>>>> +	    }
>>>> +	}
>>>> +      else if (i.tm.extension_opcode == 2)
>>>> +	{
>>>> +	  /* 0xFF /2 (call).  */
>>>> +	  if (i.reg_operands)
>>>> +	    {
>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>>> +	      ginsn = ginsn_new_call (insn_end_sym, true,
>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>> +	      ginsn_set_where (ginsn);
>>>> +	    }
>>>> +	  else if (i.mem_operands && i.base_reg)
>>>> +	    {
>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>>>> +	      ginsn = ginsn_new_call (insn_end_sym, true,
>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>> +	      ginsn_set_where (ginsn);
>>>> +	    }
>>>
>>> ... here.
>>
>> For the indirect jump and call instructions, the target destination is
>> not necessary to be known.  Indirect jumps will cause SCFI to error out
>> as control flow cannot be constructed.
>>
>> Call instructions are assumed to transfer control out of function.
>>
>> In other words, more information in these cases will not be of use to SCFI.
> 
> But then aren't you already doing too much work here? With what you say,
> you don't care about the kind of operand at all, merely the fact it's a
> CALL might be interesting. Albeit even there I'd then wonder why - if
> the function called isn't of interest, how is CALL different from just
> NOP? The only CALL you'd need to be concerned about would be the direct
> form with a displacement of 0, as indicated elsewhere. But of course
> tricky code might also use variants of that; see e.g. the retpoline code
> that was introduced into kernels a couple of years back. Recognizing
> and bailing on such tricky code may be desirable, if not necessary.
> 

Tracking operands for CALL instructions does not provide value ATM.  We 
do not even split a BB if there is a CALL instruction (the assumption is 
that CALL insn transfers control out of function).

You're right that we need to treat some CALLs differently (because of 
their effect on control flow and SCFI correctness).

RE: doing more work. Sure, but the vision for ginsn was to allow a 
representation where other analyses may be added. The representation of 
ginsn will need revisions to get there, but keeping an explicit 
GINSN_TYPE_CALL seems good.

>>>> +	}
>>>> +      break;
>>>> +
>>>> +    case 0xc2:
>>>> +    case 0xc3:
>>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>>> +	break;
>>>> +      /* Near ret.  */
>>>> +      ginsn = ginsn_new_return (insn_end_sym, true);
>>>> +      ginsn_set_where (ginsn);
>>>> +      break;
>>>
>>> No tracking of the stack pointer adjustment?
>>
>> No stack unwind information for a function is relevant after the
>> function has returned.  So, tracking of stack pointer adjustment by
>> return is not necessary.
> 
> What information does the "return" insn then carry, beyond it being
> an unconditional branch (which you have a different insn for)?
> 

"return" does not carry any more information than just the 
GINSN_TYPE_RETURN as ginsn->type.

So then why support both "return" and an unconditional branch: The 
intention is to carry the semantic difference between ret and 
unconditional jump.  Unconditional jumps may be to a label within 
function, and in those cases, we use it for some validation and BB 
linking when creating CFG. Return, OTOH, always indicates exit from 
function.

For SCFI purposes, above is the one use.  Future analyses may find other 
use-cases for an explicit return ginsn.  But IMO, keeping 
GINSN_TYPE_RETURN as an explicit insn makes the overall offering cleaner.
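
The distinction described above can be sketched roughly as follows.  This is only an illustration of the stated behaviour (CALL does not split a BB, a direct jump links BBs, an indirect jump makes SCFI error out, return exits the function); the enum and function names are hypothetical and do not match the ginsn API in the patch.

```c
#include <assert.h>

/* Hypothetical classification of the change-of-flow cases discussed in
   this thread; not the GINSN_TYPE_* enum from the patch.  */
enum ginsn_kind_model
{
  KIND_OTHER,
  KIND_CALL,
  KIND_JUMP_DIRECT,
  KIND_JUMP_INDIRECT,
  KIND_RETURN
};

/* Hypothetical effect of an insn on CFG construction.  */
enum bb_effect_model
{
  BB_CONTINUES,			/* Stay in the current basic block.  */
  BB_ENDS_LINK_TO_LABEL,	/* End the BB and link it to the target.  */
  BB_ENDS_FUNCTION_EXIT,	/* End the BB; control leaves the function.  */
  BB_CFG_NOT_BUILDABLE		/* Control flow cannot be constructed.  */
};

static enum bb_effect_model
classify_for_cfg (enum ginsn_kind_model kind)
{
  assert (kind <= KIND_RETURN);
  switch (kind)
    {
    case KIND_JUMP_DIRECT:
      return BB_ENDS_LINK_TO_LABEL;
    case KIND_JUMP_INDIRECT:
      return BB_CFG_NOT_BUILDABLE;
    case KIND_RETURN:
      return BB_ENDS_FUNCTION_EXIT;
    case KIND_CALL:
      /* Assumed to transfer control out of the function and come back,
	 so the BB is not split.  */
    case KIND_OTHER:
    default:
      return BB_CONTINUES;
    }
}
```

In this model none of the decisions depend on operands, only on the kind of the insn.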

>>>> @@ -5830,6 +6804,14 @@ md_assemble (char *line)
>>>>      /* We are ready to output the insn.  */
>>>>      output_insn (last_insn);
>>>>    
>>>> +  /* PS: SCFI is enabled only for AMD64 ABI.  The ABI check has been
>>>> +     performed in i386_target_format.  */
>>>
>>> See my earlier comment - it's yet more restrictive (as in not covering
>>> e.g. the Windows ABI, which importantly is also used in EFI).
>>
>> Not clear, can you please elaborate ?
> 
> Hmm, it's not clear to me what's not clear in my earlier comment. The
> set of preserved registers, for example, differs between the SysV and
> the Windows ABIs (see x86_scfi_callee_saved_p()). Then again, thinking
> of it, talking of anything ABI-ish in assembly code is questionable.
> An assembly function calling another assembly function may use an
> entirely custom "ABI". You just cannot guess upon preserved registers.
> 

I think the confusion is stemming from my usage of AMD64 colloquially to 
refer to System V ABI for x86_64. I think "System V AMD64 ABI" is the 
correct term. I will use that.

And yes, GAS can only work out SCFI if there is ABI adherence.  If input 
asm does not adhere to the supported ABIs, and the user invokes SCFI, 
then the user is on their own.  GAS will not (rather cannot) warn about 
ABI non-compliance.

>>>> @@ -12104,6 +13087,14 @@ s_insn (int dummy ATTRIBUTE_UNUSED)
>>>>      last_insn->name = ".insn directive";
>>>>      last_insn->file = as_where (&last_insn->line);
>>>>    
>>>> +  /* PS: SCFI is enabled only for AMD64 ABI.  The ABI check has been
>>>> +     performed in i386_target_format.  */
>>>> +  if (flag_synth_cfi)
>>>> +    {
>>>> +      ginsn = x86_ginsn_new (symbol_temp_new_now (), frch_ginsn_gen_mode ());
>>>
>>> If you really want to use this function here, more cases will need
>>> handling (perhaps even beyond what I've commented on above). However,
>>> I'd strongly suggest splitting off the "unhandled" part of that
>>> function, and using only that here. After all you hardly know what
>>> exactly the programmer's intentions are. Because of that, you may
>>> also want to consider simply forbidding use of .insn when SCFI is to
>>> be generated.
>>
>> Sorry, not clear to me. I am not sure how moving simply the "unhandled"
>> part of x86_ginsn_new will work better.  If we dont handle the
>> instructions, more instructions will potentially trigger the "unhandled"
>> case.
> 
> Correct. You simply shouldn't try to interpret what the user encoded
> via .insn. Much like gcc doesn't try to interpret what the string
> literal of an asm() contains.
> 

Removed the call to x86_ginsn_new () in s_insn () and replaced by an 
error if SCFI is in effect.

>>>> @@ -15748,6 +16739,11 @@ i386_target_format (void)
>>>>      else
>>>>        as_fatal (_("unknown architecture"));
>>>>    
>>>> +#if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
>>>> +  if (flag_synth_cfi && x86_elf_abi != X86_64_ABI)
>>>> +    as_fatal (_("Synthesizing CFI is not supported for this ABI"));
>>>> +#endif
>>>
>>> Elsewhere I've raised the question of whether we should really check
>>> OBJ_MAYBE_ELF anywhere in this file. However, as long as we do, you'll
>>> need to accompany that with an IS_ELF check in the if(). If, to
>>> address my unreachable code comment near the top, you elect to add
>>> further OBJ_ELF / OBJ_MAYBE_ELF checks, there you then wouldn't need
>>> that further check (as flag_synth_cfi set then implies ELF).
>>
>> I took the suggestion to not compile unnecessarily for non-ELF targets.
>>
>> Use IS_ELF in if (): But flag_synth_cfi does not imply IS_ELF, it
>> implies OBJ_ELF || OBJ_MAYBE_ELF. Looks to me then, that I should
>> include IS_ELF check in each of the 3 blocks.
> 
> Why? If you use IS_ELF here, you won't make it there when IS_ELF
> returned "false", simply because you use as_fatal() here. It is
> once you pass the check here that flag_synth_cfi implies IS_ELF
> (as much as it then also implies x86_elf_abi == X86_64_ABI).
> 

Got it now.
  
Jan Beulich Jan. 10, 2024, 9:44 a.m. UTC | #8
On 10.01.2024 07:10, Indu Bhagat wrote:
> On 1/9/24 01:30, Jan Beulich wrote:
>> On 08.01.2024 20:33, Indu Bhagat wrote:
>>> On 1/5/24 05:58, Jan Beulich wrote:
>>>> On 03.01.2024 08:15, Indu Bhagat wrote:
>>>>> +/* Check whether a '66H' prefix accompanies the instruction.
>>>>
>>>> With APX 16-bit operand size isn't necessarily represented by a 66h
>>>> prefix, but perhaps with an "embedded prefix" inside the EVEX one.
>>>> Therefore both the comment and even more so ...
>>>>
>>>>> +   The current users of this API are in the handlers for PUSH, POP
>>>>> +   instructions.  These instructions affect the stack pointer implicitly:  the
>>>>> +   operand size (16, 32, or 64 bits) determines the amount by which the stack
>>>>> +   pointer is decremented (2, 4 or 8).  When '66H' prefix is present, the
>>>>> +   instruction has a 16-bit operand.  */
>>>>> +
>>>>> +static bool
>>>>> +ginsn_prefix_66H_p (i386_insn insn)
>>>>
>>>> ... the function name would better not allude to just the legacy
>>>> encoding. Maybe ginsn_opsize_prefix_p()?
>>>>
>>>
>>> Isnt 66H_p more readable and easier to follow because that's what the
>>> function is currently checking ?  If more scenarios were being handled,
>>> ginsn_opsize_prefix_p () would fit better.
>>
>> Well, as said - with APX you can't get away with just 0x66 prefix checking.
>> That prefix is simply illegal to use with EVEX-encoded insns.
>>
> 
> I am using the following in ginsn_opsize_prefix_p ():
> 
> !(i.prefix[REX_PREFIX] & REX_W) && i.prefix[DATA_PREFIX] == 0x66

That addresses one half of my earlier remarks. Note however that elsewhere
we never check i.prefix[DATA_PREFIX] against being 0x66; we only ever check
for it being zero or non-zero. I'd like this to remain consistent.

For EVEX-encoded APX insns this isn't going to be sufficient though. See
respective code in process_suffix():

	  /* The DATA PREFIX of EVEX promoted from legacy APX instructions
	     needs to be adjusted.  */
	  if (i.tm.opcode_space == SPACE_EVEXMAP4)
	    {
	      gas_assert (!i.tm.opcode_modifier.opcodeprefix);
	      i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
	    }
	  else if (!add_prefix (prefix))
	    return 0;

So you'll need to also check for that combination, plus take care of
avoiding insns where PREFIX_0X66 alters operation, not operand size
(ADCX being an example).
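
A rough, self-contained model of the combined check being asked for might look like the sketch below.  The struct is a simplified stand-in for the handful of i386_insn / template fields mentioned here, and every name in it is hypothetical; in particular the "prefix changes operation" flag is just an assumption about how insns like ADCX could be told apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real gas data structures.  */
enum opcode_space_model { MODEL_SPACE_BASE, MODEL_SPACE_EVEXMAP4 };

struct insn_model
{
  enum opcode_space_model space;
  bool rex_w;			/* Models REX.W / EVEX.W being set.  */
  unsigned char data_prefix;	/* Zero when no legacy data prefix.  */
  bool template_prefix_0x66;	/* Models opcodeprefix == PREFIX_0X66.  */
  bool prefix_changes_operation; /* E.g. ADCX: 66 selects the insn.  */
};

/* Return true if the modelled insn has a 16-bit operand size.  */
static bool
opsize_16bit_p (const struct insn_model *insn)
{
  assert (insn != NULL);
  /* A 64-bit operand size request takes precedence.  */
  if (insn->rex_w)
    return false;
  /* EVEX-promoted legacy insns carry the "prefix" in the encoding
     itself rather than as a separate 0x66 byte.  */
  if (insn->space == MODEL_SPACE_EVEXMAP4)
    return insn->template_prefix_0x66 && !insn->prefix_changes_operation;
  /* Legacy encoding: only test for zero vs. non-zero, as elsewhere.  */
  return insn->data_prefix != 0;
}
```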

>>>>> +static ginsnS *
>>>>> +x86_ginsn_move (const symbolS *insn_end_sym)
>>>>> +{
>>>>> +  ginsnS *ginsn;
>>>>> +  unsigned int dst_reg;
>>>>> +  unsigned int src_reg;
>>>>> +  offsetT src_disp = 0;
>>>>> +  offsetT dst_disp = 0;
>>>>> +  const reg_entry *dst = NULL;
>>>>> +  const reg_entry *src = NULL;
>>>>> +  uint16_t opcode = i.tm.base_opcode;
>>>>> +  enum ginsn_src_type src_type = GINSN_SRC_REG;
>>>>> +  enum ginsn_dst_type dst_type = GINSN_DST_REG;
>>>>> +
>>>>> +  if (opcode == 0x8b || opcode == 0x8a)
>>>>
>>>> Above when handling ALU insns you didn't care about byte ops - why do
>>>> you do so here (by checking for 0x8a, and 0x88 below)?
>>>
>>> Because there may be weird code that saves and restores 8-byte registers
>>> on stack.  For ALU insns, if the destination is REG_SP/REG_FP, we will
>>> detect them in the unhandled track.
>>
>> You talk about 8-byte registers when I asked about byte reg moves. If
>> there's a MOV targeting %spl or %bpl, is that really any different from,
>> say, an ADD doing so?
>>
> 
> ATM, Yes. SCFI has heuristics implemented, _some_ of which (can be 
> evolved) are opcode specific. E.g.,
>    - MOV %rsp, %rbp will make SCFI machinery check if this is making the 
> CFA switch to %rbp.  If the target was %bpl, since we track 8-byte 
> registers, it will still trigger the same code path. A bug as I ack below.
>    - ADD rsp + 0 = rbp will not trigger CFA switch to RBP. Should it ? 
> Perhaps yes? Its on my TODO list (with low priority) to evolve this bit.

I'm not sure about this. Either way such special cases may want documenting
explicitly.

>>>>> +    }
>>>>> +
>>>>> +  ginsn_set_where (ginsn);
>>>>> +
>>>>> +  return ginsn;
>>>>> +}
>>>>
>>>> Throughout this function (and perhaps others as well), how come you
>>>> don't consider operand size at all? It matters whether results are
>>>> 64-bit, 32-bit, or 16-bit, after all.
>>>
>>> Operation size matters: No, not for all instructions in context of SCFI.
>>> For instructions using stack (save / restore), the size is important.
>>> But for other instructions, operation size will not affect SCFI correctness.
>>
>> But aren't you wrongly treating an update of %rbp and an update of,
>> say, %bp the same then? The latter should end up as untraceable, aiui.
>>
> 
> For ALU ops:
>    - ADD/SUB reg1, reg2
>      If reg2 is the same as REG_CFA, then even in 64-bit mode, this 
> causes untraceability. So there is untraceability irrespective of 
> operation size. On the other hand, if uninteresting, it will remain 
> uninteresting irrespective of operation size.
>    - ADD/SUB imm, reg will never trigger untraceability, irrespective of 
> size. There is the element of ignoring the carry bit, but I think the 
> current diagnostics around "asymmetrical save/restore" and the planned 
> "unbalanced stack at return" should provide user with some safety net.

I don't see how the carry flag would matter here. What does matter is the
size of the register: Anything not 64-bit will need to trigger
untraceability imo.
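
To illustrate why the register size matters, here is a small model of how x86-64 updates a 64-bit GPR for the different operand sizes.  The function names are made up for this example; the behaviour modelled is the architectural one (32-bit writes zero the upper half, 8-bit and 16-bit writes leave the upper bits untouched).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: new value of a 64-bit GPR holding OLD after a
   WIDTH_BITS-wide write of VALUE to its low part.  */
static uint64_t
write_gpr (uint64_t old, uint64_t value, unsigned int width_bits)
{
  assert (width_bits == 8 || width_bits == 16
	  || width_bits == 32 || width_bits == 64);
  switch (width_bits)
    {
    case 64:
      return value;
    case 32:
      /* Upper 32 bits are zeroed.  */
      return (uint32_t) value;
    case 16:
      /* Upper 48 bits are preserved.  */
      return (old & ~UINT64_C (0xffff)) | (uint16_t) value;
    default:
      /* Low byte register (e.g. %spl): upper 56 bits are preserved.  */
      return (old & ~UINT64_C (0xff)) | (uint8_t) value;
    }
}

/* Hypothetical model of ADD imm, reg for a given operand size.  */
static uint64_t
add_imm (uint64_t reg, int64_t imm, unsigned int width_bits)
{
  return write_gpr (reg, reg + (uint64_t) imm, width_bits);
}
```

For a stack pointer of 0x00007ffd00010008 and an immediate of -16, the 64-bit form yields 0x00007ffd0000fff8, the 32-bit form yields 0x000000000000fff8, and the 16-bit form yields 0x00007ffd0001fff8, so neither narrower form can be treated like the 64-bit one in general.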

>    - Other ALU ops should all trigger untraceability alike, irrespective 
> of operation size.
> Hence, my statement that ignoring operation size is fine here for SCFI.

As per above, I disagree.

> For MOV ops:
>    - 8-bit/16-bit MOV should trigger untraceability. I ack this as bug to 
> be fixed. Thanks to your careful review. ATM, I plan to deal with this 
> after the series goes in, unless you have strong opinion about this.

32-bit MOV should, too. And yes, I'm okay-ish with such being dealt with
later, as long as the open work item is easily recognizable (and hence
it'll be easy to check that all such work items are gone at the point
where the feature is promoted from experimental).

>>>>> +/* Generate one or more generic GAS instructions, a.k.a, ginsns for the current
>>>>> +   machine instruction.
>>>>> +
>>>>> +   Returns the head of linked list of ginsn(s) added, if success; Returns NULL
>>>>> +   if failure.
>>>>> +
>>>>> +   The input ginsn_gen_mode GMODE determines the set of minimal necessary
>>>>> +   ginsns necessary for correctness of any passes applicable for that mode.
>>>>> +   For supporting the GINSN_GEN_SCFI generation mode, following is the list of
>>>>> +   machine instructions that must be translated into the corresponding ginsns
>>>>> +   to ensure correctness of SCFI:
>>>>> +     - All instructions affecting the two registers that could potentially
>>>>> +       be used as the base register for CFA tracking.  For SCFI, the base
>>>>> +       register for CFA tracking is limited to REG_SP and REG_FP only for
>>>>> +       now.
>>>>> +     - All change of flow instructions: conditional and unconditional branches,
>>>>> +       call and return from functions.
>>>>> +     - All instructions that can potentially be a register save / restore
>>>>> +       operation.
>>>>
>>>> This could do with being more fine grained, as "potentially" is pretty vague,
>>>> and (as per earlier version review comments) my take on this is a much wider
>>>> set than yours.
>>>
>>> I would like to understand more on this comment, especially the "my take
>>> on this is a much wider set than yours".  I see its being hinted at in
>>> different flavors in the current review.
>>>
>>> I see some issues pointed out in this review (addressing modes of mov
>>> etc, safe to skip opcodes for TEST, CMP) etc., but it seems that your
>>> concerns are wider than this.
>>
>> In earlier version review I mentioned that even vector or mask registers
>> could in principle be used to hold preserved GPR values. I seem to recall
>> that you said you wouldn't want to deal with such. Hence my use of
>> "wider set": Just to give an example, "kmovq %rbp, %k0" plus later
>> "kmovq %k0, %rbp" is a pair of "instructions that can potentially be a
>> register save / restore operation".
>>
> 
> Hmm. I will need to understand them on a case to case basis.  For the 
> case of "kmovq %rbp, %k0" / "kmovq %k0, %rbp" how can this be used as 
> save/restore to/from stack ?

Maybe I'm still not having a clear enough picture of what forms of insns
you want to fully track. Said insn forms don't access the stack. But they
could in principle be used to preserve a certain register. Such preserving
of registers is part of what needs encoding in CFI, isn't it?

>>>>> +      ginsn = ginsn_new_load (insn_end_sym, false,
>>>>> +			      GINSN_SRC_INDIRECT, REG_SP, 0,
>>>>> +			      GINSN_DST_REG, dw2_regnum);
>>>>> +      ginsn_set_where (ginsn);
>>>>> +      ginsn_next = ginsn_new_add (insn_end_sym, false,
>>>>> +				  GINSN_SRC_REG, REG_SP, 0,
>>>>> +				  GINSN_SRC_IMM, 0, stack_opnd_size,
>>>>> +				  GINSN_DST_REG, REG_SP, 0);
>>>>> +      ginsn_set_where (ginsn_next);
>>>>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>>>> +      break;
>>>>> +
>>>>> +    case 0xff:
>>>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>>>> +	break;
>>>>> +      /* push from mem.  */
>>>>> +      if (i.tm.extension_opcode == 6)
>>>>> +	{
>>>>> +	  /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>>>>> +	  if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>>>>> +	    stack_opnd_size = 2;
>>>>> +	  ginsn = ginsn_new_sub (insn_end_sym, false,
>>>>> +				 GINSN_SRC_REG, REG_SP, 0,
>>>>> +				 GINSN_SRC_IMM, 0, stack_opnd_size,
>>>>> +				 GINSN_DST_REG, REG_SP, 0);
>>>>> +	  ginsn_set_where (ginsn);
>>>>> +	  /* These instructions have no imm, only indirect access.  */
>>>>> +	  gas_assert (i.base_reg);
>>>>> +	  dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>>>>> +	  ginsn_next = ginsn_new_store (insn_end_sym, false,
>>>>> +					GINSN_SRC_INDIRECT, dw2_regnum,
>>>>> +					GINSN_DST_INDIRECT, REG_SP, 0);
>>>>> +	  ginsn_set_where (ginsn_next);
>>>>> +	  gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>>>> +	}
>>>>> +      else if (i.tm.extension_opcode == 4)
>>>>> +	{
>>>>> +	  /* jmp r/m.  E.g., notrack jmp *%rax.  */
>>>>> +	  if (i.reg_operands)
>>>>> +	    {
>>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>>>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>>> +	      ginsn_set_where (ginsn);
>>>>> +	    }
>>>>> +	  else if (i.mem_operands && i.index_reg)
>>>>> +	    {
>>>>> +	      /* jmp    *0x0(,%rax,8).  */
>>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.index_reg);
>>>>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>>> +	      ginsn_set_where (ginsn);
>>>>
>>>> What if both base and index are in use? Like for PUSH/POP, all addressing
>>>> forms are permitted here and ...
>>>>
>>>>> +	    }
>>>>> +	  else if (i.mem_operands && i.base_reg)
>>>>> +	    {
>>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>>>>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>>> +	      ginsn_set_where (ginsn);
>>>>> +	    }
>>>>> +	}
>>>>> +      else if (i.tm.extension_opcode == 2)
>>>>> +	{
>>>>> +	  /* 0xFF /2 (call).  */
>>>>> +	  if (i.reg_operands)
>>>>> +	    {
>>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>>>> +	      ginsn = ginsn_new_call (insn_end_sym, true,
>>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>>> +	      ginsn_set_where (ginsn);
>>>>> +	    }
>>>>> +	  else if (i.mem_operands && i.base_reg)
>>>>> +	    {
>>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>>>>> +	      ginsn = ginsn_new_call (insn_end_sym, true,
>>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>>> +	      ginsn_set_where (ginsn);
>>>>> +	    }
>>>>
>>>> ... here.
>>>
>>> For the indirect jump and call instructions, the target destination is
>>> not necessary to be known.  Indirect jumps will cause SCFI to error out
>>> as control flow cannot be constructed.
>>>
>>> Call instructions are assumed to transfer control out of function.
>>>
>>> In other words, more information in these cases will not be of use to SCFI.
>>
>> But then aren't you already doing too much work here? With what you say,
>> you don't care about the kind of operand at all, merely the fact it's a
>> CALL might be interesting. Albeit even there I'd then wonder why - if
>> the function called isn't of interest, how is CALL different from just
>> NOP? The only CALL you'd need to be concerned about would be the direct
>> form with a displacement of 0, as indicated elsewhere. But of course
>> tricky code might also use variants of that; see e.g. the retpoline code
>> that was introduced into kernels a couple of years back. Recognizing
>> and bailing on such tricky code may be desirable, if not necessary.
>>
> 
> Tracking operands for CALL instructions does not provide value ATM.  We 
> do not even split a BB if there is a CALL instruction (the assumption is 
> that CALL insn transfers control out of function).
> 
> You're right that we need to treat some CALLs differently (because of 
> their effect on control flow and SCFI correctness).
> 
> RE: doing more work. Sure, but the vision for ginsn was to allow a 
> representation where other analyses may be added. The representation of 
> ginsn will need revisions to get there, but keeping an explicit 
> GINSN_TYPE_CALL seems good.

That's okay, but with as little dead (for now) code as possible. If the
operand isn't interesting, why deal with the various operand forms right
now?

>>>>> +	}
>>>>> +      break;
>>>>> +
>>>>> +    case 0xc2:
>>>>> +    case 0xc3:
>>>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>>>> +	break;
>>>>> +      /* Near ret.  */
>>>>> +      ginsn = ginsn_new_return (insn_end_sym, true);
>>>>> +      ginsn_set_where (ginsn);
>>>>> +      break;
>>>>
>>>> No tracking of the stack pointer adjustment?
>>>
>>> No stack unwind information for a function is relevant after the
>>> function has returned.  So, tracking of stack pointer adjustment by
>>> return is not necessary.
>>
>> What information does the "return" insn then carry, beyond it being
>> an unconditional branch (which you have a different insn for)?
>>
> 
> "return" does not carry any more information than just the 
> GINSN_TYPE_RETURN as ginsn->type.
> 
> So then why support both "return" and an unconditional branch: The 
> intention is to carry the semantic difference between ret and 
> unconditional jump.  Unconditional jumps may be to a label within 
> function, and in those cases, we use it for some validation and BB 
> linking when creating CFG. Return, OTOH, always indicates exit from 
> function.
> 
> For SCFI purposes, above is the one use.  Future analyses may find other 
> use-cases for an explicit return ginsn.  But IMO, keeping 
> GINSN_TYPE_RETURN as an explicit insn makes the overall offering cleaner.

Okay. And here you don't bother decoding operands. Hence why I'm
asking the same to be the case for (e.g.) CALL.

>>>>> @@ -5830,6 +6804,14 @@ md_assemble (char *line)
>>>>>      /* We are ready to output the insn.  */
>>>>>      output_insn (last_insn);
>>>>>    
>>>>> +  /* PS: SCFI is enabled only for AMD64 ABI.  The ABI check has been
>>>>> +     performed in i386_target_format.  */
>>>>
>>>> See my earlier comment - it's yet more restrictive (as in not covering
>>>> e.g. the Windows ABI, which importantly is also used in EFI).
>>>
>>> Not clear, can you please elaborate ?
>>
>> Hmm, it's not clear to me what's not clear in my earlier comment. The
>> set of preserved registers, for example, differs between the SysV and
>> the Windows ABIs (see x86_scfi_callee_saved_p()). Then again, thinking
>> of it, talking of anything ABI-ish in assembly code is questionable.
>> An assembly function calling another assembly function may use an
>> entirely custom "ABI". You just cannot guess upon preserved registers.
>>
> 
> I think the confusion is stemming from my usage of AMD64 colloquially to 
> refer to System V ABI for x86_64. I think "System V AMD64 ABI" is the 
> correct term. I will use that.
> 
> And yes, GAS can only work out SCFI if there is ABI adherence.  If input 
> asm does not adhere to the supported ABIs, and the user invokes SCFI, 
> then the user is on their own.  GAS will not (rather cannot) warn about 
> ABI non-compliance.

I don't recall documentation stating this explicitly. This is a pretty
important aspect for users to consider, after all.

Jan
  
Indu Bhagat Jan. 10, 2024, 11:26 a.m. UTC | #9
On 1/10/24 01:44, Jan Beulich wrote:
> On 10.01.2024 07:10, Indu Bhagat wrote:
>> On 1/9/24 01:30, Jan Beulich wrote:
>>> On 08.01.2024 20:33, Indu Bhagat wrote:
>>>> On 1/5/24 05:58, Jan Beulich wrote:
>>>>> On 03.01.2024 08:15, Indu Bhagat wrote:
>>>>>> +/* Check whether a '66H' prefix accompanies the instruction.
>>>>>
>>>>> With APX 16-bit operand size isn't necessarily represented by a 66h
>>>>> prefix, but perhaps with an "embedded prefix" inside the EVEX one.
>>>>> Therefore both the comment and even more so ...
>>>>>
>>>>>> +   The current users of this API are in the handlers for PUSH, POP
>>>>>> +   instructions.  These instructions affect the stack pointer implicitly:  the
>>>>>> +   operand size (16, 32, or 64 bits) determines the amount by which the stack
>>>>>> +   pointer is decremented (2, 4 or 8).  When '66H' prefix is present, the
>>>>>> +   instruction has a 16-bit operand.  */
>>>>>> +
>>>>>> +static bool
>>>>>> +ginsn_prefix_66H_p (i386_insn insn)
>>>>>
>>>>> ... the function name would better not allude to just the legacy
>>>>> encoding. Maybe ginsn_opsize_prefix_p()?
>>>>>
>>>>
>>>> Isnt 66H_p more readable and easier to follow because that's what the
>>>> function is currently checking ?  If more scenarios were being handled,
>>>> ginsn_opsize_prefix_p () would fit better.
>>>
>>> Well, as said - with APX you can't get away with just 0x66 prefix checking.
>>> That prefix is simply illegal to use with EVEX-encoded insns.
>>>
>>
>> I am using the following in ginsn_opsize_prefix_p ():
>>
>> !(i.prefix[REX_PREFIX] & REX_W) && i.prefix[DATA_PREFIX] == 0x66
> 
> That addresses one half of my earlier remarks. Note however that elsewhere
> we never check i.prefix[DATA_PREFIX] against being 0x66; we only ever check
> for it being zero or non-zero. I'd like this to remain consistent.
> 
> For EVEX-encoded APX insns this isn't going to be sufficient though. See
> respective code in process_suffix():
> 
> 	  /* The DATA PREFIX of EVEX promoted from legacy APX instructions
> 	     needs to be adjusted.  */
> 	  if (i.tm.opcode_space == SPACE_EVEXMAP4)
> 	    {
> 	      gas_assert (!i.tm.opcode_modifier.opcodeprefix);
> 	      i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
> 	    }
> 	  else if (!add_prefix (prefix))
> 	    return 0;
> 
> So you'll need to also check for that combination, plus take care of
> avoiding insns where PREFIX_0X66 alters operation, not operand size
> (ADCX being an example).
> 

But I am disallowing APX insns altogether for now, by doing this in 
x86_ginsn_new ():

  /* Until it is clear how to handle APX NDD and other new opcodes, disallow
     them from SCFI.  */
  if (is_apx_rex2_encoding ()
      || (i.tm.opcode_modifier.evex && is_apx_evex_encoding ()))
    {
      as_bad (_("SCFI: unsupported APX op %#x may cause incorrect CFI"),
	      opcode);
      return ginsn;
    }

>>>>>> +static ginsnS *
>>>>>> +x86_ginsn_move (const symbolS *insn_end_sym)
>>>>>> +{
>>>>>> +  ginsnS *ginsn;
>>>>>> +  unsigned int dst_reg;
>>>>>> +  unsigned int src_reg;
>>>>>> +  offsetT src_disp = 0;
>>>>>> +  offsetT dst_disp = 0;
>>>>>> +  const reg_entry *dst = NULL;
>>>>>> +  const reg_entry *src = NULL;
>>>>>> +  uint16_t opcode = i.tm.base_opcode;
>>>>>> +  enum ginsn_src_type src_type = GINSN_SRC_REG;
>>>>>> +  enum ginsn_dst_type dst_type = GINSN_DST_REG;
>>>>>> +
>>>>>> +  if (opcode == 0x8b || opcode == 0x8a)
>>>>>
>>>>> Above when handling ALU insns you didn't care about byte ops - why do
>>>>> you do so here (by checking for 0x8a, and 0x88 below)?
>>>>
>>>> Because there may be weird code that saves and restores 8-byte registers
>>>> on stack.  For ALU insns, if the destination is REG_SP/REG_FP, we will
>>>> detect them in the unhandled track.
>>>
>>> You talk about 8-byte registers when I asked about byte reg moves. If
>>> there's a MOV targeting %spl or %bpl, is that really any different from,
>>> say, an ADD doing so?
>>>
>>
>> ATM, Yes. SCFI has heuristics implemented, _some_ of which (can be
>> evolved) are opcode specific. E.g.,
>>     - MOV %rsp, %rbp will make SCFI machinery check if this is making the
>> CFA switch to %rbp.  If the target was %bpl, since we track 8-byte
>> registers, it will still trigger the same code path. A bug as I ack below.
>>     - ADD rsp + 0 = rbp will not trigger CFA switch to RBP. Should it ?
>> Perhaps yes? Its on my TODO list (with low priority) to evolve this bit.
> 
> I'm not sure about this. Either way such special cases may want documenting
> explicitly.
> 

OK.  There is also a need to document ginsn and SCFI workings, with examples
of unsupported cases, etc.  I will invest some time in this.

>>>>>> +    }
>>>>>> +
>>>>>> +  ginsn_set_where (ginsn);
>>>>>> +
>>>>>> +  return ginsn;
>>>>>> +}
>>>>>
>>>>> Throughout this function (and perhaps others as well), how come you
>>>>> don't consider operand size at all? It matters whether results are
>>>>> 64-bit, 32-bit, or 16-bit, after all.
>>>>
>>>> Operation size matters: No, not for all instructions in context of SCFI.
>>>> For instructions using stack (save / restore), the size is important.
>>>> But for other instructions, operation size will not affect SCFI correctness.
>>>
>>> But aren't you wrongly treating an update of %rbp and an update of,
>>> say, %bp the same then? The latter should end up as untraceable, aiui.
>>>
>>
>> For ALU ops:
>>     - ADD/SUB reg1, reg2
>>       If reg2 is the same as REG_CFA, then even in 64-bit mode, this
>> causes untraceability. So there is untraceability irrespective of
>> operation size. On the other hand, if uninteresting, it will remain
>> uninteresting irrespective of operation size.
>>     - ADD/SUB imm, reg will never trigger untraceability, irrespective of
>> size. There is the element of ignoring the carry bit, but I think the
>> current diagnostics around "asymmetrical save/restore" and the planned
>> "unbalanced stack at return" should provide user with some safety net.
> 
> I don't see how the carry flag would matter here. What does matter is the
> size of the register: Anything not 64-bit will need to trigger
> untraceability imo.
> 

  - For a 16-bit ADD/SUB imm, reg insn: if the 16-bit operation produces a
carry, the carry is dropped, as only the 16-bit result is written to the
destination, IIUC.  If there is no carry, a 16-bit ADD/SUB imm, reg is
semantically the same insn as a 64-bit ADD/SUB imm, reg with the same
immediate.
  - For a 32-bit ADD/SUB imm, reg insn: yes, I stand corrected. It should
trigger untraceability, as the upper 32 bits are zeroed out. IMO, such a
32-bit operation is likely unintentional on the user's part, and is
something we should alert the user to if we can.

I will come back to this item of tracking operation sizes when I fix the 
MOV issue below.
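
To make the 16-bit and 32-bit cases above concrete, here is a small
standalone sketch in plain C.  It is not GAS code, and the helper name
emulate_add_imm is made up for illustration.  It only emulates how x86-64
writes the result of "ADD imm, reg" back into a 64-bit register for each
operand size.

```c
#include <assert.h>
#include <stdint.h>

/* Emulate the register write-back of "ADD imm, reg" on x86-64 for an
   operand size given in bytes.  Hypothetical helper, for illustration
   only; it is not part of the patch.  */
static uint64_t
emulate_add_imm (uint64_t reg, int64_t imm, unsigned int opsize)
{
  uint64_t sum = reg + (uint64_t) imm;

  assert (opsize == 2 || opsize == 4 || opsize == 8);
  switch (opsize)
    {
    case 2:
      /* 16-bit op: only bits 0-15 are replaced.  A carry out of bit 15 is
         lost, and bits 16-63 keep their old value.  */
      return (reg & ~(uint64_t) 0xffff) | (sum & 0xffff);
    case 4:
      /* 32-bit op: the result is zero-extended, clearing bits 32-63.  */
      return sum & 0xffffffff;
    default:
      /* 64-bit op: the full result is written.  */
      return sum;
    }
}
```

With a register value of 0x7ffc1234fff8, a 16-bit add of 16 gives
0x7ffc12340008 where the 64-bit add gives 0x7ffc12350008, so the two only
agree when no carry leaves the low 16 bits.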

>>     - Other ALU ops should all trigger untraceability alike, irrespective
>> of operation size.
>> Hence, my statement that ignoring operation size is fine here for SCFI.
> 
> As per above, I disagree.
> 
>> For MOV ops:
>>     - 8-bit/16-bit MOV should trigger untraceability. I ack this as a bug to
>> be fixed. Thanks to your careful review. ATM, I plan to deal with this
>> after the series goes in, unless you have strong opinion about this.
> 
> 32-bit MOV should, too. And yes, I'm okay-ish with such being dealt with
> later, as long as the open work item is easily recognizable (and hence
> it'll be easy to check that all such work items are gone at the point
> where the feature is promoted from experimental).
> 

Right, 32-bit MOV should too.

>>>>>> +/* Generate one or more generic GAS instructions, a.k.a, ginsns for the current
>>>>>> +   machine instruction.
>>>>>> +
>>>>>> +   Returns the head of linked list of ginsn(s) added, if success; Returns NULL
>>>>>> +   if failure.
>>>>>> +
>>>>>> +   The input ginsn_gen_mode GMODE determines the set of minimal necessary
>>>>>> +   ginsns necessary for correctness of any passes applicable for that mode.
>>>>>> +   For supporting the GINSN_GEN_SCFI generation mode, following is the list of
>>>>>> +   machine instructions that must be translated into the corresponding ginsns
>>>>>> +   to ensure correctness of SCFI:
>>>>>> +     - All instructions affecting the two registers that could potentially
>>>>>> +       be used as the base register for CFA tracking.  For SCFI, the base
>>>>>> +       register for CFA tracking is limited to REG_SP and REG_FP only for
>>>>>> +       now.
>>>>>> +     - All change of flow instructions: conditional and unconditional branches,
>>>>>> +       call and return from functions.
>>>>>> +     - All instructions that can potentially be a register save / restore
>>>>>> +       operation.
>>>>>
>>>>> This could do with being more fine grained, as "potentially" is pretty vague,
>>>>> and (as per earlier version review comments) my take on this is a much wider
>>>>> set than yours.
>>>>
>>>> I would like to understand more on this comment, especially the "my take
>>>> on this is a much wider set than yours".  I see it's being hinted at in
>>>> different flavors in the current review.
>>>>
>>>> I see some issues pointed out in this review (addressing modes of mov
>>>> etc, safe to skip opcodes for TEST, CMP) etc., but it seems that your
>>>> concerns are wider than this.
>>>
>>> In an earlier version's review I mentioned that even vector or mask registers
>>> could in principle be used to hold preserved GPR values. I seem to recall
>>> that you said you wouldn't want to deal with such. Hence my use of
>>> "wider set": Just to give an example, "kmovq %rbp, %k0" plus later
>>> "kmovq %k0, %rbp" is a pair of "instructions that can potentially be a
>>> register save / restore operation".
>>>
>>
>> Hmm. I will need to understand them on a case to case basis.  For the
>> case of "kmovq %rbp, %k0" / "kmovq %k0, %rbp" how can this be used as
>> save/restore to/from stack ?
> 
> Maybe I'm still not having a clear enough picture of what forms of insns
> you want to fully track. Said insn forms don't access the stack. But they
> could in principle be used to preserve a certain register. Such preserving
> of registers is part of what needs encoding in CFI, isn't it?
> 

Registers are usually preserved on the stack. In theory, a register can also
be preserved in another callee-saved register, but that defeats the purpose
of saving state across calls.

BTW, I had earlier written down some notes about SCFI:
https://sourceware.org/pipermail/binutils/2023-September/129558.html
Some bits are already stale, but maybe it helps.

>>>>>> +      ginsn = ginsn_new_load (insn_end_sym, false,
>>>>>> +			      GINSN_SRC_INDIRECT, REG_SP, 0,
>>>>>> +			      GINSN_DST_REG, dw2_regnum);
>>>>>> +      ginsn_set_where (ginsn);
>>>>>> +      ginsn_next = ginsn_new_add (insn_end_sym, false,
>>>>>> +				  GINSN_SRC_REG, REG_SP, 0,
>>>>>> +				  GINSN_SRC_IMM, 0, stack_opnd_size,
>>>>>> +				  GINSN_DST_REG, REG_SP, 0);
>>>>>> +      ginsn_set_where (ginsn_next);
>>>>>> +      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>>>>> +      break;
>>>>>> +
>>>>>> +    case 0xff:
>>>>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>>>>> +	break;
>>>>>> +      /* push from mem.  */
>>>>>> +      if (i.tm.extension_opcode == 6)
>>>>>> +	{
>>>>>> +	  /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
>>>>>> +	  if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
>>>>>> +	    stack_opnd_size = 2;
>>>>>> +	  ginsn = ginsn_new_sub (insn_end_sym, false,
>>>>>> +				 GINSN_SRC_REG, REG_SP, 0,
>>>>>> +				 GINSN_SRC_IMM, 0, stack_opnd_size,
>>>>>> +				 GINSN_DST_REG, REG_SP, 0);
>>>>>> +	  ginsn_set_where (ginsn);
>>>>>> +	  /* These instructions have no imm, only indirect access.  */
>>>>>> +	  gas_assert (i.base_reg);
>>>>>> +	  dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>>>>>> +	  ginsn_next = ginsn_new_store (insn_end_sym, false,
>>>>>> +					GINSN_SRC_INDIRECT, dw2_regnum,
>>>>>> +					GINSN_DST_INDIRECT, REG_SP, 0);
>>>>>> +	  ginsn_set_where (ginsn_next);
>>>>>> +	  gas_assert (!ginsn_link_next (ginsn, ginsn_next));
>>>>>> +	}
>>>>>> +      else if (i.tm.extension_opcode == 4)
>>>>>> +	{
>>>>>> +	  /* jmp r/m.  E.g., notrack jmp *%rax.  */
>>>>>> +	  if (i.reg_operands)
>>>>>> +	    {
>>>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>>>>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>>>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>>>> +	      ginsn_set_where (ginsn);
>>>>>> +	    }
>>>>>> +	  else if (i.mem_operands && i.index_reg)
>>>>>> +	    {
>>>>>> +	      /* jmp    *0x0(,%rax,8).  */
>>>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.index_reg);
>>>>>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>>>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>>>> +	      ginsn_set_where (ginsn);
>>>>>
>>>>> What if both base and index are in use? Like for PUSH/POP, all addressing
>>>>> forms are permitted here and ...
>>>>>
>>>>>> +	    }
>>>>>> +	  else if (i.mem_operands && i.base_reg)
>>>>>> +	    {
>>>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>>>>>> +	      ginsn = ginsn_new_jump (insn_end_sym, true,
>>>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>>>> +	      ginsn_set_where (ginsn);
>>>>>> +	    }
>>>>>> +	}
>>>>>> +      else if (i.tm.extension_opcode == 2)
>>>>>> +	{
>>>>>> +	  /* 0xFF /2 (call).  */
>>>>>> +	  if (i.reg_operands)
>>>>>> +	    {
>>>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
>>>>>> +	      ginsn = ginsn_new_call (insn_end_sym, true,
>>>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>>>> +	      ginsn_set_where (ginsn);
>>>>>> +	    }
>>>>>> +	  else if (i.mem_operands && i.base_reg)
>>>>>> +	    {
>>>>>> +	      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
>>>>>> +	      ginsn = ginsn_new_call (insn_end_sym, true,
>>>>>> +				      GINSN_SRC_REG, dw2_regnum, NULL);
>>>>>> +	      ginsn_set_where (ginsn);
>>>>>> +	    }
>>>>>
>>>>> ... here.
>>>>
>>>> For the indirect jump and call instructions, the target destination is
>>>> not necessary to be known.  Indirect jumps will cause SCFI to error out
>>>> as control flow cannot be constructed.
>>>>
>>>> Call instructions are assumed to transfer control out of function.
>>>>
>>>> In other words, more information in these cases will not be of use to SCFI.
>>>
>>> But then aren't you already doing too much work here? With what you say,
>>> you don't care about the kind of operand at all, merely the fact it's a
>>> CALL might be interesting. Albeit even there I'd then wonder why - if
>>> the function called isn't of interest, how is CALL different from just
>>> NOP? The only CALL you'd need to be concerned about would be the direct
>>> form with a displacement of 0, as indicated elsewhere. But of course
>>> tricky code might also use variants of that; see e.g. the retpoline code
>>> that was introduced into kernels a couple of years back. Recognizing
>>> and bailing on such tricky code may be desirable, if not necessary.
>>>
>>
>> Tracking operands for CALL instructions does not provide value ATM.  We
>> do not even split a BB if there is a CALL instruction (the assumption is
>> that CALL insn transfers control out of function).
>>
>> You're right that we need to treat some CALLs differently (because of
>> their effect on control flow and SCFI correctness).
>>
>> RE: doing more work. Sure, but the vision for ginsn was to allow a
>> representation where other analyses may be added. The representation of
>> ginsn will need revisions to get there, but keeping an explicit
>> GINSN_TYPE_CALL seems good.
> 
> That's okay, but with as little dead (for now) code as possible. If the
> operand isn't interesting, why deal with the various operand forms right
> now?
> 

I don't have a good answer to this. If I don't add opnds to
GINSN_TYPE_CALL now, I will have to put in some dummy default args (which I
personally find confusing to see; I already have some in the code
elsewhere for RegIP/RegIZ). And yes, we have disagreed on this matter
before too.

>>>>>> +	}
>>>>>> +      break;
>>>>>> +
>>>>>> +    case 0xc2:
>>>>>> +    case 0xc3:
>>>>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>>>>> +	break;
>>>>>> +      /* Near ret.  */
>>>>>> +      ginsn = ginsn_new_return (insn_end_sym, true);
>>>>>> +      ginsn_set_where (ginsn);
>>>>>> +      break;
>>>>>
>>>>> No tracking of the stack pointer adjustment?
>>>>
>>>> No stack unwind information for a function is relevant after the
>>>> function has returned.  So, tracking of stack pointer adjustment by
>>>> return is not necessary.
>>>
>>> What information does the "return" insn then carry, beyond it being
>>> an unconditional branch (which you have a different insn for)?
>>>
>>
>> "return" does not carry any more information than just the
>> GINSN_TYPE_RETURN as ginsn->type.
>>
>> So then why support both "return" and an unconditional branch: The
>> intention is to carry the semantic difference between ret and
>> unconditional jump.  Unconditional jumps may be to a label within
>> function, and in those cases, we use it for some validation and BB
>> linking when creating CFG. Return, OTOH, always indicates exit from
>> function.
>>
>> For SCFI purposes, above is the one use.  Future analyses may find other
>> use-cases for an explicit return ginsn.  But IMO, keeping
>> GINSN_TYPE_RETURN as an explicit insn makes the overall offering cleaner.
> 
> Okay. And here you don't bother decoding operands. Hence why I'm
> asking the same to be the case for (e.g.) CALL.
> 

It seems I will need to deal with the operands of the RETURN insn soon.  For
implementing "Warn if imbalanced stack at return", we will need this info.

>>>>>> @@ -5830,6 +6804,14 @@ md_assemble (char *line)
>>>>>>       /* We are ready to output the insn.  */
>>>>>>       output_insn (last_insn);
>>>>>>     
>>>>>> +  /* PS: SCFI is enabled only for AMD64 ABI.  The ABI check has been
>>>>>> +     performed in i386_target_format.  */
>>>>>
>>>>> See my earlier comment - it's yet more restrictive (as in not covering
>>>>> e.g. the Windows ABI, which importantly is also used in EFI).
>>>>
>>>> Not clear, can you please elaborate ?
>>>
>>> Hmm, it's not clear to me what's not clear in my earlier comment. The
>>> set of preserved registers, for example, differs between the SysV and
>>> the Windows ABIs (see x86_scfi_callee_saved_p()). Then again, thinking
>>> of it, talking of anything ABI-ish in assembly code is questionable.
>>> An assembly function calling another assembly function may use an
>>> entirely custom "ABI". You just cannot guess upon preserved registers.
>>>
>>
>> I think the confusion is stemming from my usage of AMD64 colloquially to
>> refer to System V ABI for x86_64. I think "System V AMD64 ABI" is the
>> correct term. I will use that.
>>
>> And yes, GAS can only work out SCFI if there is ABI adherence.  If input
>> asm does not adhere to the supported ABIs, and the user invokes SCFI,
>> then the user is on their own.  GAS will not (rather, cannot) warn about
>> ABI non-compliance.
> 
> I don't recall documentation stating this explicitly. This is a pretty
> important aspect for users to consider, after all.
> 

I added a reference in gas/NEWS for now.

* Experimental support in GAS to synthesize CFI for ABI-conformant,
  hand-written asm using the new command line option --scfi=experimental
  on x86-64.

I will add a mention of "ABI conformance" in as.texi.
  
Jan Beulich Jan. 10, 2024, 2:15 p.m. UTC | #10
On 10.01.2024 12:26, Indu Bhagat wrote:
> On 1/10/24 01:44, Jan Beulich wrote:
>> On 10.01.2024 07:10, Indu Bhagat wrote:
>>> On 1/9/24 01:30, Jan Beulich wrote:
>>>> On 08.01.2024 20:33, Indu Bhagat wrote:
>>>>> On 1/5/24 05:58, Jan Beulich wrote:
>>>>>> On 03.01.2024 08:15, Indu Bhagat wrote:
>>>>>>> +/* Check whether a '66H' prefix accompanies the instruction.
>>>>>>
>>>>>> With APX 16-bit operand size isn't necessarily represented by a 66h
>>>>>> prefix, but perhaps with an "embedded prefix" inside the EVEX one.
>>>>>> Therefore both the comment and even more so ...
>>>>>>
>>>>>>> +   The current users of this API are in the handlers for PUSH, POP
>>>>>>> +   instructions.  These instructions affect the stack pointer implicitly:  the
>>>>>>> +   operand size (16, 32, or 64 bits) determines the amount by which the stack
>>>>>>> +   pointer is decremented (2, 4 or 8).  When '66H' prefix is present, the
>>>>>>> +   instruction has a 16-bit operand.  */
>>>>>>> +
>>>>>>> +static bool
>>>>>>> +ginsn_prefix_66H_p (i386_insn insn)
>>>>>>
>>>>>> ... the function name would better not allude to just the legacy
>>>>>> encoding. Maybe ginsn_opsize_prefix_p()?
>>>>>>
>>>>>
>>>>> Isn't 66H_p more readable and easier to follow because that's what the
>>>>> function is currently checking ?  If more scenarios were being handled,
>>>>> ginsn_opsize_prefix_p () would fit better.
>>>>
>>>> Well, as said - with APX you can't get away with just 0x66 prefix checking.
>>>> That prefix is simply illegal to use with EVEX-encoded insns.
>>>>
>>>
>>> I am using the following in ginsn_opsize_prefix_p ():
>>>
>>> !(i.prefix[REX_PREFIX] & REX_W) && i.prefix[DATA_PREFIX] == 0x66
>>
>> That addresses one half of my earlier remarks. Note however that elsewhere
>> we never check i.prefix[DATA_PREFIX] against being 0x66; we only ever check
>> for it being zero or non-zero. I'd like this to remain consistent.
>>
>> For EVEX-encoded APX insns this isn't going to be sufficient though. See
>> respective code in process_suffix():
>>
>> 	  /* The DATA PREFIX of EVEX promoted from legacy APX instructions
>> 	     needs to be adjusted.  */
>> 	  if (i.tm.opcode_space == SPACE_EVEXMAP4)
>> 	    {
>> 	      gas_assert (!i.tm.opcode_modifier.opcodeprefix);
>> 	      i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
>> 	    }
>> 	  else if (!add_prefix (prefix))
>> 	    return 0;
>>
>> So you'll need to also check for that combination, plus take care of
>> avoiding insns where PREFIX_0X66 alters operation, not operand size
>> (ADCX being an example).
>>
> 
> But I am now disallowing APX insns for now, altogether, by doing this in 
> the x86_ginsn_new:
> 
>    /* Until it is clear how to handle APX NDD and other new opcodes, disallow
>       them from SCFI.  */
>    if (is_apx_rex2_encoding ()
>        || (i.tm.opcode_modifier.evex && is_apx_evex_encoding ()))
>      {
>        as_bad (_("SCFI: unsupported APX op %#x may cause incorrect CFI"),
>                opcode);
>        return ginsn;
>      }

Well, if that's what you want to do ... (Even if you wanted to not support
APX for now, I would have expected the catch-all looking at just the
destination register to properly diagnose any use.)

>>>>>>> +    }
>>>>>>> +
>>>>>>> +  ginsn_set_where (ginsn);
>>>>>>> +
>>>>>>> +  return ginsn;
>>>>>>> +}
>>>>>>
>>>>>> Throughout this function (and perhaps others as well), how come you
>>>>>> don't consider operand size at all? It matters whether results are
>>>>>> 64-bit, 32-bit, or 16-bit, after all.
>>>>>
>>>>> Operation size matters: No, not for all instructions in context of SCFI.
>>>>> For instructions using stack (save / restore), the size is important.
>>>>> But for other instructions, operation size will not affect SCFI correctness.
>>>>
>>>> But aren't you wrongly treating an update of %rbp and an update of,
>>>> say, %bp the same then? The latter should end up as untraceable, aiui.
>>>>
>>>
>>> For ALU ops:
>>>     - ADD/SUB reg1, reg2
>>>       If reg2 is the same as REG_CFA, then even in 64-bit mode, this
>>> causes untraceability. So there is untraceability irrespective of
>>> operation size. On the other hand, if uninteresting, it will remain
>>> uninteresting irrespective of operation size.
>>>     - ADD/SUB imm, reg will never trigger untraceability, irrespective of
>>> size. There is the element of ignoring the carry bit, but I think the
>>> current diagnostics around "asymmetrical save/restore" and the planned
>>> "unbalanced stack at return" should provide user with some safety net.
>>
>> I don't see how the carry flag would matter here. What does matter is the
>> size of the register: Anything not 64-bit will need to trigger
>> untraceability imo.
>>
> 
>   - For a 16-bit ADD/SUB imm, reg insn: if the 16-bit operation produces a
> carry, the carry is dropped, as only the 16-bit result is written to the
> destination, IIUC.  If there is no carry, a 16-bit ADD/SUB imm, reg is
> semantically the same insn as a 64-bit ADD/SUB imm, reg with the same
> immediate.

Oh, that's where you see CF coming into play. I wouldn't view it from
this angle - you simply don't know whether overflow/underflow would
occur, so it's no different to ...

>   - For a 32-bit ADD/SUB imm, reg insn: yes, I stand corrected. It should
> trigger untraceability, as the upper 32 bits are zeroed out. IMO, such a
> 32-bit operation is likely unintentional on the user's part, and is
> something we should alert the user to if we can.

... this case.

>>>>>>> +/* Generate one or more generic GAS instructions, a.k.a, ginsns for the current
>>>>>>> +   machine instruction.
>>>>>>> +
>>>>>>> +   Returns the head of linked list of ginsn(s) added, if success; Returns NULL
>>>>>>> +   if failure.
>>>>>>> +
>>>>>>> +   The input ginsn_gen_mode GMODE determines the set of minimal necessary
>>>>>>> +   ginsns necessary for correctness of any passes applicable for that mode.
>>>>>>> +   For supporting the GINSN_GEN_SCFI generation mode, following is the list of
>>>>>>> +   machine instructions that must be translated into the corresponding ginsns
>>>>>>> +   to ensure correctness of SCFI:
>>>>>>> +     - All instructions affecting the two registers that could potentially
>>>>>>> +       be used as the base register for CFA tracking.  For SCFI, the base
>>>>>>> +       register for CFA tracking is limited to REG_SP and REG_FP only for
>>>>>>> +       now.
>>>>>>> +     - All change of flow instructions: conditional and unconditional branches,
>>>>>>> +       call and return from functions.
>>>>>>> +     - All instructions that can potentially be a register save / restore
>>>>>>> +       operation.
>>>>>>
>>>>>> This could do with being more fine grained, as "potentially" is pretty vague,
>>>>>> and (as per earlier version review comments) my take on this is a much wider
>>>>>> set than yours.
>>>>>
>>>>> I would like to understand more on this comment, especially the "my take
>>>>> on this is a much wider set than yours".  I see it's being hinted at in
>>>>> different flavors in the current review.
>>>>>
>>>>> I see some issues pointed out in this review (addressing modes of mov
>>>>> etc, safe to skip opcodes for TEST, CMP) etc., but it seems that your
>>>>> concerns are wider than this.
>>>>
>>>> In an earlier version's review I mentioned that even vector or mask registers
>>>> could in principle be used to hold preserved GPR values. I seem to recall
>>>> that you said you wouldn't want to deal with such. Hence my use of
>>>> "wider set": Just to give an example, "kmovq %rbp, %k0" plus later
>>>> "kmovq %k0, %rbp" is a pair of "instructions that can potentially be a
>>>> register save / restore operation".
>>>>
>>>
>>> Hmm. I will need to understand them on a case to case basis.  For the
>>> case of "kmovq %rbp, %k0" / "kmovq %k0, %rbp" how can this be used as
>>> save/restore to/from stack ?
>>
>> Maybe I'm still not having a clear enough picture of what forms of insns
>> you want to fully track. Said insn forms don't access the stack. But they
>> could in principle be used to preserve a certain register. Such preserving
>> of registers is part of what needs encoding in CFI, isn't it?
>>
> 
> Registers are usually preserved on the stack. In theory, a register can also
> be preserved in another callee-saved register, but that defeats the purpose
> of saving state across calls.

Callee-preserved registers, when they have a special purpose in the
architecture (like %rsi, %rdi, and %rbx have) may be cheaper to
preserve by moving to a call-clobbered register that isn't otherwise
used in the function. In the SysV ABI this only affects %rbx, the
special purpose of which is extremely limited in the ISA (xlatb). In
the Windows ABI, otoh, %rsi and %rdi are callee-preserved, and those
have very common uses in the string insns.
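
As a rough illustration of how the preserved GPR sets differ between the two
ABIs, here is a standalone C sketch.  The enumerators and the function name
gpr_callee_saved_p are hypothetical and are not what GAS uses (the patch has
x86_scfi_callee_saved_p() for the SysV case only).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register and ABI identifiers, for illustration only.  */
enum gpr
{
  RAX, RCX, RDX, RBX, RSP, RBP, RSI, RDI,
  R8, R9, R10, R11, R12, R13, R14, R15, GPR_COUNT
};

enum abi { ABI_SYSV, ABI_WIN64 };

/* Return true if general-purpose register REG must be preserved by the
   callee under ABI.  Only GPRs are modelled here; the Windows x64 ABI
   additionally preserves %xmm6-%xmm15.  */
static bool
gpr_callee_saved_p (enum abi abi, enum gpr reg)
{
  assert (reg < GPR_COUNT);
  switch (reg)
    {
    case RBX: case RBP: case RSP:
    case R12: case R13: case R14: case R15:
      /* Preserved in both ABIs.  */
      return true;
    case RSI: case RDI:
      /* Argument registers in SysV, callee-preserved in Windows x64.  */
      return abi == ABI_WIN64;
    default:
      return false;
    }
}
```

A table like this only helps if the asm actually follows one of the two
ABIs, which is the assumption being questioned above.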

> BTW, I had earlier written down some notes about SCFI 
> https://sourceware.org/pipermail/binutils/2023-September/129558.html
> Some bits are already stale, but maybe it helps.

I had read through this before first reviewing v1 (I think it was).

>>>>>>> +    case 0xc2:
>>>>>>> +    case 0xc3:
>>>>>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>>>>>> +	break;
>>>>>>> +      /* Near ret.  */
>>>>>>> +      ginsn = ginsn_new_return (insn_end_sym, true);
>>>>>>> +      ginsn_set_where (ginsn);
>>>>>>> +      break;
>>>>>>
>>>>>> No tracking of the stack pointer adjustment?
>>>>>
>>>>> No stack unwind information for a function is relevant after the
>>>>> function has returned.  So, tracking of stack pointer adjustment by
>>>>> return is not necessary.
>>>>
>>>> What information does the "return" insn then carry, beyond it being
>>>> an unconditional branch (which you have a different insn for)?
>>>>
>>>
>>> "return" does not carry any more information than just the
>>> GINSN_TYPE_RETURN as ginsn->type.
>>>
>>> So then why support both "return" and an unconditional branch: The
>>> intention is to carry the semantic difference between ret and
>>> unconditional jump.  Unconditional jumps may be to a label within
>>> function, and in those cases, we use it for some validation and BB
>>> linking when creating CFG. Return, OTOH, always indicates exit from
>>> function.
>>>
>>> For SCFI purposes, above is the one use.  Future analyses may find other
>>> use-cases for an explicit return ginsn.  But IMO, keeping
>>> GINSN_TYPE_RETURN as an explicit insn makes the overall offering cleaner.
>>
>> Okay. And here you don't bother decoding operands. Hence why I'm
>> asking the same to be the case for (e.g.) CALL.
>>
> 
> It seems I will need to deal with the operands of the RETURN insn soon.  For
> implementing "Warn if imbalanced stack at return", we will need this info.

Will you? Isn't stack state _before_ the RET what matters (and hence
the optional immediate still doesn't matter)?
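
To illustrate the point, a balance check of this kind could look like the
following standalone C sketch.  It handles straight-line code only, and all
type and function names are hypothetical, not taken from the patch.  The
immediate of RET is carried in the model but never consulted; only the stack
state on reaching the RET is checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for a stream of ginsns.  */
enum op_kind { OP_PUSH, OP_POP, OP_SUB_SP, OP_ADD_SP, OP_RET };

struct op
{
  enum op_kind kind;
  int64_t value;   /* Size for push/pop, immediate for sub/add/ret.  */
};

/* Return true if every RET in OPS is reached with only the return address
   between %rsp and the CFA.  Assumes 64-bit mode, where CFA = %rsp + 8 on
   function entry.  */
static bool
stack_balanced_at_ret_p (const struct op *ops, size_t n)
{
  int64_t cfa_offset = 8;

  assert (ops != NULL || n == 0);
  for (size_t k = 0; k < n; k++)
    switch (ops[k].kind)
      {
      case OP_PUSH:
      case OP_SUB_SP:
        /* The stack grows down: %rsp moves away from the CFA.  */
        cfa_offset += ops[k].value;
        break;
      case OP_POP:
      case OP_ADD_SP:
        cfa_offset -= ops[k].value;
        break;
      case OP_RET:
        /* Only the state before the RET is checked; the optional
           immediate in ops[k].value is deliberately ignored.  */
        if (cfa_offset != 8)
          return false;
        break;
      }
  return true;
}
```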

>>>>>>> @@ -5830,6 +6804,14 @@ md_assemble (char *line)
>>>>>>>       /* We are ready to output the insn.  */
>>>>>>>       output_insn (last_insn);
>>>>>>>     
>>>>>>> +  /* PS: SCFI is enabled only for AMD64 ABI.  The ABI check has been
>>>>>>> +     performed in i386_target_format.  */
>>>>>>
>>>>>> See my earlier comment - it's yet more restrictive (as in not covering
>>>>>> e.g. the Windows ABI, which importantly is also used in EFI).
>>>>>
>>>>> Not clear, can you please elaborate ?
>>>>
>>>> Hmm, it's not clear to me what's not clear in my earlier comment. The
>>>> set of preserved registers, for example, differs between the SysV and
>>>> the Windows ABIs (see x86_scfi_callee_saved_p()). Then again, thinking
>>>> of it, talking of anything ABI-ish in assembly code is questionable.
>>>> An assembly function calling another assembly function may use an
>>>> entirely custom "ABI". You just cannot guess upon preserved registers.
>>>>
>>>
>>> I think the confusion is stemming from my usage of AMD64 colloquially to
>>> refer to System V ABI for x86_64. I think "System V AMD64 ABI" is the
>>> correct term. I will use that.
>>>
>>> And yes, GAS can only work out SCFI if there is ABI adherence.  If input
>>> asm does not adhere to the supported ABIs, and the user invokes SCFI,
>>> then the user is on their own.  GAS will not (rather, cannot) warn about
>>> ABI non-compliance.
>>
>> I don't recall documentation stating this explicitly. This is a pretty
>> important aspect for users to consider, after all.
>>
> 
> I added a reference in gas/NEWS for now.
> 
> * Experimental support in GAS to synthesize CFI for ABI-conformant,
>   hand-written asm using the new command line option --scfi=experimental
>   on x86-64.
> 
> I will add a mention of "ABI conformance" in as.texi.

Maybe for NEWS this is enough; personally I don't think it is when there
are multiple ABIs for the/any affected architecture. You limiting support
to ELF doesn't mean the Windows ABI isn't in use. As said, UEFI uses that
ABI. And GNU ld can link ELF objects into EFI binaries.

Jan
  
Indu Bhagat Jan. 10, 2024, 7:43 p.m. UTC | #11
On 1/10/24 06:15, Jan Beulich wrote:
> On 10.01.2024 12:26, Indu Bhagat wrote:
>> On 1/10/24 01:44, Jan Beulich wrote:
>>> On 10.01.2024 07:10, Indu Bhagat wrote:
>>>> On 1/9/24 01:30, Jan Beulich wrote:
>>>>> On 08.01.2024 20:33, Indu Bhagat wrote:
>>>>>> On 1/5/24 05:58, Jan Beulich wrote:
>>>>>>> On 03.01.2024 08:15, Indu Bhagat wrote:
>>>>>>>> +/* Check whether a '66H' prefix accompanies the instruction.
>>>>>>>
>>>>>>> With APX 16-bit operand size isn't necessarily represented by a 66h
>>>>>>> prefix, but perhaps with an "embedded prefix" inside the EVEX one.
>>>>>>> Therefore both the comment and even more so ...
>>>>>>>
>>>>>>>> +   The current users of this API are in the handlers for PUSH, POP
>>>>>>>> +   instructions.  These instructions affect the stack pointer implicitly:  the
>>>>>>>> +   operand size (16, 32, or 64 bits) determines the amount by which the stack
>>>>>>>> +   pointer is decremented (2, 4 or 8).  When '66H' prefix is present, the
>>>>>>>> +   instruction has a 16-bit operand.  */
>>>>>>>> +
>>>>>>>> +static bool
>>>>>>>> +ginsn_prefix_66H_p (i386_insn insn)
>>>>>>>
>>>>>>> ... the function name would better not allude to just the legacy
>>>>>>> encoding. Maybe ginsn_opsize_prefix_p()?
>>>>>>>
>>>>>>
>>>>>> Isn't 66H_p more readable and easier to follow because that's what the
>>>>>> function is currently checking ?  If more scenarios were being handled,
>>>>>> ginsn_opsize_prefix_p () would fit better.
>>>>>
>>>>> Well, as said - with APX you can't get away with just 0x66 prefix checking.
>>>>> That prefix is simply illegal to use with EVEX-encoded insns.
>>>>>
>>>>
>>>> I am using the following in ginsn_opsize_prefix_p ():
>>>>
>>>> !(i.prefix[REX_PREFIX] & REX_W) && i.prefix[DATA_PREFIX] == 0x66
>>>
>>> That addresses one half of my earlier remarks. Note however that elsewhere
>>> we never check i.prefix[DATA_PREFIX] against being 0x66; we only ever check
>>> for it being zero or non-zero. I'd like this to remain consistent.
>>>
>>> For EVEX-encoded APX insns this isn't going to be sufficient though. See
>>> respective code in process_suffix():
>>>
>>> 	  /* The DATA PREFIX of EVEX promoted from legacy APX instructions
>>> 	     needs to be adjusted.  */
>>> 	  if (i.tm.opcode_space == SPACE_EVEXMAP4)
>>> 	    {
>>> 	      gas_assert (!i.tm.opcode_modifier.opcodeprefix);
>>> 	      i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
>>> 	    }
>>> 	  else if (!add_prefix (prefix))
>>> 	    return 0;
>>>
>>> So you'll need to also check for that combination, plus take care of
>>> avoiding insns where PREFIX_0X66 alters operation, not operand size
>>> (ADCX being an example).
>>>
>>
>> But I am now disallowing APX insns for now, altogether, by doing this in
>> the x86_ginsn_new:
>>
>>     /* Until it is clear how to handle APX NDD and other new opcodes, disallow
>>        them from SCFI.  */
>>     if (is_apx_rex2_encoding ()
>>         || (i.tm.opcode_modifier.evex && is_apx_evex_encoding ()))
>>       {
>>         as_bad (_("SCFI: unsupported APX op %#x may cause incorrect CFI"),
>>                 opcode);
>>         return ginsn;
>>       }
> 
> Well, if that's what you want to do ... (Even if you wanted to not support
> APX for now, I would have expected the catch-all looking at just the
> destination register to properly diagnose any use.)
> 

I did intend to do it like that (tackle them in the unhandled insns
codepath). I tested some APX insns (push2, pop2, push2p, add, adc,
etc.) and added a testcase, scfi-unsupported-insn-1.s. So far so good.

But later I realized that I should test more APX insns in general:
failures in translation to ginsn are what I would like to test more.

I will work this out in a separate patch after the series goes in,
removing this check and updating ginsn_opsize_prefix_p ().

>>>>>>>> +    }
>>>>>>>> +
>>>>>>>> +  ginsn_set_where (ginsn);
>>>>>>>> +
>>>>>>>> +  return ginsn;
>>>>>>>> +}
>>>>>>>
>>>>>>> Throughout this function (and perhaps others as well), how come you
>>>>>>> don't consider operand size at all? It matters whether results are
>>>>>>> 64-bit, 32-bit, or 16-bit, after all.
>>>>>>
>>>>>> Operation size matters: No, not for all instructions in context of SCFI.
>>>>>> For instructions using stack (save / restore), the size is important.
>>>>>> But for other instructions, operation size will not affect SCFI correctness.
>>>>>
>>>>> But aren't you wrongly treating an update of %rbp and an update of,
>>>>> say, %bp the same then? The latter should end up as untraceable, aiui.
>>>>>
>>>>
>>>> For ALU ops:
>>>>      - ADD/SUB reg1, reg2
>>>>        If reg2 is the same as REG_CFA, then even in 64-bit mode, this
>>>> causes untraceability. So there is untraceability irrespective of
>>>> operation size. On the other hand, if uninteresting, it will remain
>>>> uninteresting irrespective of operation size.
>>>>      - ADD/SUB imm, reg will never trigger untraceability, irrespective of
>>>> size. There is the element of ignoring the carry bit, but I think the
>>>> current diagnostics around "asymmetrical save/restore" and the planned
>>>> "unbalanced stack at return" should provide user with some safety net.
>>>
>>> I don't see how the carry flag would matter here. What does matter is the
>>> size of the register: Anything not 64-bit will need to trigger
>>> untraceability imo.
>>>
>>
>>    - For a 16-bit ADD/SUB imm, reg insn: if the 16-bit result had a
>> carry, it will be ignored as only 16-bit result is copied to the
>> destination address IIUC.  If there is no carry bit, 16-bit ADD/SUB imm,
>> reg is semantically the same insn as a 64-bit ADD/SUB imm, reg for the
>> same immediate.
> 
> Oh, that's where you see CF coming into play. I wouldn't view it from
> this angle - you simply don't know whether overflow/underflow would
> occur, so it's no different to ...
> 
>>    - For 32-bit ADD/SUB imm, reg insn: yes, I stand corrected. It should
>> trigger untraceability as upper 32-bits are zeroed out. IMO, this 32-bit
>> operation should likely be unintentional by the user; something better
>> alerted to the user if we can.
> 
> ... this case.
> 

Hmm. Yes, I agree. I will address handling of 8-bit/16-bit/32-bit ops in 
a separate patch later.
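
To make the operand-size point concrete, here is a rough sketch of how 
the same ADD imm, reg ends up with a different 64-bit register value 
depending on operation width.  gpr_write () and add_imm () are 
hypothetical helpers written only for this illustration; they are not 
part of GAS, and only 16/32/64-bit widths are modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Model of how x86-64 stores a WIDTH-bit result into a 64-bit GPR.
   Hypothetical helper for illustration only; not part of GAS.  */
static uint64_t
gpr_write (uint64_t old, uint64_t result, unsigned int width)
{
  assert (width == 16 || width == 32 || width == 64);
  switch (width)
    {
    case 16:
      /* A 16-bit write keeps bits 63:16 of the old value.  */
      return (old & ~UINT64_C (0xffff)) | (result & UINT64_C (0xffff));
    case 32:
      /* A 32-bit write zero-extends into bits 63:32.  */
      return result & UINT64_C (0xffffffff);
    default:
      return result;
    }
}

/* ADD imm, reg performed at the given operand width.  */
static uint64_t
add_imm (uint64_t reg, int64_t imm, unsigned int width)
{
  return gpr_write (reg, reg + (uint64_t) imm, width);
}
```

With reg = 0x00007ffd1234fff8 and imm = 0x10, the 64-bit form gives 
0x00007ffd12350008, the 16-bit form gives 0x00007ffd12340008 (the carry 
out of bit 15 is lost), and the 32-bit form gives 0x12350008 (upper half 
zeroed).  Since the assembler cannot know the run-time value, none of 
the narrower forms can be treated like the 64-bit one.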

>>>>>>>> +/* Generate one or more generic GAS instructions, a.k.a, ginsns for the current
>>>>>>>> +   machine instruction.
>>>>>>>> +
>>>>>>>> +   Returns the head of linked list of ginsn(s) added, if success; Returns NULL
>>>>>>>> +   if failure.
>>>>>>>> +
>>>>>>>> +   The input ginsn_gen_mode GMODE determines the set of minimal necessary
>>>>>>>> +   ginsns necessary for correctness of any passes applicable for that mode.
>>>>>>>> +   For supporting the GINSN_GEN_SCFI generation mode, following is the list of
>>>>>>>> +   machine instructions that must be translated into the corresponding ginsns
>>>>>>>> +   to ensure correctness of SCFI:
>>>>>>>> +     - All instructions affecting the two registers that could potentially
>>>>>>>> +       be used as the base register for CFA tracking.  For SCFI, the base
>>>>>>>> +       register for CFA tracking is limited to REG_SP and REG_FP only for
>>>>>>>> +       now.
>>>>>>>> +     - All change of flow instructions: conditional and unconditional branches,
>>>>>>>> +       call and return from functions.
>>>>>>>> +     - All instructions that can potentially be a register save / restore
>>>>>>>> +       operation.
>>>>>>>
>>>>>>> This could do with being more fine grained, as "potentially" is pretty vague,
>>>>>>> and (as per earlier version review comments) my take on this is a much wider
>>>>>>> set than yours.
>>>>>>
>>>>>> I would like to understand more on this comment, especially the "my take
>>>>>> on this is a much wider set than yours".  I see it's being hinted at in
>>>>>> different flavors in the current review.
>>>>>>
>>>>>> I see some issues pointed out in this review (addressing modes of mov
>>>>>> etc, safe to skip opcodes for TEST, CMP) etc., but it seems that your
>>>>>> concerns are wider than this.
>>>>>
>>>>> In an earlier version review I mentioned that even vector or mask registers
>>>>> could in principle be used to hold preserved GPR values. I seem to recall
>>>>> that you said you wouldn't want to deal with such. Hence my use of
>>>>> "wider set": Just to give an example, "kmovq %rbp, %k0" plus later
>>>>> "kmovq %k0, %rbp" is a pair of "instructions that can potentially be a
>>>>> register save / restore operation".
>>>>>
>>>>
>>>> Hmm. I will need to understand them on a case to case basis.  For the
>>>> case of "kmovq %rbp, %k0" / "kmovq %k0, %rbp" how can this be used as
>>>> save/restore to/from stack ?
>>>
>>> Maybe I'm still not having a clear enough picture of what forms of insns
>>> you want to fully track. Said insn forms don't access the stack. But they
>>> could in principle be used to preserve a certain register. Such preserving
>>> of registers is part of what needs encoding in CFI, isn't it?
>>>
>>
>> The kind of preserving is usually on stack. It can also be in another
>> callee-saved register, in theory, but the latter defeats the purpose of
>> state saving across calls.
> 
> Callee-preserved registers, when they have a special purpose in the
> architecture (like %rsi, %rdi, and %rbx have) may be cheaper to
> preserve by moving to a call-clobbered register that isn't otherwise
> used in the function. In the SysV ABI this only affects %rbx, the
> special purpose of which is extremely limited in the ISA (xlatb). In
> the Windows ABI, otoh, %rsi and %rdi are callee-preserved, and those
> have very common uses in the string insns.
> 

I am not sure I follow completely. Call-clobbered registers are not of 
interest for SCFI...

>> BTW, I had earlier written down some notes about SCFI
>> https://sourceware.org/pipermail/binutils/2023-September/129558.html
>> Some bits are stale already though, but may be it helps.
> 
> I had read through this before first reviewing v1 (I think it was).
> 
>>>>>>>> +    case 0xc2:
>>>>>>>> +    case 0xc3:
>>>>>>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>>>>>>> +	break;
>>>>>>>> +      /* Near ret.  */
>>>>>>>> +      ginsn = ginsn_new_return (insn_end_sym, true);
>>>>>>>> +      ginsn_set_where (ginsn);
>>>>>>>> +      break;
>>>>>>>
>>>>>>> No tracking of the stack pointer adjustment?
>>>>>>
>>>>>> No stack unwind information for a function is relevant after the
>>>>>> function has returned.  So, tracking of stack pointer adjustment by
>>>>>> return is not necessary.
>>>>>
>>>>> What information does the "return" insn then carry, beyond it being
>>>>> an unconditional branch (which you have a different insn for)?
>>>>>
>>>>
>>>> "return" does not carry any more information than just the
>>>> GINSN_TYPE_RETURN as ginsn->type.
>>>>
>>>> So then why support both "return" and an unconditional branch: The
>>>> intention is to carry the semantic difference between ret and
>>>> unconditional jump.  Unconditional jumps may be to a label within
>>>> function, and in those cases, we use it for some validation and BB
>>>> linking when creating CFG. Return, OTOH, always indicates exit from
>>>> function.
>>>>
>>>> For SCFI purposes, above is the one use.  Future analyses may find other
>>>> use-cases for an explicit return ginsn.  But IMO, keeping
>>>> GINSN_TYPE_RETURN as an explicit insn makes the overall offering cleaner.
>>>
>>> Okay. And here you don't bother decoding operands. Hence why I'm
>>> asking the same to be the case for (e.g.) CALL.
>>>
>>
>> It seems I will need to deal with operands of RETURN insn soon.  For
>> implementing "Warn if imbalanced stack at return", we will need this info.
> 
> Will you? Isn't stack state _before_ the RET what matters (and hence
> the optional immediate still doesn't matter)?
> 

RET with operand makes this tricky.

My initial thought was:
"Balanced stack at function return" will check that the RSP at the entry 
of the function (after the call instruction) is the same as that at the 
return from the function (before the return instruction).

Now if RET with operand (which tells how much stack to pop before an 
eventual return) is in effect, I do need to check the RSP value right 
before the RETURN (RETURN being the microOP/ginsn equivalent).

Basically, we want to alert if RSP does not point to the return address 
at RET.

>>>>>>>> @@ -5830,6 +6804,14 @@ md_assemble (char *line)
>>>>>>>>        /* We are ready to output the insn.  */
>>>>>>>>        output_insn (last_insn);
>>>>>>>>      
>>>>>>>> +  /* PS: SCFI is enabled only for AMD64 ABI.  The ABI check has been
>>>>>>>> +     performed in i386_target_format.  */
>>>>>>>
>>>>>>> See my earlier comment - it's yet more restrictive (as in not covering
>>>>>>> e.g. the Windows ABI, which importantly is also used in EFI).
>>>>>>
>>>>>> Not clear, can you please elaborate ?
>>>>>
>>>>> Hmm, it's not clear to me what's not clear in my earlier comment. The
>>>>> set of preserved registers, for example, differs between the SysV and
>>>>> the Windows ABIs (see x86_scfi_callee_saved_p()). Then again, thinking
>>>>> of it, talking of anything ABI-ish in assembly code is questionable.
>>>>> An assembly function calling another assembly function may use an
>>>>> entirely custom "ABI". You just cannot guess upon preserved registers.
>>>>>
>>>>
>>>> I think the confusion is stemming from my usage of AMD64 colloquially to
>>>> refer to System V ABI for x86_64. I think "System V AMD64 ABI" is the
>>>> correct term. I will use that.
>>>>
>>>> And yes, GAS can only work out SCFI if there is ABI adherence.  If input
>>>> asm does not adhere to the supported ABIs, and the user invokes SCFI,
>>>> then the user is on their own.  GAS will not (rather can not) warn about
>>>> ABI incompliance.
>>>
>>> I don't recall documentation stating this explicitly. This is a pretty
>>> important aspect for users to consider, after all.
>>>
>>
>> I added a reference in gas/NEWS for now.
>>
>> * Experimental support in GAS to synthesize CFI for ABI-conformant,
>>     hand-written asm using the new command line option
>> --scfi=experimental  on x86-64.
>>
>> I will add a mention of "ABI conformance" in as.texi.
> 
> Maybe for NEWS this is enough; personally I don't think it is when there
> are multiple ABIs for the/any affected architecture. You limiting support
> to ELF doesn't mean the Windows ABI isn't in use. As said, UEFI uses that
> ABI. And GNU ld can link ELF objects into EFI binaries.
> 

I can add "Supported for System V AMD64 ABI" too in the gas/NEWS and 
as.texi for now. I don't have a resolution to "disallow Windows ABI on 
ELF with SCFI".  I will take suggestions.
  
Jan Beulich Jan. 11, 2024, 8:13 a.m. UTC | #12
On 10.01.2024 20:43, Indu Bhagat wrote:
> On 1/10/24 06:15, Jan Beulich wrote:
>> On 10.01.2024 12:26, Indu Bhagat wrote:
>>> On 1/10/24 01:44, Jan Beulich wrote:
>>>> On 10.01.2024 07:10, Indu Bhagat wrote:
>>>>> On 1/9/24 01:30, Jan Beulich wrote:
>>>>>> On 08.01.2024 20:33, Indu Bhagat wrote:
>>>>>>> On 1/5/24 05:58, Jan Beulich wrote:
>>>>>>>> On 03.01.2024 08:15, Indu Bhagat wrote:
>>>>>>>>> +/* Generate one or more generic GAS instructions, a.k.a, ginsns for the current
>>>>>>>>> +   machine instruction.
>>>>>>>>> +
>>>>>>>>> +   Returns the head of linked list of ginsn(s) added, if success; Returns NULL
>>>>>>>>> +   if failure.
>>>>>>>>> +
>>>>>>>>> +   The input ginsn_gen_mode GMODE determines the set of minimal necessary
>>>>>>>>> +   ginsns necessary for correctness of any passes applicable for that mode.
>>>>>>>>> +   For supporting the GINSN_GEN_SCFI generation mode, following is the list of
>>>>>>>>> +   machine instructions that must be translated into the corresponding ginsns
>>>>>>>>> +   to ensure correctness of SCFI:
>>>>>>>>> +     - All instructions affecting the two registers that could potentially
>>>>>>>>> +       be used as the base register for CFA tracking.  For SCFI, the base
>>>>>>>>> +       register for CFA tracking is limited to REG_SP and REG_FP only for
>>>>>>>>> +       now.
>>>>>>>>> +     - All change of flow instructions: conditional and unconditional branches,
>>>>>>>>> +       call and return from functions.
>>>>>>>>> +     - All instructions that can potentially be a register save / restore
>>>>>>>>> +       operation.
>>>>>>>>
>>>>>>>> This could do with being more fine grained, as "potentially" is pretty vague,
>>>>>>>> and (as per earlier version review comments) my take on this is a much wider
>>>>>>>> set than yours.
>>>>>>>
>>>>>>> I would like to understand more on this comment, especially the "my take
>>>>>>> on this is a much wider set than yours".  I see it's being hinted at in
>>>>>>> different flavors in the current review.
>>>>>>>
>>>>>>> I see some issues pointed out in this review (addressing modes of mov
>>>>>>> etc, safe to skip opcodes for TEST, CMP) etc., but it seems that your
>>>>>>> concerns are wider than this.
>>>>>>
>>>>>> In an earlier version review I mentioned that even vector or mask registers
>>>>>> could in principle be used to hold preserved GPR values. I seem to recall
>>>>>> that you said you wouldn't want to deal with such. Hence my use of
>>>>>> "wider set": Just to give an example, "kmovq %rbp, %k0" plus later
>>>>>> "kmovq %k0, %rbp" is a pair of "instructions that can potentially be a
>>>>>> register save / restore operation".
>>>>>>
>>>>>
>>>>> Hmm. I will need to understand them on a case to case basis.  For the
>>>>> case of "kmovq %rbp, %k0" / "kmovq %k0, %rbp" how can this be used as
>>>>> save/restore to/from stack ?
>>>>
>>>> Maybe I'm still not having a clear enough picture of what forms of insns
>>>> you want to fully track. Said insn forms don't access the stack. But they
>>>> could in principle be used to preserve a certain register. Such preserving
>>>> of registers is part of what needs encoding in CFI, isn't it?
>>>>
>>>
>>> The kind of preserving is usually on stack. It can also be in another
>>> callee-saved register, in theory, but the latter defeats the purpose of
>>> state saving across calls.
>>
>> Callee-preserved registers, when they have a special purpose in the
>> architecture (like %rsi, %rdi, and %rbx have) may be cheaper to
>> preserve by moving to a call-clobbered register that isn't otherwise
>> used in the function. In the SysV ABI this only affects %rbx, the
>> special purpose of which is extremely limited in the ISA (xlatb). In
>> the Windows ABI, otoh, %rsi and %rdi are callee-preserved, and those
>> have very common uses in the string insns.
>>
> 
> I am not sure I follow completely. Call-clobbered registers are not of 
> interest for SCFI...

Well, what's x86_scfi_callee_saved_p() about if the distinction isn't
relevant?
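
To illustrate why the distinction matters, here is a rough sketch of a 
predicate parameterized by ABI.  The toy_* names and the DW2_* constants 
are made up for this example (the constants follow the DWARF register 
numbering of the x86-64 psABI); this is not a proposal for the actual 
interface.

```c
#include <assert.h>
#include <stdbool.h>

/* DWARF register numbers for some x86-64 GPRs (psABI numbering).  */
enum
{
  DW2_RAX = 0, DW2_RDX = 1, DW2_RCX = 2, DW2_RBX = 3,
  DW2_RSI = 4, DW2_RDI = 5, DW2_RBP = 6, DW2_RSP = 7
};

/* Hypothetical ABI selector, for illustration only.  */
enum toy_abi { TOY_ABI_SYSV, TOY_ABI_WIN64 };

/* Return true if the GPR with DWARF number DW2REG_NUM is callee-saved
   under ABI.  Only GPRs 0..15 are modeled.  */
static bool
toy_callee_saved_p (enum toy_abi abi, unsigned int dw2reg_num)
{
  assert (dw2reg_num <= 15);

  /* Preserved under both ABIs: rbx, rbp, rsp, r12 - r15.  */
  if (dw2reg_num == DW2_RBX
      || dw2reg_num == DW2_RBP
      || dw2reg_num == DW2_RSP
      || (dw2reg_num >= 12 && dw2reg_num <= 15))
    return true;

  /* The Windows x64 ABI additionally preserves rsi and rdi.  */
  return (abi == TOY_ABI_WIN64
	  && (dw2reg_num == DW2_RSI || dw2reg_num == DW2_RDI));
}
```

A function written to the Windows ABI that saves %rsi or %rdi would thus 
have those saves ignored by a SysV-only predicate.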

>>>>>>>>> +    case 0xc2:
>>>>>>>>> +    case 0xc3:
>>>>>>>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>>>>>>>> +	break;
>>>>>>>>> +      /* Near ret.  */
>>>>>>>>> +      ginsn = ginsn_new_return (insn_end_sym, true);
>>>>>>>>> +      ginsn_set_where (ginsn);
>>>>>>>>> +      break;
>>>>>>>>
>>>>>>>> No tracking of the stack pointer adjustment?
>>>>>>>
>>>>>>> No stack unwind information for a function is relevant after the
>>>>>>> function has returned.  So, tracking of stack pointer adjustment by
>>>>>>> return is not necessary.
>>>>>>
>>>>>> What information does the "return" insn then carry, beyond it being
>>>>>> an unconditional branch (which you have a different insn for)?
>>>>>>
>>>>>
>>>>> "return" does not carry any more information than just the
>>>>> GINSN_TYPE_RETURN as ginsn->type.
>>>>>
>>>>> So then why support both "return" and an unconditional branch: The
>>>>> intention is to carry the semantic difference between ret and
>>>>> unconditional jump.  Unconditional jumps may be to a label within
>>>>> function, and in those cases, we use it for some validation and BB
>>>>> linking when creating CFG. Return, OTOH, always indicates exit from
>>>>> function.
>>>>>
>>>>> For SCFI purposes, above is the one use.  Future analyses may find other
>>>>> use-cases for an explicit return ginsn.  But IMO, keeping
>>>>> GINSN_TYPE_RETURN as an explicit insn makes the overall offering cleaner.
>>>>
>>>> Okay. And here you don't bother decoding operands. Hence why I'm
>>>> asking the same to be the case for (e.g.) CALL.
>>>>
>>>
>>> It seems I will need to deal with operands of RETURN insn soon.  For
>>> implementing "Warn if imbalanced stack at return", we will need this info.
>>
>> Will you? Isn't stack state _before_ the RET what matters (and hence
>> the optional immediate still doesn't matter)?
>>
> 
> RET with operand makes this tricky.
> 
> My initial thought was:
> "Balanced stack at function return" will check that the RSP at the entry 
> of the function (after the call instruction) is the same as that at the 
> return from the function (before the return instruction).
> 
> Now if RET with operand (which tells how much stack to pop before an 
> eventual return) is in effect, I do need to check the RSP value right 
> before the RETURN (RETURN being the microOP/ginsn equivalent).

No, that's not how it works. RET with operand discards arguments passed
to the function (see Windows' __stdcall calling convention for an example
use). Naturally arguments are pushed _before_ the return address.

Jan
  
Indu Bhagat Jan. 11, 2024, 6:14 p.m. UTC | #13
On 1/11/24 00:13, Jan Beulich wrote:
>>>>>>>>>> +    case 0xc2:
>>>>>>>>>> +    case 0xc3:
>>>>>>>>>> +      if (i.tm.opcode_space != SPACE_BASE)
>>>>>>>>>> +	break;
>>>>>>>>>> +      /* Near ret.  */
>>>>>>>>>> +      ginsn = ginsn_new_return (insn_end_sym, true);
>>>>>>>>>> +      ginsn_set_where (ginsn);
>>>>>>>>>> +      break;
>>>>>>>>> No tracking of the stack pointer adjustment?
>>>>>>>> No stack unwind information for a function is relevant after the
>>>>>>>> function has returned.  So, tracking of stack pointer adjustment by
>>>>>>>> return is not necessary.
>>>>>>> What information does the "return" insn then carry, beyond it being
>>>>>>> an unconditional branch (which you have a different insn for)?
>>>>>>>
>>>>>> "return" does not carry any more information than just the
>>>>>> GINSN_TYPE_RETURN as ginsn->type.
>>>>>>
>>>>>> So then why support both "return" and an unconditional branch: The
>>>>>> intention is to carry the semantic difference between ret and
>>>>>> unconditional jump.  Unconditional jumps may be to a label within
>>>>>> function, and in those cases, we use it for some validation and BB
>>>>>> linking when creating CFG. Return, OTOH, always indicates exit from
>>>>>> function.
>>>>>>
>>>>>> For SCFI purposes, above is the one use.  Future analyses may find other
>>>>>> use-cases for an explicit return ginsn.  But IMO, keeping
>>>>>> GINSN_TYPE_RETURN as an explicit insn makes the overall offering cleaner.
>>>>> Okay. And here you don't bother decoding operands. Hence why I'm
>>>>> asking the same to be the case for (e.g.) CALL.
>>>>>
>>>> It seems I will need to deal with operands of RETURN insn soon.  For
>>>> implementing "Warn if imbalanced stack at return", we will need this info.
>>> Will you? Isn't stack state _before_ the RET what matters (and hence
>>> the optional immediate still doesn't matter)?
>>>
>> RET with operand makes this tricky.
>>
>> My initial thought was:
>> "Balanced stack at function return" will check that the RSP at the entry
>> of the function (after the call instruction) is the same as that at the
>> return from the function (before the return instruction).
>>
>> Now if RET with operand (which tells how much stack to pop before an
>> eventual return) is in effect, I do need to check the RSP value right
>> before the RETURN (RETURN being the microOP/ginsn equivalent).
> No, that's not how it works. RET with operand discards arguments passed
> to the function (see Windows' __stdcall calling convention for an example
> use). Naturally arguments are pushed _before_ the return address.

Correct.  Even the manual clearly says "The optional source operand 
specifies the number of stack bytes to be released after the return 
address is popped; the default is none. This operand can be used to 
release parameters from the stack that were passed to the called
procedure and are no longer needed.".

So we will not need to deal with operands of RET insn.
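
For the record, a toy model of the ordering described in the manual text 
quoted above.  struct toy_cpu and toy_ret () are hypothetical and exist 
only for this illustration; the stack is modeled as an array of 8-byte 
slots.

```c
#include <assert.h>
#include <stdint.h>

/* Toy machine state.  All names are hypothetical, for illustration.  */
struct toy_cpu
{
  uint64_t mem[16];	/* mem[k] models the 8 bytes at address 8 * k.  */
  uint64_t rsp;		/* Byte address; a multiple of 8 in this model.  */
  uint64_t rip;
};

/* Near RET imm16: the return address is popped first, and only then are
   IMM further bytes released from the stack.  */
static void
toy_ret (struct toy_cpu *cpu, uint16_t imm)
{
  assert (cpu->rsp % 8 == 0 && cpu->rsp / 8 < 16);
  cpu->rip = cpu->mem[cpu->rsp / 8];	/* Pop the return address.  */
  cpu->rsp += 8;
  cpu->rsp += imm;			/* Discard caller-pushed arguments.  */
}
```

In this model the return address is read from wherever RSP points when 
RET executes, whatever the immediate is.  So a "balanced stack" check 
only depends on the stack state before the RET, as Jan noted.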
  
Indu Bhagat Jan. 17, 2024, 1:20 a.m. UTC | #14
On 1/10/24 01:44, Jan Beulich wrote:
> On 10.01.2024 07:10, Indu Bhagat wrote:
>> On 1/9/24 01:30, Jan Beulich wrote:
>>> On 08.01.2024 20:33, Indu Bhagat wrote:
>>>> On 1/5/24 05:58, Jan Beulich wrote:
>>>>> On 03.01.2024 08:15, Indu Bhagat wrote:
>>>>>> +/* Check whether a '66H' prefix accompanies the instruction.
>>>>> With APX 16-bit operand size isn't necessarily represented by a 66h
>>>>> prefix, but perhaps with an "embedded prefix" inside the EVEX one.
>>>>> Therefore both the comment and even more so ...
>>>>>
>>>>>> +   The current users of this API are in the handlers for PUSH, POP
>>>>>> +   instructions.  These instructions affect the stack pointer implicitly:  the
>>>>>> +   operand size (16, 32, or 64 bits) determines the amount by which the stack
>>>>>> +   pointer is decremented (2, 4 or 8).  When '66H' prefix is present, the
>>>>>> +   instruction has a 16-bit operand.  */
>>>>>> +
>>>>>> +static bool
>>>>>> +ginsn_prefix_66H_p (i386_insn insn)
>>>>> ... the function name would better not allude to just the legacy
>>>>> encoding. Maybe ginsn_opsize_prefix_p()?
>>>>>
>>>> Isn't 66H_p more readable and easier to follow because that's what the
>>>> function is currently checking ?  If more scenarios were being handled,
>>>> ginsn_opsize_prefix_p () would fit better.
>>> Well, as said - with APX you can't get away with just 0x66 prefix checking.
>>> That prefix is simply illegal to use with EVEX-encoded insns.
>>>
>> I am using the following in ginsn_opsize_prefix_p ():
>>
>> !(i.prefix[REX_PREFIX] & REX_W) && i.prefix[DATA_PREFIX] == 0x66
> That addresses one half of my earlier remarks. Note however that elsewhere
> we never check i.prefix[DATA_PREFIX] against being 0x66; we only ever check
> for it being zero or non-zero. I'd like this to remain consistent.
> 
> For EVEX-encoded APX insns this isn't going to be sufficient though. See
> respective code in process_suffix():
> 
> 	  /* The DATA PREFIX of EVEX promoted from legacy APX instructions
> 	     needs to be adjusted.  */
> 	  if (i.tm.opcode_space == SPACE_EVEXMAP4)
> 	    {
> 	      gas_assert (!i.tm.opcode_modifier.opcodeprefix);
> 	      i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
> 	    }
> 	  else if (!add_prefix (prefix))
> 	    return 0;
> 
> So you'll need to also check for that combination, plus take care of
> avoiding insns where PREFIX_0X66 alters operation, not operand size
> (ADCX being an example).

[V5 is now committed. I am continuing to work on some of the discussed 
pending items from V4.]

Thanks. So it looks like, to correctly detect a 16-bit operand size, one 
needs to check for either of:
   - !(i.prefix[REX_PREFIX] & REX_W) && i.prefix[DATA_PREFIX]
   - (i.tm.opcode_space == SPACE_EVEXMAP4
      && i.tm.opcode_modifier.opcodeprefix == PREFIX_0X66)
and additionally exclude the specific ops where PREFIX_0X66 alters the 
operation rather than the operand size.  I tried looking around but 
haven't found a targeted way to identify such ops.  Is there a way ?

Alternatively, since x86_ginsn_new () will be called after 
process_suffix (), I wonder if I can update the code to simply check for 
i.suffix to be 'w' for reliably detecting the 16-bit ops for all x86 
insns.  Is the check on suffix correct and reliable ?
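
To make the prefix-based conditions concrete, here is a rough sketch 
using simplified stand-in types.  struct toy_insn and the TOY_* enums are 
made up for this example (the real state lives in 'i' in tc-i386.c, and 
the enum values here are placeholders).  The sketch assumes the two 
conditions are alternatives, and it does not handle ops like ADCX where 
the 66H prefix selects the operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the assembler state in 'i'.  */
enum toy_space { TOY_SPACE_BASE, TOY_SPACE_EVEXMAP4 };
enum toy_opcodeprefix { TOY_PREFIX_NONE, TOY_PREFIX_0X66 };

struct toy_insn
{
  bool rex_w;				/* i.prefix[REX_PREFIX] & REX_W.  */
  unsigned char data_prefix;		/* i.prefix[DATA_PREFIX].  */
  enum toy_space opcode_space;		/* i.tm.opcode_space.  */
  enum toy_opcodeprefix opcodeprefix;	/* i.tm.opcode_modifier.opcodeprefix.  */
};

/* Return true if INSN carries an operand-size override, either as a
   legacy data prefix or as the embedded prefix of an EVEX-promoted insn.
   Ops where the prefix alters the operation are NOT filtered out.  */
static bool
toy_opsize_prefix_p (const struct toy_insn *insn)
{
  assert (insn != NULL);

  /* Legacy encoding: data prefix present (zero / non-zero test only)
     and REX.W clear.  */
  if (!insn->rex_w && insn->data_prefix != 0)
    return true;

  /* EVEX-promoted legacy insn: prefix recorded in the template.  */
  return (insn->opcode_space == TOY_SPACE_EVEXMAP4
	  && insn->opcodeprefix == TOY_PREFIX_0X66);
}
```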
  
Jan Beulich Jan. 17, 2024, 8:09 a.m. UTC | #15
On 17.01.2024 02:20, Indu Bhagat wrote:
> On 1/10/24 01:44, Jan Beulich wrote:
>> On 10.01.2024 07:10, Indu Bhagat wrote:
>>> On 1/9/24 01:30, Jan Beulich wrote:
>>>> On 08.01.2024 20:33, Indu Bhagat wrote:
>>>>> On 1/5/24 05:58, Jan Beulich wrote:
>>>>>> On 03.01.2024 08:15, Indu Bhagat wrote:
>>>>>>> +/* Check whether a '66H' prefix accompanies the instruction.
>>>>>> With APX 16-bit operand size isn't necessarily represented by a 66h
>>>>>> prefix, but perhaps with an "embedded prefix" inside the EVEX one.
>>>>>> Therefore both the comment and even more so ...
>>>>>>
>>>>>>> +   The current users of this API are in the handlers for PUSH, POP
>>>>>>> +   instructions.  These instructions affect the stack pointer implicitly:  the
>>>>>>> +   operand size (16, 32, or 64 bits) determines the amount by which the stack
>>>>>>> +   pointer is decremented (2, 4 or 8).  When '66H' prefix is present, the
>>>>>>> +   instruction has a 16-bit operand.  */
>>>>>>> +
>>>>>>> +static bool
>>>>>>> +ginsn_prefix_66H_p (i386_insn insn)
>>>>>> ... the function name would better not allude to just the legacy
>>>>>> encoding. Maybe ginsn_opsize_prefix_p()?
>>>>>>
>>>>> Isn't 66H_p more readable and easier to follow because that's what the
>>>>> function is currently checking ?  If more scenarios were being handled,
>>>>> ginsn_opsize_prefix_p () would fit better.
>>>> Well, as said - with APX you can't get away with just 0x66 prefix checking.
>>>> That prefix is simply illegal to use with EVEX-encoded insns.
>>>>
>>> I am using the following in ginsn_opsize_prefix_p ():
>>>
>>> !(i.prefix[REX_PREFIX] & REX_W) && i.prefix[DATA_PREFIX] == 0x66
>> That addresses one half of my earlier remarks. Note however that elsewhere
>> we never check i.prefix[DATA_PREFIX] against being 0x66; we only ever check
>> for it being zero or non-zero. I'd like this to remain consistent.
>>
>> For EVEX-encoded APX insns this isn't going to be sufficient though. See
>> respective code in process_suffix():
>>
>> 	  /* The DATA PREFIX of EVEX promoted from legacy APX instructions
>> 	     needs to be adjusted.  */
>> 	  if (i.tm.opcode_space == SPACE_EVEXMAP4)
>> 	    {
>> 	      gas_assert (!i.tm.opcode_modifier.opcodeprefix);
>> 	      i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66;
>> 	    }
>> 	  else if (!add_prefix (prefix))
>> 	    return 0;
>>
>> So you'll need to also check for that combination, plus take care of
>> avoiding insns where PREFIX_0X66 alters operation, not operand size
>> (ADCX being an example).
> 
> [V5 is now committed. I am continuing to work on some of the discussed 
> pending items from V4.]
> 
> Thanks. So looks like to correctly check for prefix 66H, one needs to 
> check that:
>    - !(i.prefix[REX_PREFIX] & REX_W) && i.prefix[DATA_PREFIX]
>    - (i.tm.opcode_space == SPACE_EVEXMAP4
>       && i.tm.opcode_modifier.opcodeprefix == PREFIX_0X66);
>    - selectively handle the specific ops where PREFIX_0x66 alters 
> operation.  I tried looking around but haven't found a targetted way to 
> identify such ops.  Is there a way ?

I'm afraid I'm not aware of any.

> Alternatively, since x86_ginsn_new () will be called after 
> process_suffix (), I wonder if I can update the code to simply check for 
> i.suffix to be 'w' for reliably detecting the 16-bit ops for all x86 
> insns.  Is the check on suffix correct and reliable ?

Without doing a full audit I'm inclined to say "perhaps". One important
thing to consider here is Intel syntax, where (memory) operand size is
normally expressed via (here) "word ptr" rather than a suffix. And iirc
while suffix derivation (when none was specified) would look at
register operands, it wouldn't normally look at memory ones.

It feels to me as if it would be more robust if you simply set an
indicator in SHORT_MNEM_SUFFIX handling within process_suffix(), or if
alternatively you broke out the conditional there into a helper function
which you could then re-use for your purpose (especially in the latter
case beware of the JUMP_BYTE handling, though).

Jan
  

Patch

diff --git a/gas/Makefile.am b/gas/Makefile.am
index e174305ca62..b477d74cb53 100644
--- a/gas/Makefile.am
+++ b/gas/Makefile.am
@@ -82,6 +82,7 @@  GAS_CFILES = \
 	flonum-mult.c \
 	frags.c \
 	gen-sframe.c \
+	ginsn.c \
 	hash.c \
 	input-file.c \
 	input-scrub.c \
@@ -94,6 +95,7 @@  GAS_CFILES = \
 	remap.c \
 	sb.c \
 	scfidw2gen.c \
+	scfi.c \
 	sframe-opt.c \
 	stabs.c \
 	subsegs.c \
@@ -119,6 +121,7 @@  HFILES = \
 	flonum.h \
 	frags.h \
 	gen-sframe.h \
+	ginsn.h \
 	hash.h \
 	input-file.h \
 	itbl-lex.h \
@@ -130,6 +133,7 @@  HFILES = \
 	read.h \
 	sb.h \
 	scfidw2gen.h \
+	scfi.h \
 	subsegs.h \
 	symbols.h \
 	tc.h \
diff --git a/gas/Makefile.in b/gas/Makefile.in
index fe9f4e06195..90f6bc224de 100644
--- a/gas/Makefile.in
+++ b/gas/Makefile.in
@@ -173,12 +173,13 @@  am__objects_1 = app.$(OBJEXT) as.$(OBJEXT) atof-generic.$(OBJEXT) \
 	ecoff.$(OBJEXT) ehopt.$(OBJEXT) expr.$(OBJEXT) \
 	flonum-copy.$(OBJEXT) flonum-konst.$(OBJEXT) \
 	flonum-mult.$(OBJEXT) frags.$(OBJEXT) gen-sframe.$(OBJEXT) \
-	hash.$(OBJEXT) input-file.$(OBJEXT) input-scrub.$(OBJEXT) \
-	listing.$(OBJEXT) literal.$(OBJEXT) macro.$(OBJEXT) \
-	messages.$(OBJEXT) output-file.$(OBJEXT) read.$(OBJEXT) \
-	remap.$(OBJEXT) sb.$(OBJEXT) scfidw2gen.$(OBJEXT) \
-	sframe-opt.$(OBJEXT) stabs.$(OBJEXT) subsegs.$(OBJEXT) \
-	symbols.$(OBJEXT) write.$(OBJEXT)
+	ginsn.$(OBJEXT) hash.$(OBJEXT) input-file.$(OBJEXT) \
+	input-scrub.$(OBJEXT) listing.$(OBJEXT) literal.$(OBJEXT) \
+	macro.$(OBJEXT) messages.$(OBJEXT) output-file.$(OBJEXT) \
+	read.$(OBJEXT) remap.$(OBJEXT) sb.$(OBJEXT) \
+	scfidw2gen.$(OBJEXT) scfi.$(OBJEXT) sframe-opt.$(OBJEXT) \
+	stabs.$(OBJEXT) subsegs.$(OBJEXT) symbols.$(OBJEXT) \
+	write.$(OBJEXT)
 am_as_new_OBJECTS = $(am__objects_1)
 am__dirstamp = $(am__leading_dot)dirstamp
 as_new_OBJECTS = $(am_as_new_OBJECTS)
@@ -581,6 +582,7 @@  GAS_CFILES = \
 	flonum-mult.c \
 	frags.c \
 	gen-sframe.c \
+	ginsn.c \
 	hash.c \
 	input-file.c \
 	input-scrub.c \
@@ -593,6 +595,7 @@  GAS_CFILES = \
 	remap.c \
 	sb.c \
 	scfidw2gen.c \
+	scfi.c \
 	sframe-opt.c \
 	stabs.c \
 	subsegs.c \
@@ -617,6 +620,7 @@  HFILES = \
 	flonum.h \
 	frags.h \
 	gen-sframe.h \
+	ginsn.h \
 	hash.h \
 	input-file.h \
 	itbl-lex.h \
@@ -628,6 +632,7 @@  HFILES = \
 	read.h \
 	sb.h \
 	scfidw2gen.h \
+	scfi.h \
 	subsegs.h \
 	symbols.h \
 	tc.h \
@@ -1336,6 +1341,7 @@  distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/flonum-mult.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/frags.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/gen-sframe.Po@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ginsn.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/hash.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/input-file.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/input-scrub.Po@am__quote@
@@ -1350,6 +1356,7 @@  distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/read.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/remap.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sb.Po@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/scfi.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/scfidw2gen.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sframe-opt.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/stabs.Po@am__quote@
diff --git a/gas/as.c b/gas/as.c
index 953b9bc0b99..da85525ab48 100644
--- a/gas/as.c
+++ b/gas/as.c
@@ -45,6 +45,7 @@ 
 #include "codeview.h"
 #include "bfdver.h"
 #include "write.h"
+#include "ginsn.h"
 
 #ifdef HAVE_ITBL_CPU
 #include "itbl-ops.h"
@@ -245,6 +246,7 @@  Options:\n\
                       	  d      omit debugging directives\n\
                       	  g      include general info\n\
                       	  h      include high-level source\n\
+                      	  i      include ginsn and synthesized CFI info\n\
                       	  l      include assembly\n\
                       	  m      include macro expansions\n\
                       	  n      omit forms processing\n\
@@ -1089,6 +1091,9 @@  This program has absolutely no warranty.\n"));
 		    case 'h':
 		      listing |= LISTING_HLL;
 		      break;
+		    case 'i':
+		      listing |= LISTING_GINSN_SCFI;
+		      break;
 		    case 'l':
 		      listing |= LISTING_LISTING;
 		      break;
diff --git a/gas/config/obj-elf.c b/gas/config/obj-elf.c
index 1b77b2715d1..6aa9376c5a7 100644
--- a/gas/config/obj-elf.c
+++ b/gas/config/obj-elf.c
@@ -24,6 +24,7 @@ 
 #include "subsegs.h"
 #include "obstack.h"
 #include "dwarf2dbg.h"
+#include "ginsn.h"
 
 #ifndef ECOFF_DEBUGGING
 #define ECOFF_DEBUGGING 0
@@ -2311,6 +2312,13 @@  obj_elf_size (int ignore ATTRIBUTE_UNUSED)
       symbol_get_obj (sym)->size = XNEW (expressionS);
       *symbol_get_obj (sym)->size = exp;
     }
+
+  /* If the symbol in the directive matches the current function being
+     processed, indicate end of the current stream of ginsns.  */
+  if (flag_synth_cfi
+      && S_IS_FUNCTION (sym) && sym == ginsn_data_func_symbol ())
+    ginsn_data_end (symbol_temp_new_now ());
+
   demand_empty_rest_of_line ();
 }
 
@@ -2499,6 +2507,14 @@  obj_elf_type (int ignore ATTRIBUTE_UNUSED)
 	elfsym->symbol.flags &= ~mask;
     }
 
+  if (S_IS_FUNCTION (sym) && flag_synth_cfi)
+    {
+      /* Wrap up processing the previous block of ginsns first.  */
+      if (frchain_now->frch_ginsn_data)
+	ginsn_data_end (symbol_temp_new_now ());
+      ginsn_data_begin (sym);
+    }
+
   demand_empty_rest_of_line ();
 }
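
The two obj-elf.c hooks above bracket a function's stream of ginsns: a .type directive marking a symbol as a function first wraps up any stream still open and then starts a new one, while a .size directive ends the stream only if it names the function currently being processed. A minimal standalone sketch of that bracketing follows. `struct stream_state`, `on_type_function` and `on_size` are hypothetical names used for illustration; the sketch compares symbol names, whereas the patch compares symbol pointers and also checks S_IS_FUNCTION.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the per-frchain ginsn data kept by gas.  */
struct stream_state
{
  const char *func;	/* Function whose ginsns are being collected.  */
  int closed_count;	/* Number of streams closed so far.  */
};

/* Close the open stream, if any.  */
static void
stream_end (struct stream_state *s)
{
  if (s->func != NULL)
    {
      s->func = NULL;
      s->closed_count++;
    }
}

/* Model of the obj_elf_type hook: a new function symbol first wraps up the
   previous stream, then starts a fresh one.  */
static void
on_type_function (struct stream_state *s, const char *sym)
{
  assert (sym != NULL);
  stream_end (s);
  s->func = sym;
}

/* Model of the obj_elf_size hook: only a .size for the function currently
   being processed ends the stream.  */
static void
on_size (struct stream_state *s, const char *sym)
{
  assert (sym != NULL);
  if (s->func != NULL && strcmp (s->func, sym) == 0)
    stream_end (s);
}
```

In this model, a function that lacks a matching .size is still closed when the next function's .type arrives, which mirrors the "wrap up the previous block" comment in the hunk.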
 
diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 694c494edec..d76765c2bb5 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -30,6 +30,7 @@ 
 #include "subsegs.h"
 #include "dwarf2dbg.h"
 #include "dw2gencfi.h"
+#include "scfi.h"
 #include "gen-sframe.h"
 #include "sframe.h"
 #include "elf/x86-64.h"
@@ -5287,6 +5288,978 @@  static INLINE bool may_need_pass2 (const insn_template *t)
 	       && t->base_opcode == 0x63);
 }
 
+bool
+x86_scfi_callee_saved_p (unsigned int dw2reg_num)
+{
+  if (dw2reg_num == 3 /* rbx.  */
+      || dw2reg_num == REG_FP /* rbp.  */
+      || dw2reg_num == REG_SP /* rsp.  */
+      || (dw2reg_num >= 12 && dw2reg_num <= 15) /* r12 - r15.  */)
+    return true;
+
+  return false;
+}
+
+/* Check whether a '66H' prefix accompanies the instruction.
+   The current users of this API are in the handlers for PUSH, POP
+   instructions.  These instructions affect the stack pointer implicitly:  the
+   operand size (16, 32, or 64 bits) determines the amount by which the stack
+   pointer is decremented (2, 4 or 8).  When '66H' prefix is present, the
+   instruction has a 16-bit operand.  */
+
+static bool
+ginsn_prefix_66H_p (i386_insn insn)
+{
+  return (insn.prefix[DATA_PREFIX] == 0x66);
+}
+
+/* Get the DWARF register number for the given register entry.
+   For specific byte/word register accesses like al, cl, ah, ch, r8d etc.,
+   there is no valid DWARF register number.  This function is a hack - it
+   relies on relative ordering of reg entries in the i386_regtab.  FIXME - it
+   will be good to allow a more direct way to get this information.  */
+
+static unsigned int
+ginsn_dw2_regnum (const reg_entry *ireg)
+{
+  /* PS: Note the data type here as int32_t, because of Dw2Inval (-1).  */
+  int32_t dwarf_reg = Dw2Inval;
+  const reg_entry *temp = ireg;
+
+  /* ginsn creation is available for AMD64 abi only ATM.  Other flag_code
+     are not expected.  */
+  gas_assert (flag_code == CODE_64BIT);
+
+  if (ireg <= &i386_regtab[3])
+    /* For al, cl, dl, bl, bump over to axl, cxl, dxl, bxl respectively by
+       adding 8.  */
+    temp = ireg + 8;
+  else if (ireg <= &i386_regtab[7])
+    /* For ah, ch, dh, bh, bump over to axl, cxl, dxl, bxl respectively by
+       adding 4.  */
+    temp = ireg + 4;
+
+  dwarf_reg = temp->dw2_regnum[flag_code >> 1];
+  if (dwarf_reg == Dw2Inval)
+    {
+      /* The code relies on the relative ordering of the reg entries in
+	 i386_regtab.  The assertion here ensures the code does not recurse
+	 indefinitely.  */
+      gas_assert (temp + 16 < &i386_regtab[i386_regtab_size - 1]);
+      temp = temp + 16;
+      dwarf_reg = ginsn_dw2_regnum (temp);
+    }
+
+  gas_assert (dwarf_reg != Dw2Inval); /* Needs to be addressed.  */
+
+  return (unsigned int) dwarf_reg;
+}
+
+static ginsnS *
+x86_ginsn_addsub_reg_mem (const symbolS *insn_end_sym)
+{
+  unsigned int dw2_regnum;
+  unsigned int src2_dw2_regnum;
+  ginsnS *ginsn = NULL;
+  ginsnS * (*ginsn_func) (const symbolS *, bool,
+			  enum ginsn_src_type, unsigned int, offsetT,
+			  enum ginsn_src_type, unsigned int, offsetT,
+			  enum ginsn_dst_type, unsigned int, offsetT);
+  uint16_t opcode = i.tm.base_opcode;
+
+  gas_assert (opcode == 0x1 || opcode == 0x29);
+  ginsn_func = (opcode == 0x1) ? ginsn_new_add : ginsn_new_sub;
+
+  /* op %reg, symbol.  */
+  if (i.mem_operands == 1 && !i.base_reg && !i.index_reg)
+    return ginsn;
+
+  /* op reg, reg/mem.  */
+  dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
+  if (i.reg_operands == 2)
+    {
+      src2_dw2_regnum = ginsn_dw2_regnum (i.op[1].regs);
+      ginsn = ginsn_func (insn_end_sym, true,
+			  GINSN_SRC_REG, dw2_regnum, 0,
+			  GINSN_SRC_REG, src2_dw2_regnum, 0,
+			  GINSN_DST_REG, src2_dw2_regnum, 0);
+      ginsn_set_where (ginsn);
+    }
+  /* Other cases where destination involves indirect access are unnecessary
+     for SCFI correctness.  TBD_GINSN_GEN_NOT_SCFI.  */
+
+  return ginsn;
+}
+
+static ginsnS *
+x86_ginsn_addsub_mem_reg (const symbolS *insn_end_sym)
+{
+  unsigned int dw2_regnum;
+  unsigned int src2_dw2_regnum;
+  const reg_entry *mem_reg;
+  int32_t gdisp = 0;
+  ginsnS *ginsn = NULL;
+  ginsnS * (*ginsn_func) (const symbolS *, bool,
+			  enum ginsn_src_type, unsigned int, offsetT,
+			  enum ginsn_src_type, unsigned int, offsetT,
+			  enum ginsn_dst_type, unsigned int, offsetT);
+  uint16_t opcode = i.tm.base_opcode;
+
+  gas_assert (opcode == 0x3 || opcode == 0x2b);
+  ginsn_func = (opcode == 0x3) ? ginsn_new_add : ginsn_new_sub;
+
+  /* op symbol, %reg.  */
+  if (i.mem_operands && !i.base_reg && !i.index_reg)
+    return ginsn;
+  /* op mem, %reg.  */
+  dw2_regnum = ginsn_dw2_regnum (i.op[1].regs);
+
+  if (i.mem_operands)
+    {
+      mem_reg = (i.base_reg) ? i.base_reg : i.index_reg;
+      src2_dw2_regnum = ginsn_dw2_regnum (mem_reg);
+      if (i.disp_operands == 1)
+	gdisp = i.op[0].disps->X_add_number;
+      ginsn = ginsn_func (insn_end_sym, true,
+			  GINSN_SRC_INDIRECT, src2_dw2_regnum, gdisp,
+			  GINSN_SRC_REG, dw2_regnum, 0,
+			  GINSN_DST_REG, dw2_regnum, 0);
+      ginsn_set_where (ginsn);
+    }
+
+  return ginsn;
+}
+
+static ginsnS *
+x86_ginsn_alu_imm (const symbolS *insn_end_sym)
+{
+  offsetT src_imm;
+  unsigned int dw2_regnum;
+  ginsnS *ginsn = NULL;
+  enum ginsn_src_type src_type = GINSN_SRC_REG;
+  enum ginsn_dst_type dst_type = GINSN_DST_REG;
+
+  ginsnS * (*ginsn_func) (const symbolS *, bool,
+			  enum ginsn_src_type, unsigned int, offsetT,
+			  enum ginsn_src_type, unsigned int, offsetT,
+			  enum ginsn_dst_type, unsigned int, offsetT);
+
+  /* FIXME - create ginsn where dest is REG_SP / REG_FP only ? */
+  /* Map for insn.tm.extension_opcode
+     000 ADD    100 AND
+     001 OR     101 SUB
+     010 ADC    110 XOR
+     011 SBB    111 CMP  */
+
+  /* add/sub/and imm, %reg only at this time for SCFI.
+     Although all three (and, or, xor) make the destination reg untraceable,
+     only the and op is handled here, not or/xor, because we will look into
+     supporting the DRAP pattern at some point.  */
+  if (i.tm.extension_opcode == 5)
+    ginsn_func = ginsn_new_sub;
+  else if (i.tm.extension_opcode == 4)
+    ginsn_func = ginsn_new_and;
+  else if (i.tm.extension_opcode == 0)
+    ginsn_func = ginsn_new_add;
+  else
+    return ginsn;
+
+  /* TBD_GINSN_REPRESENTATION_LIMIT: There is no representation for when a
+     symbol is used as an operand, like so:
+	  addq    $simd_cmp_op+8, %rdx
+     Skip generating any ginsn for this.  */
+  if (i.imm_operands == 1
+      && i.op[0].imms->X_op != O_constant)
+    return ginsn;
+
+  /* addq    $1, symbol
+     addq    $1, -16(%rbp)
+     Such instructions are not of interest for SCFI.  */
+  if (i.mem_operands == 1)
+    return ginsn;
+
+  gas_assert (i.imm_operands == 1);
+  src_imm = i.op[0].imms->X_add_number;
+  /* The second operand may be a register or indirect access.  For SCFI, only
+     the case when the second opnd is a register is interesting.  Revisit this
+     if generating ginsns for a different gen mode TBD_GINSN_GEN_NOT_SCFI.  */
+  if (i.reg_operands == 1)
+    {
+      dw2_regnum = ginsn_dw2_regnum (i.op[1].regs);
+      /* For ginsn, keep the imm as second src operand.  */
+      ginsn = ginsn_func (insn_end_sym, true,
+			  src_type, dw2_regnum, 0,
+			  GINSN_SRC_IMM, 0, src_imm,
+			  dst_type, dw2_regnum, 0);
+
+      ginsn_set_where (ginsn);
+    }
+
+  return ginsn;
+}
+
+static ginsnS *
+x86_ginsn_move (const symbolS *insn_end_sym)
+{
+  ginsnS *ginsn;
+  unsigned int dst_reg;
+  unsigned int src_reg;
+  offsetT src_disp = 0;
+  offsetT dst_disp = 0;
+  const reg_entry *dst = NULL;
+  const reg_entry *src = NULL;
+  uint16_t opcode = i.tm.base_opcode;
+  enum ginsn_src_type src_type = GINSN_SRC_REG;
+  enum ginsn_dst_type dst_type = GINSN_DST_REG;
+
+  if (opcode == 0x8b || opcode == 0x8a)
+    {
+      /* mov  disp(%reg), %reg.  */
+      if (i.mem_operands && i.base_reg)
+	{
+	  src = i.base_reg;
+	  if (i.disp_operands == 1)
+	    src_disp = i.op[0].disps->X_add_number;
+	  src_type = GINSN_SRC_INDIRECT;
+	}
+      else
+	src = i.op[0].regs;
+
+      dst = i.op[1].regs;
+    }
+  else if (opcode == 0x89 || opcode == 0x88)
+    {
+      /* mov %reg, disp(%reg).  */
+      src = i.op[0].regs;
+      if (i.mem_operands && i.base_reg)
+	{
+	  dst = i.base_reg;
+	  if (i.disp_operands == 1)
+	    dst_disp = i.op[1].disps->X_add_number;
+	  dst_type = GINSN_DST_INDIRECT;
+	}
+      else
+	dst = i.op[1].regs;
+    }
+
+  src_reg = ginsn_dw2_regnum (src);
+  dst_reg = ginsn_dw2_regnum (dst);
+
+  ginsn = ginsn_new_mov (insn_end_sym, true,
+			 src_type, src_reg, src_disp,
+			 dst_type, dst_reg, dst_disp);
+  ginsn_set_where (ginsn);
+
+  return ginsn;
+}
+
+/* Generate appropriate ginsn for lea.
+   Sub-cases marked with TBD_GINSN_INFO_LOSS indicate some loss of information
+   in the ginsn.  But these are fine for now for GINSN_GEN_SCFI generation
+   mode.  */
+
+static ginsnS *
+x86_ginsn_lea (const symbolS *insn_end_sym)
+{
+  offsetT src_disp = 0;
+  ginsnS *ginsn = NULL;
+  unsigned int base_reg;
+  unsigned int index_reg;
+  offsetT index_scale;
+  unsigned int dst_reg;
+
+  if (!i.index_reg && !i.base_reg)
+    {
+      /* lea symbol, %rN.  */
+      dst_reg = ginsn_dw2_regnum (i.op[1].regs);
+      /* TBD_GINSN_INFO_LOSS - Skip encoding information about the symbol.  */
+      ginsn = ginsn_new_mov (insn_end_sym, false,
+			     GINSN_SRC_IMM, 0xf /* arbitrary const.  */, 0,
+			     GINSN_DST_REG, dst_reg, 0);
+    }
+  else if (i.base_reg && !i.index_reg)
+    {
+      /* lea    -0x2(%base),%dst.  */
+      base_reg = ginsn_dw2_regnum (i.base_reg);
+      dst_reg = ginsn_dw2_regnum (i.op[1].regs);
+
+      if (i.disp_operands)
+	src_disp = i.op[0].disps->X_add_number;
+
+      if (src_disp)
+	/* Generate an ADD ginsn.  */
+	ginsn = ginsn_new_add (insn_end_sym, true,
+			       GINSN_SRC_REG, base_reg, 0,
+			       GINSN_SRC_IMM, 0, src_disp,
+			       GINSN_DST_REG, dst_reg, 0);
+      else
+	/* Generate a MOV ginsn.  */
+	ginsn = ginsn_new_mov (insn_end_sym, true,
+			       GINSN_SRC_REG, base_reg, 0,
+			       GINSN_DST_REG, dst_reg, 0);
+    }
+  else if (!i.base_reg && i.index_reg)
+    {
+      /* lea (,%index,imm), %dst.  */
+      /* TBD_GINSN_INFO_LOSS - There is no explicit ginsn multiply operation,
+	 instead use GINSN_TYPE_OTHER.  */
+      index_scale = i.log2_scale_factor;
+      index_reg = ginsn_dw2_regnum (i.index_reg);
+      dst_reg = ginsn_dw2_regnum (i.op[1].regs);
+      ginsn = ginsn_new_other (insn_end_sym, true,
+			       GINSN_SRC_REG, index_reg,
+			       GINSN_SRC_IMM, index_scale,
+			       GINSN_DST_REG, dst_reg);
+    }
+  else
+    {
+      /* lea disp(%base,%index,imm), %dst.  */
+      /* TBD_GINSN_INFO_LOSS - Skip adding information about the disp and imm
+	 for index reg.  */
+      base_reg = ginsn_dw2_regnum (i.base_reg);
+      index_reg = ginsn_dw2_regnum (i.index_reg);
+      dst_reg = ginsn_dw2_regnum (i.op[1].regs);
+      /* Generate an ADD ginsn.  */
+      ginsn = ginsn_new_add (insn_end_sym, true,
+			     GINSN_SRC_REG, base_reg, 0,
+			     GINSN_SRC_REG, index_reg, 0,
+			     GINSN_DST_REG, dst_reg, 0);
+    }
+
+  ginsn_set_where (ginsn);
+
+  return ginsn;
+}
+
+static ginsnS *
+x86_ginsn_jump (const symbolS *insn_end_sym)
+{
+  ginsnS *ginsn = NULL;
+  symbolS *src_symbol;
+
+  gas_assert (i.disp_operands == 1);
+
+  /* A non-zero addend in jump target makes control-flow tracking difficult.
+     Skip SCFI for now.  */
+  if (i.op[0].disps->X_op == O_symbol && i.op[0].disps->X_add_number)
+    {
+      as_bad ("SCFI: jmp insn with non-zero addend to sym not supported");
+      return ginsn;
+    }
+
+  if (i.op[0].disps->X_op == O_symbol)
+    {
+      gas_assert (!i.op[0].disps->X_add_number);
+      src_symbol = i.op[0].disps->X_add_symbol;
+      ginsn = ginsn_new_jump (insn_end_sym, true,
+			      GINSN_SRC_SYMBOL, 0, src_symbol);
+
+      ginsn_set_where (ginsn);
+    }
+
+  return ginsn;
+}
+
+static ginsnS *
+x86_ginsn_jump_cond (const symbolS *insn_end_sym)
+{
+  ginsnS *ginsn = NULL;
+  symbolS *src_symbol;
+
+  gas_assert (i.disp_operands == 1);
+
+  /* A non-zero addend in JCC target makes control-flow tracking difficult.
+     Skip SCFI for now.  */
+  if (i.op[0].disps->X_op == O_symbol && i.op[0].disps->X_add_number)
+    {
+      as_bad ("SCFI: jcc insn with non-zero addend to sym not supported");
+      return ginsn;
+    }
+
+  if (i.op[0].disps->X_op == O_symbol)
+    {
+      gas_assert (i.op[0].disps->X_add_number == 0);
+      src_symbol = i.op[0].disps->X_add_symbol;
+      ginsn = ginsn_new_jump_cond (insn_end_sym, true,
+				   GINSN_SRC_SYMBOL, 0, src_symbol);
+      ginsn_set_where (ginsn);
+    }
+
+  return ginsn;
+}
+
+static ginsnS *
+x86_ginsn_enter (const symbolS *insn_end_sym)
+{
+  ginsnS *ginsn = NULL;
+  ginsnS *ginsn_next = NULL;
+  ginsnS *ginsn_last = NULL;
+
+  gas_assert (i.imm_operands == 2);
+
+  /* For non-zero size operands, bail out as untraceable for SCFI.  */
+  if ((i.op[0].imms->X_op != O_constant || i.op[0].imms->X_add_symbol != 0)
+      || (i.op[1].imms->X_op != O_constant || i.op[1].imms->X_add_symbol != 0))
+    {
+      as_bad ("SCFI: enter insn with non-zero operand not supported");
+      return ginsn;
+    }
+
+  /* If the nesting level is 0, the processor pushes the frame pointer from
+     the BP/EBP/RBP register onto the stack, copies the current stack
+     pointer from the SP/ESP/RSP register into the BP/EBP/RBP register, and
+     loads the SP/ESP/RSP register with the current stack-pointer value
+     minus the value in the size operand.  */
+  ginsn = ginsn_new_sub (insn_end_sym, false,
+			 GINSN_SRC_REG, REG_SP, 0,
+			 GINSN_SRC_IMM, 0, 8,
+			 GINSN_DST_REG, REG_SP, 0);
+  ginsn_set_where (ginsn);
+  ginsn_next = ginsn_new_store (insn_end_sym, false,
+				GINSN_SRC_REG, REG_FP,
+				GINSN_DST_INDIRECT, REG_SP, 0);
+  ginsn_set_where (ginsn_next);
+  gas_assert (!ginsn_link_next (ginsn, ginsn_next));
+  ginsn_last = ginsn_new_mov (insn_end_sym, false,
+			      GINSN_SRC_REG, REG_SP, 0,
+			      GINSN_DST_REG, REG_FP, 0);
+  ginsn_set_where (ginsn_last);
+  gas_assert (!ginsn_link_next (ginsn_next, ginsn_last));
+
+  return ginsn;
+}
+
+static bool
+x86_ginsn_safe_to_skip (void)
+{
+  bool skip_p = false;
+  uint16_t opcode = i.tm.base_opcode;
+
+  switch (opcode)
+    {
+    case 0x39:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      /* cmp reg, reg.  */
+      skip_p = true;
+      break;
+    case 0x85:
+      /* test reg, reg/mem.  */
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      skip_p = true;
+      break;
+    default:
+      break;
+    }
+
+  return skip_p;
+}
+
+#define X86_GINSN_UNHANDLED_NONE      0
+#define X86_GINSN_UNHANDLED_DEST_REG  1
+#define X86_GINSN_UNHANDLED_CFG       2
+#define X86_GINSN_UNHANDLED_STACKOP   3
+
+/* Check the input insn for its impact on the correctness of the synthesized
+   CFI.  Returns an error code to the caller.  */
+
+static int
+x86_ginsn_unhandled (void)
+{
+  int err = X86_GINSN_UNHANDLED_NONE;
+  const reg_entry *reg_op;
+  unsigned int dw2_regnum;
+
+  /* Keep an eye out for instructions affecting control flow.  */
+  if (i.tm.opcode_modifier.jump)
+    err = X86_GINSN_UNHANDLED_CFG;
+  /* Also, for any instructions involving an implicit update to the stack
+     pointer.  */
+  else if (i.tm.opcode_modifier.implicitstackop)
+    err = X86_GINSN_UNHANDLED_STACKOP;
+  /* Finally, also check if the missed instructions are affecting REG_SP or
+     REG_FP.  The destination operand is the last at all stages of assembly
+     (due to following AT&T syntax layout in the internal representation).  In
+     case of Intel syntax input, this still remains true as swap_operands ()
+     is done by now.
+     PS: These checks do not involve index / base reg, as indirect memory
+     accesses via REG_SP or REG_FP do not affect SCFI correctness.
+     (Also note these instructions are candidates for other ginsn generation
+     modes in future.  TBD_GINSN_GEN_NOT_SCFI.)  */
+  else if (i.operands && i.reg_operands
+	   && !(i.flags[i.operands - 1] & Operand_Mem))
+    {
+      reg_op = i.op[i.operands - 1].regs;
+      if (reg_op)
+	{
+	  dw2_regnum = ginsn_dw2_regnum (reg_op);
+	  if (dw2_regnum == REG_SP || dw2_regnum == REG_FP)
+	    err = X86_GINSN_UNHANDLED_DEST_REG;
+	}
+    }
+
+  return err;
+}
+
+/* Generate one or more generic GAS instructions, a.k.a. ginsns, for the current
+   machine instruction.
+
+   Returns the head of the linked list of ginsn(s) added on success; returns
+   NULL on failure.
+
+   The input ginsn_gen_mode GMODE determines the minimal set of ginsns
+   necessary for correctness of any passes applicable for that mode.
+   For supporting the GINSN_GEN_SCFI generation mode, following is the list of
+   machine instructions that must be translated into the corresponding ginsns
+   to ensure correctness of SCFI:
+     - All instructions affecting the two registers that could potentially
+       be used as the base register for CFA tracking.  For SCFI, the base
+       register for CFA tracking is limited to REG_SP and REG_FP only for
+       now.
+     - All change of flow instructions: conditional and unconditional branches,
+       call and return from functions.
+     - All instructions that can potentially be a register save / restore
+       operation.
+     - All instructions that perform stack manipulation implicitly: the CALL,
+       RET, PUSH, POP, ENTER, and LEAVE instructions.
+
+   The function currently supports GINSN_GEN_SCFI ginsn generation mode only.
+   To support other generation modes will require work on this target-specific
+   process of creation of ginsns:
+     - Some of such places are tagged with TBD_GINSN_GEN_NOT_SCFI to serve as
+       possible starting points.
+     - Also note that ginsn representation may need enhancements.  Specifically,
+       note some TBD_GINSN_INFO_LOSS and TBD_GINSN_REPRESENTATION_LIMIT markers.
+   */
+
+static ginsnS *
+x86_ginsn_new (const symbolS *insn_end_sym, enum ginsn_gen_mode gmode)
+{
+  int err = 0;
+  uint16_t opcode;
+  unsigned int dw2_regnum;
+  ginsnS *ginsn = NULL;
+  ginsnS *ginsn_next = NULL;
+  ginsnS *ginsn_last = NULL;
+  /* In 64-bit mode, the default stack update size is 8 bytes.  */
+  int stack_opnd_size = 8;
+
+  /* Currently supports generation of selected ginsns, sufficient for
+     the use-case of SCFI only.  */
+  if (gmode != GINSN_GEN_SCFI)
+    return ginsn;
+
+  opcode = i.tm.base_opcode;
+
+  switch (opcode)
+    {
+    case 0x1:
+      /* add reg, reg/mem.  */
+    case 0x29:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      /* sub reg, reg/mem.  */
+      ginsn = x86_ginsn_addsub_reg_mem (insn_end_sym);
+      break;
+
+    case 0x3:
+      /* add reg/mem, reg.  */
+    case 0x2b:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      /* sub reg/mem, reg.  */
+      ginsn = x86_ginsn_addsub_mem_reg (insn_end_sym);
+      break;
+
+    case 0xa0:
+    case 0xa8:
+      /* push fs / push gs have opcode_space == SPACE_0F.  */
+      if (i.tm.opcode_space != SPACE_0F)
+	break;
+      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
+      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
+      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
+	stack_opnd_size = 2;
+      /* push fs / push gs.  */
+      ginsn = ginsn_new_sub (insn_end_sym, false,
+			     GINSN_SRC_REG, REG_SP, 0,
+			     GINSN_SRC_IMM, 0, stack_opnd_size,
+			     GINSN_DST_REG, REG_SP, 0);
+      ginsn_set_where (ginsn);
+      ginsn_next = ginsn_new_store (insn_end_sym, false,
+				    GINSN_SRC_REG, dw2_regnum,
+				    GINSN_DST_INDIRECT, REG_SP, 0);
+      ginsn_set_where (ginsn_next);
+      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
+      break;
+
+    case 0xa1:
+    case 0xa9:
+      /* pop fs / pop gs have opcode_space == SPACE_0F.  */
+      if (i.tm.opcode_space != SPACE_0F)
+	break;
+      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
+      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
+      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
+	stack_opnd_size = 2;
+      /* pop fs / pop gs.  */
+      ginsn = ginsn_new_load (insn_end_sym, false,
+			      GINSN_SRC_INDIRECT, REG_SP, 0,
+			      GINSN_DST_REG, dw2_regnum);
+      ginsn_set_where (ginsn);
+      ginsn_next = ginsn_new_add (insn_end_sym, false,
+				  GINSN_SRC_REG, REG_SP, 0,
+				  GINSN_SRC_IMM, 0, stack_opnd_size,
+				  GINSN_DST_REG, REG_SP, 0);
+      ginsn_set_where (ginsn_next);
+      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
+      break;
+
+    case 0x50 ... 0x57:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      /* push reg.  */
+      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
+      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
+      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
+	stack_opnd_size = 2;
+      ginsn = ginsn_new_sub (insn_end_sym, false,
+			     GINSN_SRC_REG, REG_SP, 0,
+			     GINSN_SRC_IMM, 0, stack_opnd_size,
+			     GINSN_DST_REG, REG_SP, 0);
+      ginsn_set_where (ginsn);
+      ginsn_next = ginsn_new_store (insn_end_sym, false,
+				    GINSN_SRC_REG, dw2_regnum,
+				    GINSN_DST_INDIRECT, REG_SP, 0);
+      ginsn_set_where (ginsn_next);
+      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
+      break;
+
+    case 0x58 ... 0x5f:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      /* pop reg.  */
+      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
+      ginsn = ginsn_new_load (insn_end_sym, false,
+			      GINSN_SRC_INDIRECT, REG_SP, 0,
+			      GINSN_DST_REG, dw2_regnum);
+      ginsn_set_where (ginsn);
+      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
+      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
+	stack_opnd_size = 2;
+      ginsn_next = ginsn_new_add (insn_end_sym, false,
+				  GINSN_SRC_REG, REG_SP, 0,
+				  GINSN_SRC_IMM, 0, stack_opnd_size,
+				  GINSN_DST_REG, REG_SP, 0);
+      ginsn_set_where (ginsn_next);
+      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
+      break;
+
+    case 0x6a:
+    case 0x68:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      /* push imm8/imm16/imm32.  In 64-bit mode the immediate is
+	 sign-extended to the operand size before being pushed, so the
+	 stack pointer is decremented by 8 bytes by default (the
+	 initial value of stack_opnd_size), regardless of the width of
+	 the immediate itself.  */
+      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op;
+	 the stack pointer is then decremented by 2 bytes.  */
+      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
+	stack_opnd_size = 2;
+      /* Skip getting the value of imm from machine instruction
+	 because this is not important for SCFI.  */
+      ginsn = ginsn_new_sub (insn_end_sym, false,
+			     GINSN_SRC_REG, REG_SP, 0,
+			     GINSN_SRC_IMM, 0, stack_opnd_size,
+			     GINSN_DST_REG, REG_SP, 0);
+      ginsn_set_where (ginsn);
+      ginsn_next = ginsn_new_store (insn_end_sym, false,
+				    GINSN_SRC_IMM, 0,
+				    GINSN_DST_INDIRECT, REG_SP, 0);
+      ginsn_set_where (ginsn_next);
+      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
+      break;
+
+    case 0x70 ... 0x7f:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      ginsn = x86_ginsn_jump_cond (insn_end_sym);
+      break;
+
+    case 0x80:
+    case 0x81:
+    case 0x83:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      ginsn = x86_ginsn_alu_imm (insn_end_sym);
+      break;
+
+    case 0x8a:
+    case 0x8b:
+      /* Move reg/mem, reg (8/16/32/64).  */
+    case 0x88:
+    case 0x89:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      /* mov reg, reg/mem. (8/16/32/64).  */
+      ginsn = x86_ginsn_move (insn_end_sym);
+      break;
+
+    case 0x8d:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      /* lea disp(%src), %dst */
+      ginsn = x86_ginsn_lea (insn_end_sym);
+      break;
+
+    case 0x8f:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      /* pop to mem.  */
+      gas_assert (i.base_reg);
+      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
+      ginsn = ginsn_new_load (insn_end_sym, false,
+			      GINSN_SRC_INDIRECT, REG_SP, 0,
+			      GINSN_DST_INDIRECT, dw2_regnum);
+      ginsn_set_where (ginsn);
+      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
+      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
+	stack_opnd_size = 2;
+      ginsn_next = ginsn_new_add (insn_end_sym, false,
+				  GINSN_SRC_REG, REG_SP, 0,
+				  GINSN_SRC_IMM, 0, stack_opnd_size,
+				  GINSN_DST_REG, REG_SP, 0);
+      ginsn_set_where (ginsn_next);
+      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
+      break;
+
+    case 0x9c:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      /* pushf / pushfq.  */
+      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
+      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
+	stack_opnd_size = 2;
+      ginsn = ginsn_new_sub (insn_end_sym, false,
+			     GINSN_SRC_REG, REG_SP, 0,
+			     GINSN_SRC_IMM, 0, stack_opnd_size,
+			     GINSN_DST_REG, REG_SP, 0);
+      ginsn_set_where (ginsn);
+      /* Tracking EFLAGS register by number is not necessary.  */
+      ginsn_next = ginsn_new_store (insn_end_sym, false,
+				    GINSN_SRC_IMM, 0,
+				    GINSN_DST_INDIRECT, REG_SP, 0);
+      ginsn_set_where (ginsn_next);
+      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
+      break;
+
+    case 0x9d:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      /* popf / popfq.  */
+      /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
+      if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
+	stack_opnd_size = 2;
+      /* FIXME - hardcode the actual DWARF reg number value.  For SCFI
+	 correctness this serves merely as a placeholder value, but it is
+	 clearer if the value is correct.  */
+      dw2_regnum = 49;
+      ginsn = ginsn_new_load (insn_end_sym, false,
+			      GINSN_SRC_INDIRECT, REG_SP, 0,
+			      GINSN_DST_REG, dw2_regnum);
+      ginsn_set_where (ginsn);
+      ginsn_next = ginsn_new_add (insn_end_sym, false,
+				  GINSN_SRC_REG, REG_SP, 0,
+				  GINSN_SRC_IMM, 0, stack_opnd_size,
+				  GINSN_DST_REG, REG_SP, 0);
+      ginsn_set_where (ginsn_next);
+      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
+      break;
+
+    case 0xff:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      /* push from mem.  */
+      if (i.tm.extension_opcode == 6)
+	{
+	  /* In 64-bit mode, presence of 66H prefix indicates 16-bit op.  */
+	  if (flag_code == CODE_64BIT && ginsn_prefix_66H_p (i))
+	    stack_opnd_size = 2;
+	  ginsn = ginsn_new_sub (insn_end_sym, false,
+				 GINSN_SRC_REG, REG_SP, 0,
+				 GINSN_SRC_IMM, 0, stack_opnd_size,
+				 GINSN_DST_REG, REG_SP, 0);
+	  ginsn_set_where (ginsn);
+	  /* These instructions have no imm, only indirect access.  */
+	  gas_assert (i.base_reg);
+	  dw2_regnum = ginsn_dw2_regnum (i.base_reg);
+	  ginsn_next = ginsn_new_store (insn_end_sym, false,
+					GINSN_SRC_INDIRECT, dw2_regnum,
+					GINSN_DST_INDIRECT, REG_SP, 0);
+	  ginsn_set_where (ginsn_next);
+	  gas_assert (!ginsn_link_next (ginsn, ginsn_next));
+	}
+      else if (i.tm.extension_opcode == 4)
+	{
+	  /* jmp r/m.  E.g., notrack jmp *%rax.  */
+	  if (i.reg_operands)
+	    {
+	      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
+	      ginsn = ginsn_new_jump (insn_end_sym, true,
+				      GINSN_SRC_REG, dw2_regnum, NULL);
+	      ginsn_set_where (ginsn);
+	    }
+	  else if (i.mem_operands && i.index_reg)
+	    {
+	      /* jmp    *0x0(,%rax,8).  */
+	      dw2_regnum = ginsn_dw2_regnum (i.index_reg);
+	      ginsn = ginsn_new_jump (insn_end_sym, true,
+				      GINSN_SRC_REG, dw2_regnum, NULL);
+	      ginsn_set_where (ginsn);
+	    }
+	  else if (i.mem_operands && i.base_reg)
+	    {
+	      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
+	      ginsn = ginsn_new_jump (insn_end_sym, true,
+				      GINSN_SRC_REG, dw2_regnum, NULL);
+	      ginsn_set_where (ginsn);
+	    }
+	}
+      else if (i.tm.extension_opcode == 2)
+	{
+	  /* 0xFF /2 (call).  */
+	  if (i.reg_operands)
+	    {
+	      dw2_regnum = ginsn_dw2_regnum (i.op[0].regs);
+	      ginsn = ginsn_new_call (insn_end_sym, true,
+				      GINSN_SRC_REG, dw2_regnum, NULL);
+	      ginsn_set_where (ginsn);
+	    }
+	  else if (i.mem_operands && i.base_reg)
+	    {
+	      dw2_regnum = ginsn_dw2_regnum (i.base_reg);
+	      ginsn = ginsn_new_call (insn_end_sym, true,
+				      GINSN_SRC_REG, dw2_regnum, NULL);
+	      ginsn_set_where (ginsn);
+	    }
+	}
+      break;
+
+    case 0xc2:
+    case 0xc3:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      /* Near ret.  */
+      ginsn = ginsn_new_return (insn_end_sym, true);
+      ginsn_set_where (ginsn);
+      break;
+
+    case 0xc8:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      /* enter.  */
+      ginsn = x86_ginsn_enter (insn_end_sym);
+      break;
+
+    case 0xc9:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      /* The 'leave' instruction copies the contents of the RBP register
+	 into the RSP register to release all stack space allocated to the
+	 procedure.  */
+      ginsn = ginsn_new_mov (insn_end_sym, false,
+			     GINSN_SRC_REG, REG_FP, 0,
+			     GINSN_DST_REG, REG_SP, 0);
+      ginsn_set_where (ginsn);
+      /* Then it restores the old value of the RBP register from the stack.  */
+      ginsn_next = ginsn_new_load (insn_end_sym, false,
+				   GINSN_SRC_INDIRECT, REG_SP, 0,
+				   GINSN_DST_REG, REG_FP);
+      ginsn_set_where (ginsn_next);
+      gas_assert (!ginsn_link_next (ginsn, ginsn_next));
+      ginsn_last = ginsn_new_add (insn_end_sym, false,
+				  GINSN_SRC_REG, REG_SP, 0,
+				  GINSN_SRC_IMM, 0, 8,
+				  GINSN_DST_REG, REG_SP, 0);
+      ginsn_set_where (ginsn_last);
+      gas_assert (!ginsn_link_next (ginsn_next, ginsn_last));
+      break;
+
+    case 0xe0 ... 0xe2:
+      /* loop / loope / loopne.  */
+    case 0xe3:
+      /* jecxz / jrcxz.  */
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      ginsn = x86_ginsn_jump_cond (insn_end_sym);
+      ginsn_set_where (ginsn);
+      break;
+
+    case 0xe8:
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      /* PS: SCFI machinery does not care about which func is being
+	 called.  OK to skip that info.  */
+      ginsn = ginsn_new_call (insn_end_sym, true,
+			      GINSN_SRC_SYMBOL, 0, NULL);
+      ginsn_set_where (ginsn);
+      break;
+
+    case 0xeb:
+      /* If opcode_space != SPACE_BASE, this is not a jmp insn.  Skip it
+	 for GINSN_GEN_SCFI.  */
+      if (i.tm.opcode_space != SPACE_BASE)
+	break;
+      /* Unconditional jmp.  */
+      ginsn = x86_ginsn_jump (insn_end_sym);
+      ginsn_set_where (ginsn);
+      break;
+
+    default:
+      /* TBD_GINSN_GEN_NOT_SCFI: Skip all other opcodes uninteresting for
+	 GINSN_GEN_SCFI mode.  */
+      break;
+    }
+
+  if (!ginsn && !x86_ginsn_safe_to_skip ())
+    {
+      /* For all unhandled insns that are not whitelisted, check that they do
+	 not impact SCFI correctness.  */
+      err = x86_ginsn_unhandled ();
+      switch (err)
+	{
+	case X86_GINSN_UNHANDLED_NONE:
+	  break;
+	case X86_GINSN_UNHANDLED_DEST_REG:
+	  /* Not all writes to REG_FP are harmful in context of SCFI.  Simply
+	     generate a GINSN_TYPE_OTHER with destination set to the
+	     appropriate register.  The SCFI machinery will bail out if this
+	     ginsn affects SCFI correctness.  */
+	  dw2_regnum = ginsn_dw2_regnum (i.op[i.operands - 1].regs);
+	  ginsn = ginsn_new_other (insn_end_sym, true,
+				   GINSN_SRC_IMM, 0,
+				   GINSN_SRC_IMM, 0,
+				   GINSN_DST_REG, dw2_regnum);
+	  ginsn_set_where (ginsn);
+	  break;
+	case X86_GINSN_UNHANDLED_CFG:
+	  /* Fall through.  */
+	case X86_GINSN_UNHANDLED_STACKOP:
+	  as_bad (_("SCFI: unhandled op 0x%x may cause incorrect CFI"),
+		  i.tm.base_opcode);
+	  break;
+	default:
+	  abort ();
+	  break;
+	}
+    }
+
+  return ginsn;
+}
+
 /* This is the guts of the machine-dependent assembler.  LINE points to a
    machine dependent instruction.  This function is supposed to emit
    the frags/bytes it assembles to.  */
@@ -5299,6 +6272,7 @@  md_assemble (char *line)
   const char *end, *pass1_mnem = NULL;
   enum i386_error pass1_err = 0;
   const insn_template *t;
+  ginsnS *ginsn;
   struct last_insn *last_insn
     = &seg_info(now_seg)->tc_segment_info_data.last_insn;
 
@@ -5830,6 +6804,14 @@  md_assemble (char *line)
   /* We are ready to output the insn.  */
   output_insn (last_insn);
 
+  /* PS: SCFI is enabled only for AMD64 ABI.  The ABI check has been
+     performed in i386_target_format.  */
+  if (flag_synth_cfi)
+    {
+      ginsn = x86_ginsn_new (symbol_temp_new_now (), frch_ginsn_gen_mode ());
+      frch_ginsn_data_append (ginsn);
+    }
+
   insert_lfence_after ();
 
   if (i.tm.opcode_modifier.isprefix)
@@ -11333,6 +12315,7 @@  s_insn (int dummy ATTRIBUTE_UNUSED)
   const char *end;
   unsigned int j;
   valueT val;
+  ginsnS *ginsn;
   bool vex = false, xop = false, evex = false;
   struct last_insn *last_insn;
 
@@ -12104,6 +13087,14 @@  s_insn (int dummy ATTRIBUTE_UNUSED)
   last_insn->name = ".insn directive";
   last_insn->file = as_where (&last_insn->line);
 
+  /* PS: SCFI is enabled only for AMD64 ABI.  The ABI check has been
+     performed in i386_target_format.  */
+  if (flag_synth_cfi)
+    {
+      ginsn = x86_ginsn_new (symbol_temp_new_now (), frch_ginsn_gen_mode ());
+      frch_ginsn_data_append (ginsn);
+    }
+
  done:
   *saved_ilp = saved_char;
   input_line_pointer = line;
@@ -15748,6 +16739,11 @@  i386_target_format (void)
   else
     as_fatal (_("unknown architecture"));
 
+#if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
+  if (flag_synth_cfi && x86_elf_abi != X86_64_ABI)
+    as_fatal (_("Synthesizing CFI is not supported for this ABI"));
+#endif
+
   if (cpu_flags_all_zero (&cpu_arch_isa_flags))
     cpu_arch_isa_flags = cpu_arch[flag_code == CODE_64BIT].enable;
 
diff --git a/gas/config/tc-i386.h b/gas/config/tc-i386.h
index 44227a8376c..981e45c1786 100644
--- a/gas/config/tc-i386.h
+++ b/gas/config/tc-i386.h
@@ -407,6 +407,27 @@  extern void i386_elf_section_change_hook (void);
 extern void i386_solaris_fix_up_eh_frame (segT);
 #endif
 
+#define TARGET_USE_GINSN 1
+/* Allow GAS to synthesize DWARF CFI for hand-written asm.
+   PS: TARGET_USE_CFIPOP is a pre-condition.  */
+#define TARGET_USE_SCFI 1
+/* Identify the maximum DWARF register number of all the registers being
+   tracked for SCFI.  This is the last DWARF register number of the set
+   of SP, BP, and all callee-saved registers.  For AMD64, this means
+   R15 (15).  Use SCFI_CALLEE_SAVED_REG_P to identify which registers
+   are callee-saved from this set.  */
+#define SCFI_MAX_REG_ID 15
+/* Identify the DWARF register number of the frame-pointer register.  */
+#define REG_FP 6
+/* Identify the DWARF register number of the stack-pointer register.  */
+#define REG_SP 7
+/* Some ABIs, like AMD64, use the stack for the call instruction.  For such
+   an ABI, identify the initial (CFA) offset from RSP at function entry.  */
+#define SCFI_INIT_CFA_OFFSET 8
+
+#define SCFI_CALLEE_SAVED_REG_P(dw2reg)  x86_scfi_callee_saved_p (dw2reg)
+extern bool x86_scfi_callee_saved_p (uint32_t dw2reg_num);
+
 /* Support for SHF_X86_64_LARGE */
 extern bfd_vma x86_64_section_letter (int, const char **);
 #define md_elf_section_letter(LETTER, PTR_MSG)	x86_64_section_letter (LETTER, PTR_MSG)
diff --git a/gas/ginsn.c b/gas/ginsn.c
new file mode 100644
index 00000000000..292ad1a2931
--- /dev/null
+++ b/gas/ginsn.c
@@ -0,0 +1,1259 @@ 
+/* ginsn.c - GAS instruction representation.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of GAS, the GNU Assembler.
+
+   GAS is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GAS is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GAS; see the file COPYING.  If not, write to the Free
+   Software Foundation, 51 Franklin Street - Fifth Floor, Boston, MA
+   02110-1301, USA.  */
+
+#include "as.h"
+#include "subsegs.h"
+#include "ginsn.h"
+#include "scfi.h"
+
+#ifdef TARGET_USE_GINSN
+
+static const char *const ginsn_type_names[] =
+{
+#define _GINSN_TYPE_ITEM(NAME, STR) STR,
+  _GINSN_TYPES
+#undef _GINSN_TYPE_ITEM
+};
+
+static ginsnS *
+ginsn_alloc (void)
+{
+  ginsnS *ginsn = XCNEW (ginsnS);
+  return ginsn;
+}
+
+static ginsnS *
+ginsn_init (enum ginsn_type type, const symbolS *sym, bool real_p)
+{
+  ginsnS *ginsn = ginsn_alloc ();
+  ginsn->type = type;
+  ginsn->sym = sym;
+  if (real_p)
+    ginsn->flags |= GINSN_F_INSN_REAL;
+  return ginsn;
+}
+
+static void
+ginsn_cleanup (ginsnS **ginsnp)
+{
+  ginsnS *ginsn;
+
+  if (!ginsnp || !*ginsnp)
+    return;
+
+  ginsn = *ginsnp;
+  if (ginsn->scfi_ops)
+    {
+      scfi_ops_cleanup (ginsn->scfi_ops);
+      ginsn->scfi_ops = NULL;
+    }
+
+  free (ginsn);
+  *ginsnp = NULL;
+}
+
+static void
+ginsn_set_src (struct ginsn_src *src, enum ginsn_src_type type, unsigned int reg,
+	       offsetT immdisp)
+{
+  if (!src)
+    return;
+
+  src->type = type;
+  /* Even when the use-case is SCFI, the value of reg may be > SCFI_MAX_REG_ID.
+     E.g., in AMD64, push fs etc.  */
+  src->reg = reg;
+  src->immdisp = immdisp;
+}
+
+static void
+ginsn_set_dst (struct ginsn_dst *dst, enum ginsn_dst_type type, unsigned int reg,
+	       offsetT disp)
+{
+  if (!dst)
+    return;
+
+  dst->type = type;
+  dst->reg = reg;
+
+  if (type == GINSN_DST_INDIRECT)
+    dst->disp = disp;
+}
+
+static void
+ginsn_set_file_line (ginsnS *ginsn, const char *file, unsigned int line)
+{
+  if (!ginsn)
+    return;
+
+  ginsn->file = file;
+  ginsn->line = line;
+}
+
+struct ginsn_src *
+ginsn_get_src1 (ginsnS *ginsn)
+{
+  return &ginsn->src[0];
+}
+
+struct ginsn_src *
+ginsn_get_src2 (ginsnS *ginsn)
+{
+  return &ginsn->src[1];
+}
+
+struct ginsn_dst *
+ginsn_get_dst (ginsnS *ginsn)
+{
+  return &ginsn->dst;
+}
+
+unsigned int
+ginsn_get_src_reg (struct ginsn_src *src)
+{
+  return src->reg;
+}
+
+enum ginsn_src_type
+ginsn_get_src_type (struct ginsn_src *src)
+{
+  return src->type;
+}
+
+offsetT
+ginsn_get_src_disp (struct ginsn_src *src)
+{
+  return src->immdisp;
+}
+
+offsetT
+ginsn_get_src_imm (struct ginsn_src *src)
+{
+  return src->immdisp;
+}
+
+unsigned int
+ginsn_get_dst_reg (struct ginsn_dst *dst)
+{
+  return dst->reg;
+}
+
+enum ginsn_dst_type
+ginsn_get_dst_type (struct ginsn_dst *dst)
+{
+  return dst->type;
+}
+
+offsetT
+ginsn_get_dst_disp (struct ginsn_dst *dst)
+{
+  return dst->disp;
+}
+
+void
+label_ginsn_map_insert (const symbolS *label, ginsnS *ginsn)
+{
+  const char *name = S_GET_NAME (label);
+  str_hash_insert (frchain_now->frch_ginsn_data->label_ginsn_map,
+		   name, ginsn, 0 /* noreplace.  */);
+}
+
+ginsnS *
+label_ginsn_map_find (const symbolS *label)
+{
+  const char *name = S_GET_NAME (label);
+  ginsnS *ginsn
+    = (ginsnS *) str_hash_find (frchain_now->frch_ginsn_data->label_ginsn_map,
+				name);
+  return ginsn;
+}
+
+ginsnS *
+ginsn_new_phantom (const symbolS *sym)
+{
+  ginsnS *ginsn = ginsn_alloc ();
+  ginsn->type = GINSN_TYPE_PHANTOM;
+  ginsn->sym = sym;
+  /* By default, GINSN_F_INSN_REAL is not set in ginsn->flags.  */
+  return ginsn;
+}
+
+ginsnS *
+ginsn_new_symbol (const symbolS *sym, bool func_begin_p)
+{
+  ginsnS *ginsn = ginsn_alloc ();
+  ginsn->type = GINSN_TYPE_SYMBOL;
+  ginsn->sym = sym;
+  if (func_begin_p)
+    ginsn->flags |= GINSN_F_FUNC_MARKER;
+  return ginsn;
+}
+
+ginsnS *
+ginsn_new_symbol_func_begin (const symbolS *sym)
+{
+  return ginsn_new_symbol (sym, true);
+}
+
+ginsnS *
+ginsn_new_symbol_func_end (const symbolS *sym)
+{
+  return ginsn_new_symbol (sym, false);
+}
+
+ginsnS *
+ginsn_new_symbol_user_label (const symbolS *sym)
+{
+  ginsnS *ginsn = ginsn_new_symbol (sym, false);
+  ginsn->flags |= GINSN_F_USER_LABEL;
+  return ginsn;
+}
+
+ginsnS *
+ginsn_new_add (const symbolS *sym, bool real_p,
+	       enum ginsn_src_type src1_type, unsigned int src1_reg, offsetT src1_disp,
+	       enum ginsn_src_type src2_type, unsigned int src2_reg, offsetT src2_disp,
+	       enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp)
+{
+  ginsnS *ginsn = ginsn_init (GINSN_TYPE_ADD, sym, real_p);
+  /* src info.  */
+  ginsn_set_src (&ginsn->src[0], src1_type, src1_reg, src1_disp);
+  ginsn_set_src (&ginsn->src[1], src2_type, src2_reg, src2_disp);
+  /* dst info.  */
+  ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, dst_disp);
+
+  return ginsn;
+}
+
+ginsnS *
+ginsn_new_and (const symbolS *sym, bool real_p,
+	       enum ginsn_src_type src1_type, unsigned int src1_reg, offsetT src1_disp,
+	       enum ginsn_src_type src2_type, unsigned int src2_reg, offsetT src2_disp,
+	       enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp)
+{
+  ginsnS *ginsn = ginsn_init (GINSN_TYPE_AND, sym, real_p);
+  /* src info.  */
+  ginsn_set_src (&ginsn->src[0], src1_type, src1_reg, src1_disp);
+  ginsn_set_src (&ginsn->src[1], src2_type, src2_reg, src2_disp);
+  /* dst info.  */
+  ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, dst_disp);
+
+  return ginsn;
+}
+
+ginsnS *
+ginsn_new_call (const symbolS *sym, bool real_p,
+		enum ginsn_src_type src_type, unsigned int src_reg,
+		symbolS *src_text_sym)
+
+{
+  ginsnS *ginsn = ginsn_init (GINSN_TYPE_CALL, sym, real_p);
+  /* src info.  */
+  ginsn_set_src (&ginsn->src[0], src_type, src_reg, 0);
+
+  if (src_type == GINSN_SRC_SYMBOL)
+    ginsn->src[0].sym = src_text_sym;
+
+  return ginsn;
+}
+
+ginsnS *
+ginsn_new_jump (const symbolS *sym, bool real_p,
+		enum ginsn_src_type src_type, unsigned int src_reg,
+		symbolS *src_ginsn_sym)
+{
+  ginsnS *ginsn = ginsn_init (GINSN_TYPE_JUMP, sym, real_p);
+  /* src info.  */
+  ginsn_set_src (&ginsn->src[0], src_type, src_reg, 0);
+
+  if (src_type == GINSN_SRC_SYMBOL)
+    ginsn->src[0].sym = src_ginsn_sym;
+
+  return ginsn;
+}
+
+ginsnS *
+ginsn_new_jump_cond (const symbolS *sym, bool real_p,
+		     enum ginsn_src_type src_type, unsigned int src_reg,
+		     symbolS *src_ginsn_sym)
+{
+  ginsnS *ginsn = ginsn_init (GINSN_TYPE_JUMP_COND, sym, real_p);
+  /* src info.  */
+  ginsn_set_src (&ginsn->src[0], src_type, src_reg, 0);
+
+  if (src_type == GINSN_SRC_SYMBOL)
+    ginsn->src[0].sym = src_ginsn_sym;
+
+  return ginsn;
+}
+
+ginsnS *
+ginsn_new_mov (const symbolS *sym, bool real_p,
+	       enum ginsn_src_type src_type, unsigned int src_reg, offsetT src_disp,
+	       enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp)
+{
+  ginsnS *ginsn = ginsn_init (GINSN_TYPE_MOV, sym, real_p);
+  /* src info.  */
+  ginsn_set_src (&ginsn->src[0], src_type, src_reg, src_disp);
+  /* dst info.  */
+  ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, dst_disp);
+
+  return ginsn;
+}
+
+ginsnS *
+ginsn_new_store (const symbolS *sym, bool real_p,
+		 enum ginsn_src_type src_type, unsigned int src_reg,
+		 enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp)
+{
+  ginsnS *ginsn = ginsn_init (GINSN_TYPE_STORE, sym, real_p);
+  /* src info.  */
+  ginsn_set_src (&ginsn->src[0], src_type, src_reg, 0);
+  /* dst info.  */
+  gas_assert (dst_type == GINSN_DST_INDIRECT);
+  ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, dst_disp);
+
+  return ginsn;
+}
+
+ginsnS *
+ginsn_new_load (const symbolS *sym, bool real_p,
+		enum ginsn_src_type src_type, unsigned int src_reg, offsetT src_disp,
+		enum ginsn_dst_type dst_type, unsigned int dst_reg)
+{
+  ginsnS *ginsn = ginsn_init (GINSN_TYPE_LOAD, sym, real_p);
+  /* src info.  */
+  gas_assert (src_type == GINSN_SRC_INDIRECT);
+  ginsn_set_src (&ginsn->src[0], src_type, src_reg, src_disp);
+  /* dst info.  */
+  ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, 0);
+
+  return ginsn;
+}
+
+ginsnS *
+ginsn_new_sub (const symbolS *sym, bool real_p,
+	       enum ginsn_src_type src1_type, unsigned int src1_reg, offsetT src1_disp,
+	       enum ginsn_src_type src2_type, unsigned int src2_reg, offsetT src2_disp,
+	       enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp)
+{
+  ginsnS *ginsn = ginsn_init (GINSN_TYPE_SUB, sym, real_p);
+  /* src info.  */
+  ginsn_set_src (&ginsn->src[0], src1_type, src1_reg, src1_disp);
+  ginsn_set_src (&ginsn->src[1], src2_type, src2_reg, src2_disp);
+  /* dst info.  */
+  ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, dst_disp);
+
+  return ginsn;
+}
+
+/* PS: Note this API does not identify the displacement values of
+   src1/src2/dst.  At this time, it is unnecessary for correctness to support
+   the additional argument.  */
+
+ginsnS *
+ginsn_new_other (const symbolS *sym, bool real_p,
+		 enum ginsn_src_type src1_type, unsigned int src1_val,
+		 enum ginsn_src_type src2_type, unsigned int src2_val,
+		 enum ginsn_dst_type dst_type, unsigned int dst_reg)
+{
+  ginsnS *ginsn = ginsn_init (GINSN_TYPE_OTHER, sym, real_p);
+  /* src info.  */
+  ginsn_set_src (&ginsn->src[0], src1_type, src1_val, src1_val);
+  /* GINSN_SRC_INDIRECT src2_type is not expected.  */
+  gas_assert (src2_type != GINSN_SRC_INDIRECT);
+  ginsn_set_src (&ginsn->src[1], src2_type, src2_val, src2_val);
+  /* dst info.  */
+  ginsn_set_dst (&ginsn->dst, dst_type, dst_reg, 0);
+
+  return ginsn;
+}
+
+ginsnS *
+ginsn_new_return (const symbolS *sym, bool real_p)
+{
+  ginsnS *ginsn = ginsn_init (GINSN_TYPE_RETURN, sym, real_p);
+  return ginsn;
+}
+
+void
+ginsn_set_where (ginsnS *ginsn)
+{
+  const char *file;
+  unsigned int line;
+  file = as_where (&line);
+  ginsn_set_file_line (ginsn, file, line);
+}
+
+int
+ginsn_link_next (ginsnS *ginsn, ginsnS *next)
+{
+  int ret = 0;
+
+  /* Avoid data corruption by limiting the scope of the API.  */
+  if (!ginsn || ginsn->next)
+    return 1;
+
+  ginsn->next = next;
+
+  return ret;
+}
+
+bool
+ginsn_track_reg_p (unsigned int dw2reg, enum ginsn_gen_mode gmode)
+{
+  bool track_p = false;
+
+  if (gmode == GINSN_GEN_SCFI && dw2reg <= SCFI_MAX_REG_ID)
+    {
+      /* FIXME - rename this to tc_ ? */
+      track_p |= SCFI_CALLEE_SAVED_REG_P (dw2reg);
+      track_p |= (dw2reg == REG_FP);
+      track_p |= (dw2reg == REG_SP);
+    }
+
+  return track_p;
+}
+
+static bool
+ginsn_indirect_jump_p (ginsnS *ginsn)
+{
+  bool ret_p = false;
+  if (!ginsn)
+    return ret_p;
+
+  ret_p = (ginsn->type == GINSN_TYPE_JUMP
+	   && ginsn->src[0].type == GINSN_SRC_REG);
+  return ret_p;
+}
+
+static bool
+ginsn_direct_local_jump_p (ginsnS *ginsn)
+{
+  bool ret_p = false;
+  if (!ginsn)
+    return ret_p;
+
+  ret_p |= (ginsn->type == GINSN_TYPE_JUMP
+	    && ginsn->src[0].type == GINSN_SRC_SYMBOL);
+  return ret_p;
+}
+
+static char *
+ginsn_src_print (struct ginsn_src *src)
+{
+  size_t len = 40;
+  char *src_str = XNEWVEC (char, len);
+
+  memset (src_str, 0, len);
+
+  switch (src->type)
+    {
+    case GINSN_SRC_REG:
+      snprintf (src_str, len, "%%r%d, ", ginsn_get_src_reg (src));
+      break;
+    case GINSN_SRC_IMM:
+      snprintf (src_str, len, "%lld, ",
+		(long long int) ginsn_get_src_imm (src));
+      break;
+    case GINSN_SRC_INDIRECT:
+      snprintf (src_str, len, "[%%r%d+%lld], ", ginsn_get_src_reg (src),
+		(long long int) ginsn_get_src_disp (src));
+      break;
+    default:
+      break;
+    }
+
+  return src_str;
+}
+
+static char*
+ginsn_dst_print (struct ginsn_dst *dst)
+{
+  size_t len = GINSN_LISTING_OPND_LEN;
+  char *dst_str = XNEWVEC (char, len);
+
+  memset (dst_str, 0, len);
+
+  if (dst->type == GINSN_DST_REG)
+    {
+      char buf[32];
+      snprintf (buf, sizeof (buf), "%%r%d", ginsn_get_dst_reg (dst));
+      strcat (dst_str, buf);
+    }
+  else if (dst->type == GINSN_DST_INDIRECT)
+    {
+      char buf[32];
+      snprintf (buf, sizeof (buf), "[%%r%d+%lld]", ginsn_get_dst_reg (dst),
+		(long long int) ginsn_get_dst_disp (dst));
+      strcat (dst_str, buf);
+    }
+
+  gas_assert (strlen (dst_str) < GINSN_LISTING_OPND_LEN);
+
+  return dst_str;
+}
+
+static const char*
+ginsn_type_func_marker_print (ginsnS *ginsn)
+{
+  int id = 0;
+  static const char * const ginsn_sym_strs[] =
+    { "", "FUNC_BEGIN", "FUNC_END" };
+
+  if (GINSN_F_FUNC_BEGIN_P (ginsn))
+    id = 1;
+  else if (GINSN_F_FUNC_END_P (ginsn))
+    id = 2;
+
+  return ginsn_sym_strs[id];
+}
+
+static char*
+ginsn_print (ginsnS *ginsn)
+{
+  struct ginsn_src *src;
+  struct ginsn_dst *dst;
+  int str_size = 0;
+  size_t len = GINSN_LISTING_LEN;
+  char *ginsn_str = XNEWVEC (char, len);
+
+  memset (ginsn_str, 0, len);
+
+  str_size = snprintf (ginsn_str, GINSN_LISTING_LEN, "ginsn: %s",
+		       ginsn_type_names[ginsn->type]);
+  gas_assert (str_size >= 0 && str_size < GINSN_LISTING_LEN);
+
+  /* For some ginsn types, no further information is printed for now.  */
+  if (ginsn->type == GINSN_TYPE_CALL
+      || ginsn->type == GINSN_TYPE_RETURN)
+    goto end;
+  else if (ginsn->type == GINSN_TYPE_SYMBOL)
+    {
+      if (GINSN_F_USER_LABEL_P (ginsn))
+	str_size += snprintf (ginsn_str + str_size,
+			      GINSN_LISTING_LEN - str_size,
+			      " %s", S_GET_NAME (ginsn->sym));
+      else
+	str_size += snprintf (ginsn_str + str_size,
+			      GINSN_LISTING_LEN - str_size,
+			      " %s", ginsn_type_func_marker_print (ginsn));
+      goto end;
+    }
+
+  /* src 1.  */
+  src = ginsn_get_src1 (ginsn);
+  str_size += snprintf (ginsn_str + str_size, GINSN_LISTING_LEN - str_size,
+			" %s", ginsn_src_print (src));
+  gas_assert (str_size >= 0 && str_size < GINSN_LISTING_LEN);
+
+  /* src 2.  */
+  src = ginsn_get_src2 (ginsn);
+  str_size += snprintf (ginsn_str + str_size, GINSN_LISTING_LEN - str_size,
+			"%s", ginsn_src_print (src));
+  gas_assert (str_size >= 0 && str_size < GINSN_LISTING_LEN);
+
+  /* dst.  */
+  dst = ginsn_get_dst (ginsn);
+  str_size += snprintf (ginsn_str + str_size, GINSN_LISTING_LEN - str_size,
+			"%s", ginsn_dst_print (dst));
+
+end:
+  gas_assert (str_size >= 0 && str_size < GINSN_LISTING_LEN);
+  return ginsn_str;
+}
+
+static void
+gbb_cleanup (gbbS **bbp)
+{
+  gbbS *bb = NULL;
+
+  if (!bbp || !*bbp)
+    return;
+
+  bb = *bbp;
+
+  if (bb->entry_state)
+    {
+      free (bb->entry_state);
+      bb->entry_state = NULL;
+    }
+  if (bb->exit_state)
+    {
+      free (bb->exit_state);
+      bb->exit_state = NULL;
+    }
+  free (bb);
+  *bbp = NULL;
+}
+
+static void
+bb_add_edge (gbbS* from_bb, gbbS *to_bb)
+{
+  gedgeS *tmpedge = NULL;
+  gedgeS *gedge;
+  bool exists = false;
+
+  if (!from_bb || !to_bb)
+    return;
+
+  /* Create a new edge object.  */
+  gedge = XCNEW (gedgeS);
+  gedge->dst_bb = to_bb;
+  gedge->next = NULL;
+  gedge->visited = false;
+
+  /* Add it in.  */
+  if (from_bb->out_gedges == NULL)
+    {
+      from_bb->out_gedges = gedge;
+      from_bb->num_out_gedges++;
+    }
+  else
+    {
+      /* Get the tail of the list.  */
+      tmpedge = from_bb->out_gedges;
+      while (tmpedge)
+	{
+	  /* Do not add duplicate edges.  Duplicated edges will cause unwanted
+	     failures in the forward and backward passes for SCFI.  */
+	  if (tmpedge->dst_bb == to_bb)
+	    {
+	      exists = true;
+	      break;
+	    }
+	  if (tmpedge->next)
+	    tmpedge = tmpedge->next;
+	  else
+	    break;
+	}
+
+      if (!exists)
+	{
+	  tmpedge->next = gedge;
+	  from_bb->num_out_gedges++;
+	}
+      else
+	free (gedge);
+    }
+}
+
+static void
+cfg_add_bb (gcfgS *gcfg, gbbS *gbb)
+{
+  gbbS *last_bb = NULL;
+
+  if (!gcfg->root_bb)
+    gcfg->root_bb = gbb;
+  else
+    {
+      last_bb = gcfg->root_bb;
+      while (last_bb->next)
+	last_bb = last_bb->next;
+
+      last_bb->next = gbb;
+    }
+  gcfg->num_gbbs++;
+
+  gbb->id = gcfg->num_gbbs;
+}
+
+static gbbS *
+add_bb_at_ginsn (const symbolS *func, gcfgS *gcfg, ginsnS *ginsn, gbbS *prev_bb,
+		 int *errp);
+
+static gbbS *
+find_bb (gcfgS *gcfg, ginsnS *ginsn)
+{
+  gbbS *found_bb = NULL;
+  gbbS *gbb = NULL;
+
+  if (!ginsn)
+    return found_bb;
+
+  if (ginsn->visited)
+    {
+      cfg_for_each_bb (gcfg, gbb)
+	{
+	  if (gbb->first_ginsn == ginsn)
+	    {
+	      found_bb = gbb;
+	      break;
+	    }
+	}
+      /* Must be found if ginsn is visited.  */
+      gas_assert (found_bb);
+    }
+
+  return found_bb;
+}
+
+static gbbS *
+find_or_make_bb (const symbolS *func, gcfgS *gcfg, ginsnS *ginsn, gbbS *prev_bb,
+		 int *errp)
+{
+  gbbS *found_bb = NULL;
+
+  found_bb = find_bb (gcfg, ginsn);
+  if (found_bb)
+    return found_bb;
+
+  return add_bb_at_ginsn (func, gcfg, ginsn, prev_bb, errp);
+}
+
+/* Add the basic block starting at GINSN to the given GCFG.
+   Also adds an edge from the PREV_BB to the newly added basic block.
+
+   This is a recursive function which returns the root of the added
+   basic blocks.  */
+
+static gbbS *
+add_bb_at_ginsn (const symbolS *func, gcfgS *gcfg, ginsnS *ginsn, gbbS *prev_bb,
+		 int *errp)
+{
+  gbbS *current_bb = NULL;
+  ginsnS *target_ginsn = NULL;
+  const symbolS *taken_label;
+
+  while (ginsn)
+    {
+      /* Skip these as they may be right after a GINSN_TYPE_RETURN.
+	 For GINSN_TYPE_RETURN, we have already considered that as
+	 end of bb, and a logical exit from function.  */
+      if (GINSN_F_FUNC_END_P (ginsn))
+	{
+	  ginsn = ginsn->next;
+	  continue;
+	}
+
+      if (ginsn->visited)
+	{
+	  /* If the ginsn has been visited earlier, the bb must exist by now
+	     in the cfg.  */
+	  prev_bb = current_bb;
+	  current_bb = find_bb (gcfg, ginsn);
+	  gas_assert (current_bb);
+	  /* Add edge from the prev_bb.  */
+	  if (prev_bb)
+	    bb_add_edge (prev_bb, current_bb);
+	  break;
+	}
+      else if (current_bb && GINSN_F_USER_LABEL_P (ginsn))
+	{
+	  /* Create new bb starting at this label ginsn.  */
+	  prev_bb = current_bb;
+	  find_or_make_bb (func, gcfg, ginsn, prev_bb, errp);
+	  break;
+	}
+
+      if (current_bb == NULL)
+	{
+	  /* Create a new bb.  */
+	  current_bb = XCNEW (gbbS);
+	  cfg_add_bb (gcfg, current_bb);
+	  /* Add edge for the Not Taken, or Fall-through path.  */
+	  if (prev_bb)
+	    bb_add_edge (prev_bb, current_bb);
+	}
+
+      if (current_bb->first_ginsn == NULL)
+	current_bb->first_ginsn = ginsn;
+
+      ginsn->visited = true;
+      current_bb->num_ginsns++;
+      current_bb->last_ginsn = ginsn;
+
+      /* Note that BB is _not_ split on ginsn of type GINSN_TYPE_CALL.  */
+      if (ginsn->type == GINSN_TYPE_JUMP
+	  || ginsn->type == GINSN_TYPE_JUMP_COND
+	  || ginsn->type == GINSN_TYPE_RETURN)
+	{
+	  /* Indirect Jumps or direct jumps to symbols non-local to the
+	     function must not be seen here.  The caller must have already
+	     checked for that.  */
+	  gas_assert (!ginsn_indirect_jump_p (ginsn));
+	  if (ginsn->type == GINSN_TYPE_JUMP)
+	    gas_assert (ginsn_direct_local_jump_p (ginsn));
+
+	  /* Direct Jumps.  May include conditional or unconditional change of
+	     flow.  What is important for CFG creation is that the target be
+	     local to function.  */
+	  if (ginsn->type == GINSN_TYPE_JUMP_COND
+	      || ginsn_direct_local_jump_p (ginsn))
+	    {
+	      gas_assert (ginsn->src[0].type == GINSN_SRC_SYMBOL);
+	      taken_label = ginsn->src[0].sym;
+	      gas_assert (taken_label);
+
+	      /* Preserve the prev_bb to be the dominator bb as we are
+		 going to follow the taken path of the conditional branch
+		 soon.  */
+	      prev_bb = current_bb;
+
+	      /* Follow the target on the taken path.  */
+	      target_ginsn = label_ginsn_map_find (taken_label);
+	      /* Add the bb for the target of the taken branch.  */
+	      if (target_ginsn)
+		find_or_make_bb (func, gcfg, target_ginsn, prev_bb, errp);
+	      else
+		{
+		  *errp = GCFG_JLABEL_NOT_PRESENT;
+		  as_warn_where (ginsn->file, ginsn->line,
+				 _("missing label '%s' in func '%s' may result in imprecise cfg"),
+				 S_GET_NAME (taken_label), S_GET_NAME (func));
+		}
+	      /* Add the bb for the fall through path.  */
+	      find_or_make_bb (func, gcfg, ginsn->next, prev_bb, errp);
+	    }
+	 else if (ginsn->type == GINSN_TYPE_RETURN)
+	   {
+	     /* We'll come back to the ginsns following GINSN_TYPE_RETURN
+		from another path if they are indeed reachable code.  */
+	     break;
+	   }
+
+	 /* Current BB has been processed.  */
+	 current_bb = NULL;
+	}
+      ginsn = ginsn->next;
+    }
+
+  return current_bb;
+}
+
+static int
+gbbs_compare (const void *v1, const void *v2)
+{
+  const gbbS *bb1 = *(const gbbS **) v1;
+  const gbbS *bb2 = *(const gbbS **) v2;
+
+  if (bb1->first_ginsn->id < bb2->first_ginsn->id)
+    return -1;
+  else if (bb1->first_ginsn->id > bb2->first_ginsn->id)
+    return 1;
+  else if (bb1->first_ginsn->id == bb2->first_ginsn->id)
+    return 0;
+
+  return 0;
+}
+
+/* Synthesize DWARF CFI and emit it.  */
+
+static int
+ginsn_pass_execute_scfi (const symbolS *func, gcfgS *gcfg, gbbS *root_bb)
+{
+  int err = scfi_synthesize_dw2cfi (func, gcfg, root_bb);
+  if (!err)
+    scfi_emit_dw2cfi (func);
+
+  return err;
+}
+
+/* Traverse the list of ginsns for the function and warn if some
+   ginsns are not visited.
+
+   FIXME - this code assumes the caller has already performed a pass over
+   ginsns such that the reachable ginsns are already marked.  Revisit this - we
+   should ideally make this pass self-sufficient.  */
+
+static int
+ginsn_pass_warn_unreachable_code (const symbolS *func,
+				  gcfgS *gcfg ATTRIBUTE_UNUSED,
+				  ginsnS *root_ginsn)
+{
+  ginsnS *ginsn;
+  bool unreach_p = false;
+
+  if (!gcfg || !func || !root_ginsn)
+    return 0;
+
+  ginsn = root_ginsn;
+
+  while (ginsn)
+    {
+      /* Some ginsns of type GINSN_TYPE_SYMBOL remain unvisited.  Some
+	 may even be excluded from the CFG as they are not reachable, given
+	 their function, e.g., user labels after return machine insn.  */
+      if (!ginsn->visited
+	  && !GINSN_F_FUNC_END_P (ginsn)
+	  && !GINSN_F_USER_LABEL_P (ginsn))
+	{
+	  unreach_p = true;
+	  break;
+	}
+      ginsn = ginsn->next;
+    }
+
+  if (unreach_p)
+    as_warn_where (ginsn->file, ginsn->line,
+		   _("GINSN: found unreachable code in func '%s'"),
+		   S_GET_NAME (func));
+
+  return unreach_p;
+}
+
+void
+gcfg_get_bbs_in_prog_order (gcfgS *gcfg, gbbS **prog_order_bbs)
+{
+  uint64_t i = 0;
+  gbbS *gbb;
+
+  if (!prog_order_bbs)
+    return;
+
+  cfg_for_each_bb (gcfg, gbb)
+    {
+      gas_assert (i < gcfg->num_gbbs);
+      prog_order_bbs[i++] = gbb;
+    }
+
+  qsort (prog_order_bbs, gcfg->num_gbbs, sizeof (gbbS *), gbbs_compare);
+}
+
+/* Build the control flow graph for the ginsns of the function.
+
+   It is important that the target adds an appropriate ginsn:
+     - GINSN_TYPE_JUMP,
+     - GINSN_TYPE_JUMP_COND,
+     - GINSN_TYPE_CALL,
+     - GINSN_TYPE_RETURN
+   at the associated points in the function.  The correctness of the CFG
+   depends on the accuracy of these 'change of flow instructions'.  */
+
+gcfgS *
+gcfg_build (const symbolS *func, int *errp)
+{
+  gcfgS *gcfg;
+  ginsnS *first_ginsn;
+
+  gcfg = XCNEW (gcfgS);
+  first_ginsn = frchain_now->frch_ginsn_data->gins_rootP;
+  add_bb_at_ginsn (func, gcfg, first_ginsn, NULL /* prev_bb.  */, errp);
+
+  return gcfg;
+}
+
+void
+gcfg_cleanup (gcfgS **gcfgp)
+{
+  gcfgS *cfg;
+  gbbS *bb, *next_bb;
+  gedgeS *edge, *next_edge;
+
+  if (!gcfgp || !*gcfgp)
+    return;
+
+  cfg = *gcfgp;
+  bb = gcfg_get_rootbb (cfg);
+
+  while (bb)
+    {
+      next_bb = bb->next;
+
+      /* Cleanup all the edges.  */
+      edge = bb->out_gedges;
+      while (edge)
+	{
+	  next_edge = edge->next;
+	  free (edge);
+	  edge = next_edge;
+	}
+
+      gbb_cleanup (&bb);
+      bb = next_bb;
+    }
+
+  free (cfg);
+  *gcfgp = NULL;
+}
+
+gbbS *
+gcfg_get_rootbb (gcfgS *gcfg)
+{
+  gbbS *rootbb = NULL;
+
+  if (!gcfg || !gcfg->num_gbbs)
+    return NULL;
+
+  rootbb = gcfg->root_bb;
+
+  return rootbb;
+}
+
+void
+gcfg_print (const gcfgS *gcfg, FILE *outfile)
+{
+  gbbS *gbb = NULL;
+  gedgeS *gedge = NULL;
+  uint64_t total_ginsns = 0;
+
+  cfg_for_each_bb (gcfg, gbb)
+    {
+      fprintf (outfile, "BB [%" PRIu64 "] with num insns: %" PRIu64,
+	       gbb->id, gbb->num_ginsns);
+      fprintf (outfile, " [insns: %u to %u]\n",
+	       gbb->first_ginsn->line, gbb->last_ginsn->line);
+      total_ginsns += gbb->num_ginsns;
+      bb_for_each_edge (gbb, gedge)
+	fprintf (outfile, "  outgoing edge to %" PRIu64 "\n",
+		 gedge->dst_bb->id);
+    }
+  fprintf (outfile, "\nTotal ginsns in all GBBs = %" PRIu64 "\n",
+	   total_ginsns);
+}
+
+void
+frch_ginsn_data_init (const symbolS *func, symbolS *start_addr,
+		      enum ginsn_gen_mode gmode)
+{
+  /* FIXME - error out if prev object is not free'd ?  */
+  frchain_now->frch_ginsn_data = XCNEW (struct frch_ginsn_data);
+
+  frchain_now->frch_ginsn_data->mode = gmode;
+  /* Annotate with the current function symbol.  */
+  frchain_now->frch_ginsn_data->func = func;
+  /* Record the start address symbol supplied by the caller.  */
+  frchain_now->frch_ginsn_data->start_addr = start_addr;
+  /* Assume the set of ginsn are apt for CFG creation, by default.  */
+  frchain_now->frch_ginsn_data->gcfg_apt_p = true;
+
+  frchain_now->frch_ginsn_data->label_ginsn_map = str_htab_create ();
+}
+
+void
+frch_ginsn_data_cleanup (void)
+{
+  ginsnS *ginsn = NULL;
+  ginsnS *next_ginsn = NULL;
+
+  ginsn = frchain_now->frch_ginsn_data->gins_rootP;
+  while (ginsn)
+    {
+      next_ginsn = ginsn->next;
+      ginsn_cleanup (&ginsn);
+      ginsn = next_ginsn;
+    }
+
+  if (frchain_now->frch_ginsn_data->label_ginsn_map)
+    htab_delete (frchain_now->frch_ginsn_data->label_ginsn_map);
+
+  free (frchain_now->frch_ginsn_data);
+  frchain_now->frch_ginsn_data = NULL;
+}
+
+/* Append GINSN to the list of ginsns for the current function being
+   assembled.  */
+
+int
+frch_ginsn_data_append (ginsnS *ginsn)
+{
+  ginsnS *last = NULL;
+  ginsnS *temp = NULL;
+  uint64_t id = 0;
+
+  if (!ginsn)
+    return 1;
+
+  if (frchain_now->frch_ginsn_data->gins_lastP)
+    id = frchain_now->frch_ginsn_data->gins_lastP->id;
+
+  /* Do the necessary preprocessing on the set of input GINSNs:
+       - Update each ginsn with its ID.
+     While you iterate, also keep gcfg_apt_p updated by checking whether any
+     ginsn is inappropriate for GCFG creation.  */
+  temp = ginsn;
+  while (temp)
+    {
+      temp->id = ++id;
+
+      if (ginsn_indirect_jump_p (temp)
+	  || (temp->type == GINSN_TYPE_JUMP
+	      && !ginsn_direct_local_jump_p (temp)))
+	frchain_now->frch_ginsn_data->gcfg_apt_p = false;
+
+      if (listing & LISTING_GINSN_SCFI)
+	listing_newline (ginsn_print (temp));
+
+      /* The input GINSN may be a linked list of multiple ginsns chained
+	 together.  Find the last ginsn in the input chain of ginsns.  */
+      last = temp;
+
+      temp = temp->next;
+    }
+
+  /* Link in the ginsn to the tail.  */
+  if (!frchain_now->frch_ginsn_data->gins_rootP)
+    frchain_now->frch_ginsn_data->gins_rootP = ginsn;
+  else
+    ginsn_link_next (frchain_now->frch_ginsn_data->gins_lastP, ginsn);
+
+  frchain_now->frch_ginsn_data->gins_lastP = last;
+
+  return 0;
+}
+
+enum ginsn_gen_mode
+frch_ginsn_gen_mode (void)
+{
+  enum ginsn_gen_mode gmode = GINSN_GEN_NONE;
+
+  if (frchain_now->frch_ginsn_data)
+    gmode = frchain_now->frch_ginsn_data->mode;
+
+  return gmode;
+}
+
+int
+ginsn_data_begin (const symbolS *func)
+{
+  ginsnS *ginsn;
+
+  /* The previous block of asm must have been processed by now.  */
+  if (frchain_now->frch_ginsn_data)
+    as_bad (_("GINSN process for prev func not done"));
+
+  /* FIXME - hard code the mode to GINSN_GEN_SCFI.
+     This can be changed later when other passes on ginsns are formalised.  */
+  frch_ginsn_data_init (func, symbol_temp_new_now (), GINSN_GEN_SCFI);
+
+  /* Create and insert ginsn with function begin marker.  */
+  ginsn = ginsn_new_symbol_func_begin (func);
+  frch_ginsn_data_append (ginsn);
+
+  return 0;
+}
+
+int
+ginsn_data_end (const symbolS *label)
+{
+  ginsnS *ginsn;
+  gbbS *root_bb;
+  gcfgS *gcfg = NULL;
+  const symbolS *func;
+  int err = 0;
+
+  if (!frchain_now->frch_ginsn_data)
+    return err;
+
+  /* Insert Function end marker.  */
+  ginsn = ginsn_new_symbol_func_end (label);
+  frch_ginsn_data_append (ginsn);
+
+  func = frchain_now->frch_ginsn_data->func;
+
+  /* Build the cfg of ginsn(s) of the function.  */
+  if (!frchain_now->frch_ginsn_data->gcfg_apt_p)
+    {
+      as_warn (_("Untraceable control flow for func '%s'; Skipping SCFI"),
+	       S_GET_NAME (func));
+      goto end;
+    }
+
+  gcfg = gcfg_build (func, &err);
+
+  root_bb = gcfg_get_rootbb (gcfg);
+  if (!root_bb)
+    {
+      as_bad (_("Bad cfg of ginsn of func '%s'"), S_GET_NAME (func));
+      goto end;
+    }
+
+  /* Execute the desired passes on ginsns.  */
+  err = ginsn_pass_execute_scfi (func, gcfg, root_bb);
+  if (err)
+    goto end;
+
+  /* Other passes, e.g., warn for unreachable code can be enabled too.  */
+  ginsn = frchain_now->frch_ginsn_data->gins_rootP;
+  err = ginsn_pass_warn_unreachable_code (func, gcfg, ginsn);
+
+end:
+  if (gcfg)
+    gcfg_cleanup (&gcfg);
+  frch_ginsn_data_cleanup ();
+
+  return err;
+}
+
+/* Add GINSN_TYPE_SYMBOL type ginsn for user-defined labels.  These may be
+   branch targets, and hence are necessary for control flow graph.  */
+
+void
+ginsn_frob_label (const symbolS *label)
+{
+  ginsnS *label_ginsn;
+  const char *file;
+  unsigned int line;
+
+  if (frchain_now->frch_ginsn_data)
+    {
+      /* PS: Note how we keep the actual LABEL symbol as ginsn->sym.
+	 Take care to avoid inadvertent updates or cleanups of symbols.  */
+      label_ginsn = ginsn_new_symbol_user_label (label);
+      /* Keep the location updated.  */
+      file = as_where (&line);
+      ginsn_set_file_line (label_ginsn, file, line);
+
+      frch_ginsn_data_append (label_ginsn);
+
+      label_ginsn_map_insert (label, label_ginsn);
+    }
+}
+
+const symbolS *
+ginsn_data_func_symbol (void)
+{
+  const symbolS *func = NULL;
+
+  if (frchain_now->frch_ginsn_data)
+    func = frchain_now->frch_ginsn_data->func;
+
+  return func;
+}
+
+#else
+
+int
+ginsn_data_begin (const symbolS *func ATTRIBUTE_UNUSED)
+{
+  as_bad (_("ginsn unsupported for target"));
+  return 1;
+}
+
+int
+ginsn_data_end (const symbolS *label ATTRIBUTE_UNUSED)
+{
+  as_bad (_("ginsn unsupported for target"));
+  return 1;
+}
+
+void
+ginsn_frob_label (const symbolS *sym ATTRIBUTE_UNUSED)
+{
+  return;
+}
+
+const symbolS *
+ginsn_data_func_symbol (void)
+{
+  return NULL;
+}
+
+#endif  /* TARGET_USE_GINSN.  */
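
The append logic in frch_ginsn_data_append () above does three things: it continues the ID sequence from the current tail, it walks the incoming chain to find its last node, and it links the chain onto the list. A minimal, standalone sketch of that pattern follows. The names struct node, struct node_list and node_list_append are hypothetical stand-ins and are not part of the patch; only the id and next fields of ginsnS are modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ginsnS: only the fields the append logic uses.  */
struct node
{
  uint64_t id;
  struct node *next;
};

/* Hypothetical stand-in for the gins_rootP / gins_lastP pair.  */
struct node_list
{
  struct node *root;
  struct node *last;
};

/* Append CHAIN (one node, or several nodes already linked together) to LIST.
   Returns 1 if CHAIN is NULL, 0 otherwise.  */
static int
node_list_append (struct node_list *list, struct node *chain)
{
  struct node *temp;
  struct node *tail = NULL;
  uint64_t id = 0;

  assert (list != NULL);
  if (!chain)
    return 1;

  /* Continue numbering from the current tail, if there is one.  */
  if (list->last)
    id = list->last->id;

  /* Number every node in the chain and remember its last node.  */
  for (temp = chain; temp; temp = temp->next)
    {
      temp->id = ++id;
      tail = temp;
    }

  /* Link the chain in at the tail, or make it the root of an empty list.  */
  if (!list->root)
    list->root = chain;
  else
    list->last->next = chain;
  list->last = tail;

  return 0;
}
```

Because the per-node checks run inside the same walk, any predicate evaluated there should look at the node currently being visited rather than at the head of the incoming chain.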
diff --git a/gas/ginsn.h b/gas/ginsn.h
new file mode 100644
index 00000000000..70c2fe57526
--- /dev/null
+++ b/gas/ginsn.h
@@ -0,0 +1,384 @@ 
+/* ginsn.h - GAS instruction representation.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of GAS, the GNU Assembler.
+
+   GAS is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GAS is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GAS; see the file COPYING.  If not, write to the Free
+   Software Foundation, 51 Franklin Street - Fifth Floor, Boston, MA
+   02110-1301, USA.  */
+
+#ifndef GINSN_H
+#define GINSN_H
+
+#include "as.h"
+
+/* Maximum number of source operands of a ginsn.  */
+#define GINSN_NUM_SRC_OPNDS   2
+
+/* A ginsn is printed in the following format:
+      "ginsn: OPCD SRC1, SRC2, DST"
+      "<-5->  <--------125------->"
+   where each of SRC1, SRC2, and DST are in the form:
+      "%rNN,"  (up to 5 chars)
+      "imm,"   (up to int32_t+1 chars)
+      "[%rNN+-imm]," (up to int32_t+9 chars)
+      Hence a max of 19 chars.  */
+
+#define GINSN_LISTING_OPND_LEN	40
+#define GINSN_LISTING_LEN 156
+
+enum ginsn_gen_mode
+{
+  GINSN_GEN_NONE,
+  /* Generate ginsns for program validation passes.  */
+  GINSN_GEN_FVAL,
+  /* Generate ginsns for synthesizing DWARF CFI.  */
+  GINSN_GEN_SCFI,
+};
+
+/* ginsn types.
+
+   GINSN_TYPE_PHANTOM are phantom ginsns.  They are used where there is no real
+   machine instruction counterpart, but a ginsn is needed only to carry
+   information to GAS.  For example, to carry an SCFI Op.
+
+   Note that ginsns do not have push / pop instructions.
+   Instead, following are used:
+      type=GINSN_TYPE_LOAD, src=GINSN_SRC_INDIRECT, REG_SP: Load from stack.
+      type=GINSN_TYPE_STORE, dst=GINSN_DST_INDIRECT, REG_SP: Store to stack.
+*/
+
+#define _GINSN_TYPES \
+  _GINSN_TYPE_ITEM (GINSN_TYPE_SYMBOL, "SYM") \
+  _GINSN_TYPE_ITEM (GINSN_TYPE_PHANTOM, "PHANTOM")  \
+  _GINSN_TYPE_ITEM (GINSN_TYPE_ADD, "ADD")  \
+  _GINSN_TYPE_ITEM (GINSN_TYPE_AND, "AND")  \
+  _GINSN_TYPE_ITEM (GINSN_TYPE_CALL, "CALL") \
+  _GINSN_TYPE_ITEM (GINSN_TYPE_JUMP, "JMP") \
+  _GINSN_TYPE_ITEM (GINSN_TYPE_JUMP_COND, "JCC")  \
+  _GINSN_TYPE_ITEM (GINSN_TYPE_MOV, "MOV")  \
+  _GINSN_TYPE_ITEM (GINSN_TYPE_LOAD, "LOAD")  \
+  _GINSN_TYPE_ITEM (GINSN_TYPE_STORE, "STORE")  \
+  _GINSN_TYPE_ITEM (GINSN_TYPE_RETURN, "RET") \
+  _GINSN_TYPE_ITEM (GINSN_TYPE_SUB, "SUB")  \
+  _GINSN_TYPE_ITEM (GINSN_TYPE_OTHER, "OTH")
+
+enum ginsn_type
+{
+#define _GINSN_TYPE_ITEM(NAME, STR) NAME,
+  _GINSN_TYPES
+#undef _GINSN_TYPE_ITEM
+};
+
+enum ginsn_src_type
+{
+  GINSN_SRC_UNKNOWN,
+  GINSN_SRC_REG,
+  GINSN_SRC_IMM,
+  GINSN_SRC_INDIRECT,
+  GINSN_SRC_SYMBOL,
+};
+
+/* GAS instruction source operand representation.  */
+
+struct ginsn_src
+{
+  enum ginsn_src_type type;
+  /* DWARF register number.  */
+  unsigned int reg;
+  /* Immediate or disp for indirect memory access.  */
+  offsetT immdisp;
+  /* Src symbol.  May be needed for some control flow instructions.  */
+  const symbolS *sym;
+};
+
+enum ginsn_dst_type
+{
+  GINSN_DST_UNKNOWN,
+  GINSN_DST_REG,
+  GINSN_DST_INDIRECT,
+};
+
+/* GAS instruction destination operand representation.  */
+
+struct ginsn_dst
+{
+  enum ginsn_dst_type type;
+  /* DWARF register number.  */
+  unsigned int reg;
+  /* Disp for indirect memory access.  */
+  offsetT disp;
+};
+
+/* Various flags for additional information per GAS instruction.  */
+
+/* Function begin or end symbol.  */
+#define GINSN_F_FUNC_MARKER	    0x1
+/* Identify real or implicit GAS insn.
+   Some targets employ CISC-like instructions.  Multiple ginsn's may be used
+   for a single machine instruction in some ISAs.  For some optimizations,
+   there is need to identify whether a ginsn, e.g., GINSN_TYPE_ADD or
+   GINSN_TYPE_SUB is a result of a user-specified instruction or not.  */
+#define GINSN_F_INSN_REAL	    0x2
+/* Identify if the GAS insn of type GINSN_TYPE_SYMBOL is due to a user-defined
+   label.  Each user-defined label in a function will cause addition of a new
+   ginsn.  This simplifies control flow graph creation.
+   See htab_t label_ginsn_map usage.  */
+#define GINSN_F_USER_LABEL	    0x4
+/* Max bit position for flags (uint32_t).  */
+#define GINSN_F_MAX		    0x20
+
+#define GINSN_F_FUNC_BEGIN_P(ginsn)	    \
+  ((ginsn != NULL)			    \
+   && (ginsn->type == GINSN_TYPE_SYMBOL)    \
+   && (ginsn->flags & GINSN_F_FUNC_MARKER))
+
+/* PS: For ginsn associated with a user-defined symbol location,
+   GINSN_F_FUNC_MARKER is unset, but GINSN_F_USER_LABEL is set.  */
+#define GINSN_F_FUNC_END_P(ginsn)	    \
+  ((ginsn != NULL)			    \
+   && (ginsn->type == GINSN_TYPE_SYMBOL)    \
+   && !(ginsn->flags & GINSN_F_FUNC_MARKER) \
+   && !(ginsn->flags & GINSN_F_USER_LABEL))
+
+#define GINSN_F_INSN_REAL_P(ginsn)	    \
+  ((ginsn != NULL)			    \
+   && (ginsn->flags & GINSN_F_INSN_REAL))
+
+#define GINSN_F_USER_LABEL_P(ginsn)	    \
+  ((ginsn != NULL)			    \
+   && (ginsn->type == GINSN_TYPE_SYMBOL)    \
+   && !(ginsn->flags & GINSN_F_FUNC_MARKER) \
+   && (ginsn->flags & GINSN_F_USER_LABEL))
+
+typedef struct ginsn ginsnS;
+typedef struct scfi_op scfi_opS;
+typedef struct scfi_state scfi_stateS;
+
+/* GAS generic instruction.
+
+   Generic instructions are used by GAS to abstract out the binary machine
+   instructions.  In other words, ginsn is a target/ABI independent internal
+   representation for GAS.  Note that, depending on the target, there may be
+   more than one ginsn per binary machine instruction.
+
+   ginsns can be used by GAS to perform validations, or even generate
+   additional information, like synthesizing DWARF CFI for hand-written asm.  */
+
+struct ginsn
+{
+  enum ginsn_type type;
+  /* GAS instructions are simple instructions with GINSN_NUM_SRC_OPNDS number
+     of source operands and one destination operand at this time.  */
+  struct ginsn_src src[GINSN_NUM_SRC_OPNDS];
+  struct ginsn_dst dst;
+  /* Additional information per instruction.  */
+  uint32_t flags;
+  /* Symbol.  For ginsn of type other than GINSN_TYPE_SYMBOL, this identifies
+     the end of the corresponding machine instruction in the .text segment.
+     These symbols are created anew by the targets and are not used elsewhere
+     in GAS.  The only exception is some ginsns of type GINSN_TYPE_SYMBOL, when
+     generated for the user-defined labels.  See ginsn_frob_label.  */
+  const symbolS *sym;
+  /* Identifier (linearly increasing natural number) for each ginsn.  Used as
+     a proxy for program order of ginsns.  */
+  uint64_t id;
+  /* Location information for user-facing messages.  Only ginsns with
+     GINSN_F_FUNC_BEGIN_P and GINSN_F_FUNC_END_P may present themselves with no
+     file or line information.  */
+  const char *file;
+  unsigned int line;
+
+  /* Information needed for synthesizing CFI.  */
+  scfi_opS **scfi_ops;
+  uint32_t num_scfi_ops;
+
+  /* Flag to keep track of visited instructions for CFG creation.  */
+  bool visited;
+
+  ginsnS *next; /* A linked list.  */
+};
+
+struct ginsn_src *ginsn_get_src1 (ginsnS *ginsn);
+struct ginsn_src *ginsn_get_src2 (ginsnS *ginsn);
+struct ginsn_dst *ginsn_get_dst (ginsnS *ginsn);
+
+unsigned int ginsn_get_src_reg (struct ginsn_src *src);
+enum ginsn_src_type ginsn_get_src_type (struct ginsn_src *src);
+offsetT ginsn_get_src_disp (struct ginsn_src *src);
+offsetT ginsn_get_src_imm (struct ginsn_src *src);
+
+unsigned int ginsn_get_dst_reg (struct ginsn_dst *dst);
+enum ginsn_dst_type ginsn_get_dst_type (struct ginsn_dst *dst);
+offsetT ginsn_get_dst_disp (struct ginsn_dst *dst);
+
+/* Data object for book-keeping information related to GAS generic
+   instructions.  */
+struct frch_ginsn_data
+{
+  /* Mode for GINSN creation.  */
+  enum ginsn_gen_mode mode;
+  /* Head of the list of ginsns.  */
+  ginsnS *gins_rootP;
+  /* Tail of the list of ginsns.  */
+  ginsnS *gins_lastP;
+  /* Function symbol.  */
+  const symbolS *func;
+  /* Start address of the function.  */
+  symbolS *start_addr;
+  /* User-defined label to ginsn mapping.  */
+  htab_t label_ginsn_map;
+  /* Is the list of ginsn apt for creating CFG.  */
+  bool gcfg_apt_p;
+};
+
+int ginsn_data_begin (const symbolS *func);
+int ginsn_data_end (const symbolS *label);
+const symbolS *ginsn_data_func_symbol (void);
+void ginsn_frob_label (const symbolS *sym);
+
+void frch_ginsn_data_init (const symbolS *func, symbolS *start_addr,
+			   enum ginsn_gen_mode gmode);
+void frch_ginsn_data_cleanup (void);
+int frch_ginsn_data_append (ginsnS *ginsn);
+enum ginsn_gen_mode frch_ginsn_gen_mode (void);
+
+void label_ginsn_map_insert (const symbolS *label, ginsnS *ginsn);
+ginsnS *label_ginsn_map_find (const symbolS *label);
+
+ginsnS *ginsn_new_symbol_func_begin (const symbolS *sym);
+ginsnS *ginsn_new_symbol_func_end (const symbolS *sym);
+ginsnS *ginsn_new_symbol_user_label (const symbolS *sym);
+
+ginsnS *ginsn_new_phantom (const symbolS *sym);
+ginsnS *ginsn_new_symbol (const symbolS *sym, bool real_p);
+ginsnS *ginsn_new_add (const symbolS *sym, bool real_p,
+		       enum ginsn_src_type src1_type, unsigned int src1_reg, offsetT src1_disp,
+		       enum ginsn_src_type src2_type, unsigned int src2_reg, offsetT src2_disp,
+		       enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp);
+ginsnS *ginsn_new_and (const symbolS *sym, bool real_p,
+		       enum ginsn_src_type src1_type, unsigned int src1_reg, offsetT src1_disp,
+		       enum ginsn_src_type src2_type, unsigned int src2_reg, offsetT src2_disp,
+		       enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp);
+ginsnS *ginsn_new_call (const symbolS *sym, bool real_p,
+			enum ginsn_src_type src_type, unsigned int src_reg,
+			symbolS *src_text_sym);
+ginsnS *ginsn_new_jump (const symbolS *sym, bool real_p,
+			enum ginsn_src_type src_type, unsigned int src_reg,
+			symbolS *src_ginsn_sym);
+ginsnS *ginsn_new_jump_cond (const symbolS *sym, bool real_p,
+			     enum ginsn_src_type src_type, unsigned int src_reg,
+			     symbolS *src_ginsn_sym);
+ginsnS *ginsn_new_mov (const symbolS *sym, bool real_p,
+		       enum ginsn_src_type src_type, unsigned int src_reg, offsetT src_disp,
+		       enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp);
+ginsnS *ginsn_new_store (const symbolS *sym, bool real_p,
+			 enum ginsn_src_type src_type, unsigned int src_reg,
+			 enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp);
+ginsnS *ginsn_new_load (const symbolS *sym, bool real_p,
+			enum ginsn_src_type src_type, unsigned int src_reg, offsetT src_disp,
+			enum ginsn_dst_type dst_type, unsigned int dst_reg);
+ginsnS *ginsn_new_sub (const symbolS *sym, bool real_p,
+		       enum ginsn_src_type src1_type, unsigned int src1_reg, offsetT src1_disp,
+		       enum ginsn_src_type src2_type, unsigned int src2_reg, offsetT src2_disp,
+		       enum ginsn_dst_type dst_type, unsigned int dst_reg, offsetT dst_disp);
+ginsnS *ginsn_new_other (const symbolS *sym, bool real_p,
+			 enum ginsn_src_type src1_type, unsigned int src1_val,
+			 enum ginsn_src_type src2_type, unsigned int src2_val,
+			 enum ginsn_dst_type dst_type, unsigned int dst_reg);
+ginsnS *ginsn_new_return (const symbolS *sym, bool real_p);
+
+void ginsn_set_where (ginsnS *ginsn);
+
+bool ginsn_track_reg_p (unsigned int dw2reg, enum ginsn_gen_mode);
+
+int ginsn_link_next (ginsnS *ginsn, ginsnS *next);
+
+enum gcfg_err_code
+{
+  GCFG_OK = 0,
+  GCFG_JLABEL_NOT_PRESENT = 1, /* Warning-level code.  */
+};
+
+typedef struct gbb gbbS;
+typedef struct gedge gedgeS;
+
+/* GBB - Basic block of generic GAS instructions.  */
+
+struct gbb
+{
+  ginsnS *first_ginsn;
+  ginsnS *last_ginsn;
+  uint64_t num_ginsns;
+
+  /* Identifier (linearly increasing natural number) for each gbb.  Added for
+     debugging purpose only.  */
+  uint64_t id;
+
+  bool visited;
+
+  uint32_t num_out_gedges;
+  gedgeS *out_gedges;
+
+  /* Members for SCFI purposes.  */
+  /* SCFI state at the entry of basic block.  */
+  scfi_stateS *entry_state;
+  /* SCFI state at the exit of basic block.  */
+  scfi_stateS *exit_state;
+
+  /* A linked list.  In order of addition.  */
+  gbbS *next;
+};
+
+struct gedge
+{
+  gbbS *dst_bb;
+  /* A linked list.  In order of addition.  */
+  gedgeS *next;
+  bool visited;
+};
+
+/* Control flow graph of generic GAS instructions.  */
+
+struct gcfg
+{
+  uint64_t num_gbbs;
+  gbbS *root_bb;
+};
+
+typedef struct gcfg gcfgS;
+
+#define bb_for_each_insn(bb, ginsn)  \
+  for (ginsn = bb->first_ginsn; ginsn; \
+       ginsn = (ginsn != bb->last_ginsn) ? ginsn->next : NULL)
+
+#define bb_for_each_edge(bb, edge) \
+  for (edge = (edge == NULL) ? bb->out_gedges : edge; edge; edge = edge->next)
+
+#define cfg_for_each_bb(cfg, bb) \
+  for (bb = cfg->root_bb; bb; bb = bb->next)
+
+#define bb_get_first_ginsn(bb)	  \
+  (bb->first_ginsn)
+
+#define bb_get_last_ginsn(bb)	  \
+  (bb->last_ginsn)
+
+gcfgS *gcfg_build (const symbolS *func, int *errp);
+void gcfg_cleanup (gcfgS **gcfg);
+void gcfg_print (const gcfgS *gcfg, FILE *outfile);
+gbbS *gcfg_get_rootbb (gcfgS *gcfg);
+void gcfg_get_bbs_in_prog_order (gcfgS *gcfg, gbbS **prog_order_bbs);
+
+#endif /* GINSN_H.  */
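
A note on bb_for_each_insn above: a basic block does not own a separate list. It marks a [first_ginsn, last_ginsn] range inside the function's single linked list of ginsns, so the iteration has to stop at last_ginsn rather than at a NULL next pointer. The sketch below illustrates that with reduced, hypothetical types (struct insn, struct block, block_for_each_insn, block_count_insns are illustrative names, not part of the patch). It also parenthesizes the macro arguments, which is an assumption about style and not something the patch does.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-ins for ginsnS and gbbS.  */
struct insn
{
  int id;
  struct insn *next;
};

struct block
{
  struct insn *first;
  struct insn *last;
};

/* Same shape as bb_for_each_insn, with the arguments parenthesized.  */
#define block_for_each_insn(bb, insn) \
  for ((insn) = (bb)->first; (insn); \
       (insn) = ((insn) != (bb)->last) ? (insn)->next : NULL)

/* Count only the instructions that belong to BB, even though the last
   instruction of BB may still point at the first one of the next block.  */
static int
block_count_insns (const struct block *bb)
{
  struct insn *insn;
  int count = 0;

  block_for_each_insn (bb, insn)
    count++;

  return count;
}
```

With three chained instructions and a block covering only the first two, the count is two; the third instruction is reachable through next but lies outside the block.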
diff --git a/gas/listing.h b/gas/listing.h
index a2cc1f51958..ff89f2219dd 100644
--- a/gas/listing.h
+++ b/gas/listing.h
@@ -29,6 +29,7 @@ 
 #define LISTING_NOCOND    32
 #define LISTING_MACEXP    64
 #define LISTING_GENERAL  128
+#define LISTING_GINSN_SCFI  256
 
 #define LISTING_DEFAULT    (LISTING_LISTING | LISTING_HLL | LISTING_SYMBOLS)
 
diff --git a/gas/read.c b/gas/read.c
index 0ae6d830e9c..7eebe009339 100644
--- a/gas/read.c
+++ b/gas/read.c
@@ -42,6 +42,7 @@ 
 #include "codeview.h"
 #include "wchar.h"
 #include "filenames.h"
+#include "ginsn.h"
 
 #include <limits.h>
 
@@ -1384,6 +1385,9 @@  read_a_source_file (const char *name)
     }
 #endif
 
+  if (flag_synth_cfi)
+    ginsn_data_end (symbol_temp_new_now ());
+
 #ifdef md_cleanup
   md_cleanup ();
 #endif
@@ -4198,6 +4202,12 @@  cons_worker (int nbytes,	/* 1=.byte, 2=.word, 4=.long.  */
 
   if (flag_mri)
     mri_comment_end (stop, stopc);
+
+  /* Disallow hand-crafting instructions using .byte.  FIXME - what about
+     .word, .long etc ?  */
+  if (flag_synth_cfi && frchain_now && frchain_now->frch_ginsn_data
+      && nbytes == 1)
+    as_bad (_("SCFI: hand-crafting instructions not supported"));
 }
 
 void
diff --git a/gas/scfi.c b/gas/scfi.c
new file mode 100644
index 00000000000..37cc85cfed7
--- /dev/null
+++ b/gas/scfi.c
@@ -0,0 +1,1232 @@ 
+/* scfi.c - Support for synthesizing DWARF CFI for hand-written asm.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of GAS, the GNU Assembler.
+
+   GAS is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GAS is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GAS; see the file COPYING.  If not, write to the Free
+   Software Foundation, 51 Franklin Street - Fifth Floor, Boston, MA
+   02110-1301, USA.  */
+
+#include "as.h"
+#include "scfi.h"
+#include "subsegs.h"
+#include "scfidw2gen.h"
+#include "dw2gencfi.h"
+
+#if defined (TARGET_USE_SCFI) && defined (TARGET_USE_GINSN)
+
+/* Beyond the target defined number of registers to be tracked
+   (SCFI_MAX_REG_ID), keep the next register ID, in sequence, for REG_CFA.  */
+#define REG_CFA	      (SCFI_MAX_REG_ID+1)
+/* Define the total number of registers being tracked.
+   Used as index into an array of cfi_reglocS.  Note that a ginsn may carry a
+   register number greater than MAX_NUM_SCFI_REGS, e.g., for the ginsns
+   corresponding to push fs/gs in AMD64.  */
+#define MAX_NUM_SCFI_REGS   (REG_CFA+1)
+
+#define REG_INVALID	    ((unsigned int)-1)
+
+enum cfi_reglocstate
+{
+  CFI_UNDEFINED,
+  CFI_IN_REG,
+  CFI_ON_STACK
+};
+
+/* Location at which CFI register is saved.
+
+   A CFI register (callee-saved registers, RA/LR) is always at an offset from
+   the CFA.  REG_CFA itself, however, may have REG_SP or REG_FP as base
+   register.  Hence, keep the base reg ID and offset per tracked register.  */
+
+struct cfi_regloc
+{
+  /* Base reg ID (DWARF register number).  */
+  unsigned int base;
+  /* Location as offset from the CFA.  */
+  offsetT offset;
+  /* Current state of the CFI register.  */
+  enum cfi_reglocstate state;
+};
+
+typedef struct cfi_regloc cfi_reglocS;
+
+struct scfi_op_data
+{
+  const char *name;
+};
+
+typedef struct scfi_op_data scfi_op_dataS;
+
+/* SCFI operation.
+
+   An SCFI operation represents a single atomic change to the SCFI state.
+   This can also be understood as an abstraction for what eventually gets
+   emitted as a DWARF CFI operation.  */
+
+struct scfi_op
+{
+  /* An SCFI op updates the state of either the CFA or other tracked
+     (callee-saved, REG_SP etc) registers.  'reg' is in the DWARF register
+     number space and must be strictly less than MAX_NUM_SCFI_REGS.  */
+  unsigned int reg;
+  /* Location of the reg.  */
+  cfi_reglocS loc;
+  /* DWARF CFI opcode.  */
+  uint32_t dw2cfi_op;
+  /* Some SCFI ops, e.g., for CFI_label, may need to carry additional data.  */
+  scfi_op_dataS *op_data;
+  /* A linked list.  */
+  struct scfi_op *next;
+};
+
+/* SCFI State - accumulated unwind information at a PC.
+
+   SCFI state is the accumulated unwind information encompassing:
+      - REG_SP, REG_FP,
+      - RA, and
+      - all callee-saved registers.
+
+    Note that SCFI_MAX_REG_ID is target/ABI dependent and is provided by the
+    backends.  The backend must also identify the DWARF register numbers for
+    the REG_SP, and REG_FP registers.  */
+
+struct scfi_state
+{
+  cfi_reglocS regs[MAX_NUM_SCFI_REGS];
+  cfi_reglocS scratch[MAX_NUM_SCFI_REGS];
+  /* Current stack size.  */
+  offsetT stack_size;
+  /* Whether the stack size is known.
+     Stack size may become untraceable depending on the specific stack
+     manipulation machine instruction, e.g., rsp = rsp op reg instruction
+     makes the stack size untraceable.  */
+  bool traceable_p;
+};
+
+/* Initialize a new SCFI op.  */
+
+static scfi_opS *
+init_scfi_op (void)
+{
+  scfi_opS *op = XCNEW (scfi_opS);
+
+  return op;
+}
+
+/* Free the SCFI ops, given the HEAD of the list.  */
+
+void
+scfi_ops_cleanup (scfi_opS **head)
+{
+  scfi_opS *op;
+  scfi_opS *next;
+
+  if (!head || !*head)
+    return;
+
+  op = *head;
+  next = op->next;
+
+  while (op)
+    {
+      free (op);
+      op = next;
+      next = op ? op->next : NULL;
+    }
+}
+
+/* Compare two SCFI states.  */
+
+static int
+cmp_scfi_state (scfi_stateS *state1, scfi_stateS *state2)
+{
+  int ret;
+
+  if (!state1 || !state2)
+    return 1;
+
+  /* Skip comparing the scratch[] value of registers.  The user visible
+     unwind information is derived from the regs[] from the SCFI state.  */
+  ret = memcmp (state1->regs, state2->regs,
+		sizeof (cfi_reglocS) * MAX_NUM_SCFI_REGS);
+
+  /* For user functions which perform dynamic stack allocation, after switching
+     to REG_FP based CFA tracking, it is perfectly possible to have stack usage
+     in some control flows.  However, double-checking that all control flows
+     have the same idea of CFA tracking before this won't hurt.  */
+  gas_assert (state1->regs[REG_CFA].base == state2->regs[REG_CFA].base);
+  if (state1->regs[REG_CFA].base == REG_SP)
+    ret |= state1->stack_size != state2->stack_size;
+
+  ret |= state1->traceable_p != state2->traceable_p;
+
+  return ret;
+}
+
+#if 0
+static void
+scfi_state_update_reg (scfi_stateS *state, uint32_t dst, uint32_t base,
+		       int32_t offset)
+{
+  if (dst >= MAX_NUM_SCFI_REGS)
+    return;
+
+  state->regs[dst].base = base;
+  state->regs[dst].offset = offset;
+}
+#endif
+
+/* Update the SCFI state of REG as available on execution stack at OFFSET
+   from REG_CFA (BASE).
+
+   Note that BASE must be REG_CFA, because any other base (REG_SP, REG_FP)
+   is by definition transitory in the function.  */
+
+static void
+scfi_state_save_reg (scfi_stateS *state, unsigned int reg, unsigned int base,
+		     offsetT offset)
+{
+  if (reg >= MAX_NUM_SCFI_REGS)
+    return;
+
+  gas_assert (base == REG_CFA);
+
+  state->regs[reg].base = base;
+  state->regs[reg].offset = offset;
+  state->regs[reg].state = CFI_ON_STACK;
+}
+
+static void
+scfi_state_restore_reg (scfi_stateS *state, unsigned int reg)
+{
+  if (reg >= MAX_NUM_SCFI_REGS)
+    return;
+
+  /* Sanity check.  See Rule 4.  */
+  gas_assert (state->regs[reg].state == CFI_ON_STACK);
+  gas_assert (state->regs[reg].base == REG_CFA);
+
+  state->regs[reg].base = reg;
+  state->regs[reg].offset = 0;
+  /* PS: the register may still be on stack much after the restore, but the
+     SCFI state keeps the state as 'in register'.  */
+  state->regs[reg].state = CFI_IN_REG;
+}
+
+/* Identify if the given GAS instruction GINSN saves a register
+   (of interest) on stack.  */
+
+static bool
+ginsn_scfi_save_reg_p (ginsnS *ginsn, scfi_stateS *state)
+{
+  bool save_reg_p = false;
+  struct ginsn_src *src;
+  struct ginsn_dst *dst;
+
+  src = ginsn_get_src1 (ginsn);
+  dst = ginsn_get_dst (ginsn);
+
+  /* Only the first save of a callee-saved register to the stack is deemed
+     a register save.  */
+  if (!ginsn_track_reg_p (ginsn_get_src_reg (src), GINSN_GEN_SCFI)
+      || state->regs[ginsn_get_src_reg (src)].state == CFI_ON_STACK)
+    return save_reg_p;
+
+  /* A register save insn may be an indirect mov.  */
+  if (ginsn->type == GINSN_TYPE_MOV
+      && ginsn_get_dst_type (dst) == GINSN_DST_INDIRECT
+      && (ginsn_get_dst_reg (dst) == REG_SP
+	  || (ginsn_get_dst_reg (dst) == REG_FP
+	      && state->regs[REG_CFA].base == REG_FP)))
+    save_reg_p = true;
+  /* or an explicit store to stack.  */
+  else if (ginsn->type == GINSN_TYPE_STORE
+	   && ginsn_get_dst_type (dst) == GINSN_DST_INDIRECT
+	   && ginsn_get_dst_reg (dst) == REG_SP)
+    save_reg_p = true;
+
+  return save_reg_p;
+}
+
+/* Identify if the given GAS instruction GINSN restores a register
+   (of interest) from stack.  */
+
+static bool
+ginsn_scfi_restore_reg_p (ginsnS *ginsn, scfi_stateS *state)
+{
+  bool restore_reg_p = false;
+  struct ginsn_dst *dst;
+  struct ginsn_src *src1;
+
+  dst = ginsn_get_dst (ginsn);
+  src1 = ginsn_get_src1 (ginsn);
+
+  if (!ginsn_track_reg_p (ginsn_get_dst_reg (dst), GINSN_GEN_SCFI))
+    return restore_reg_p;
+
+  /* A register restore insn may be an indirect mov...  */
+  if (ginsn->type == GINSN_TYPE_MOV
+      && ginsn_get_src_type (src1) == GINSN_SRC_INDIRECT
+      && (ginsn_get_src_reg (src1) == REG_SP
+	  || (ginsn_get_src_reg (src1) == REG_FP
+	      && state->regs[REG_CFA].base == REG_FP)))
+    restore_reg_p = true;
+  /* ...or an explicit load from stack.  */
+  else if (ginsn->type == GINSN_TYPE_LOAD
+	   && ginsn_get_src_type (src1) == GINSN_SRC_INDIRECT
+	   && ginsn_get_src_reg (src1) == REG_SP)
+    restore_reg_p = true;
+
+  return restore_reg_p;
+}
+
+/* Append the SCFI operation OP to the list of SCFI operations in the
+   given GINSN.  */
+
+static int
+ginsn_append_scfi_op (ginsnS *ginsn, scfi_opS *op)
+{
+  scfi_opS *sop;
+
+  if (!ginsn || !op)
+    return 1;
+
+  if (!ginsn->scfi_ops)
+    {
+      ginsn->scfi_ops = XCNEW (scfi_opS *);
+      *ginsn->scfi_ops = op;
+    }
+  else
+    {
+      /* Add to tail.  Most ginsns have a single SCFI operation,
+	 so this traversal for every insertion is acceptable for now.  */
+      sop = *ginsn->scfi_ops;
+      while (sop->next)
+	sop = sop->next;
+
+      sop->next = op;
+    }
+  ginsn->num_scfi_ops++;
+
+  return 0;
+}
+
+static void
+scfi_op_add_def_cfa_reg (scfi_stateS *state, ginsnS *ginsn, unsigned int reg)
+{
+  scfi_opS *op = NULL;
+
+  state->regs[REG_CFA].base = reg;
+
+  op = init_scfi_op ();
+
+  op->dw2cfi_op = DW_CFA_def_cfa_register;
+  op->reg = REG_CFA;
+  op->loc = state->regs[REG_CFA];
+
+  ginsn_append_scfi_op (ginsn, op);
+}
+
+static void
+scfi_op_add_cfa_offset_inc (scfi_stateS *state, ginsnS *ginsn, offsetT num)
+{
+  scfi_opS *op = NULL;
+
+  state->regs[REG_CFA].offset -= num;
+
+  op = init_scfi_op ();
+
+  op->dw2cfi_op = DW_CFA_def_cfa_offset;
+  op->reg = REG_CFA;
+  op->loc = state->regs[REG_CFA];
+
+  ginsn_append_scfi_op (ginsn, op);
+}
+
+static void
+scfi_op_add_cfa_offset_dec (scfi_stateS *state, ginsnS *ginsn, offsetT num)
+{
+  scfi_opS *op = NULL;
+
+  state->regs[REG_CFA].offset += num;
+
+  op = init_scfi_op ();
+
+  op->dw2cfi_op = DW_CFA_def_cfa_offset;
+  op->reg = REG_CFA;
+  op->loc = state->regs[REG_CFA];
+
+  ginsn_append_scfi_op (ginsn, op);
+}
+
+static void
+scfi_op_add_def_cfa (scfi_stateS *state, ginsnS *ginsn, unsigned int reg,
+		     offsetT num)
+{
+  scfi_opS *op = NULL;
+
+  state->regs[REG_CFA].base = reg;
+  state->regs[REG_CFA].offset = num;
+
+  op = init_scfi_op ();
+
+  op->dw2cfi_op = DW_CFA_def_cfa;
+  op->reg = REG_CFA;
+  op->loc = state->regs[REG_CFA];
+
+  ginsn_append_scfi_op (ginsn, op);
+}
+
+static void
+scfi_op_add_cfi_offset (scfi_stateS *state, ginsnS *ginsn, unsigned int reg)
+{
+  scfi_opS *op = NULL;
+
+  op = init_scfi_op ();
+
+  op->dw2cfi_op = DW_CFA_offset;
+  op->reg = reg;
+  op->loc = state->regs[reg];
+
+  ginsn_append_scfi_op (ginsn, op);
+}
+
+static void
+scfi_op_add_cfa_restore (ginsnS *ginsn, unsigned int reg)
+{
+  scfi_opS *op = NULL;
+
+  op = init_scfi_op ();
+
+  op->dw2cfi_op = DW_CFA_restore;
+  op->reg = reg;
+  op->loc.base = REG_INVALID;
+  op->loc.offset = 0;
+
+  ginsn_append_scfi_op (ginsn, op);
+}
+
+static void
+scfi_op_add_cfi_remember_state (ginsnS *ginsn)
+{
+  scfi_opS *op = NULL;
+
+  op = init_scfi_op ();
+
+  op->dw2cfi_op = DW_CFA_remember_state;
+
+  ginsn_append_scfi_op (ginsn, op);
+}
+
+static void
+scfi_op_add_cfi_restore_state (ginsnS *ginsn)
+{
+  scfi_opS *op = NULL;
+
+  op = init_scfi_op ();
+
+  op->dw2cfi_op = DW_CFA_restore_state;
+
+  /* FIXME - add to the beginning of the scfi_ops.  */
+  ginsn_append_scfi_op (ginsn, op);
+}
+
+void
+scfi_op_add_cfi_label (ginsnS *ginsn, const char *name)
+{
+  scfi_opS *op = NULL;
+
+  op = init_scfi_op ();
+  op->dw2cfi_op = CFI_label;
+  op->op_data = XCNEW (scfi_op_dataS);
+  op->op_data->name = name;
+
+  ginsn_append_scfi_op (ginsn, op);
+}
+
+void
+scfi_op_add_signal_frame (ginsnS *ginsn)
+{
+  scfi_opS *op = NULL;
+
+  op = init_scfi_op ();
+  op->dw2cfi_op = CFI_signal_frame;
+
+  ginsn_append_scfi_op (ginsn, op);
+}
+
+static int
+verify_heuristic_traceable_reg_fp (ginsnS *ginsn, scfi_stateS *state)
+{
+  /* Set when REG_FP becomes untraceable, to report the error right away.  */
+  int fp_traceable_p = 0;
+  struct ginsn_dst *dst;
+  struct ginsn_src *src1;
+  struct ginsn_src *src2;
+
+  src1 = ginsn_get_src1 (ginsn);
+  src2 = ginsn_get_src2 (ginsn);
+  dst = ginsn_get_dst (ginsn);
+
+  /* Stack manipulation can be done in a variety of ways.  A program may
+     allocate stack statically or may perform dynamic stack allocation in
+     the prologue.
+
+     The SCFI machinery in GAS is based on some heuristics:
+
+       - Rule 3 If the base register for CFA tracking is REG_FP, the program
+       must not clobber REG_FP, unless it is to switch to REG_SP based CFA
+       tracking (via say, a pop %rbp in X86).  */
+
+  /* Check all applicable instructions with dest REG_FP, when the CFA base
+     register is REG_FP.  */
+  if (state->regs[REG_CFA].base == REG_FP && ginsn_get_dst_reg (dst) == REG_FP)
+    {
+      /* Excuse the add/sub with imm usage: They are OK.  */
+      if ((ginsn->type == GINSN_TYPE_ADD || ginsn->type == GINSN_TYPE_SUB)
+	  && ginsn_get_src_reg (src1) == REG_FP
+	  && ginsn_get_src_type (src2) == GINSN_SRC_IMM)
+	fp_traceable_p = 0;
+      /* REG_FP restore is OK too.  */
+      else if (ginsn->type == GINSN_TYPE_LOAD)
+	fp_traceable_p = 0;
+      /* mov's to memory with REG_FP base do not make REG_FP untraceable.  */
+      else if (ginsn_get_dst_type (dst) == GINSN_DST_INDIRECT
+	       && (ginsn->type == GINSN_TYPE_MOV
+		   || ginsn->type == GINSN_TYPE_STORE))
+	fp_traceable_p = 0;
+      /* Manipulations of the values possibly on stack are OK too.  */
+      else if ((ginsn->type == GINSN_TYPE_ADD || ginsn->type == GINSN_TYPE_SUB
+		|| ginsn->type == GINSN_TYPE_AND)
+	       && ginsn_get_dst_type (dst) == GINSN_DST_INDIRECT)
+	fp_traceable_p = 0;
+      /* All other ginsns with REG_FP as destination make REG_FP not
+	 traceable.  */
+      else
+	fp_traceable_p = 1;
+    }
+
+  if (fp_traceable_p)
+    as_bad_where (ginsn->file, ginsn->line,
+		  _("SCFI: usage of REG_FP as scratch not supported"));
+
+  return fp_traceable_p;
+}
+
+static int
+verify_heuristic_traceable_stack_manipulation (ginsnS *ginsn,
+					       scfi_stateS *state)
+{
+  /* When set, an error is issued to the user right away.  */
+  int sp_untraceable_p = 0;
+  bool possibly_untraceable = false;
+  struct ginsn_dst *dst;
+  struct ginsn_src *src1;
+  struct ginsn_src *src2;
+
+  src1 = ginsn_get_src1 (ginsn);
+  src2 = ginsn_get_src2 (ginsn);
+  dst = ginsn_get_dst (ginsn);
+
+  /* Stack manipulation can be done in a variety of ways.  A program may
+     allocate stack statically in prologue or may need to do dynamic stack
+     allocation.
+
+     The SCFI machinery in GAS is based on some heuristics:
+
+       - Rule 1 The base register for CFA tracking may be either REG_SP or
+       REG_FP.
+
+       - Rule 2 If the base register for CFA tracking is REG_SP, the precise
+       amount of stack usage (and hence, the value of rsp) must be known at
+       all times.  */
+
+  if (ginsn->type == GINSN_TYPE_MOV
+      && ginsn_get_dst_type (dst) == GINSN_DST_REG
+      && ginsn_get_dst_reg (dst) == REG_SP
+      && ginsn_get_src_type (src1) == GINSN_SRC_REG
+      /* Exclude mov %rbp, %rsp from this check.  */
+      && ginsn_get_src_reg (src1) != REG_FP)
+    {
+      /* mov %reg, %rsp.  */
+      /* A mov %rsp, %reg must have been seen earlier for this to be an
+	 acceptable stack manipulation.  */
+      if (state->scratch[ginsn_get_src_reg (src1)].base != REG_CFA
+	  || state->scratch[ginsn_get_src_reg (src1)].state != CFI_IN_REG)
+	{
+	  possibly_untraceable = true;
+	}
+    }
+  /* Check add/sub/and insn usage when CFA base register is REG_SP.
+     Any stack size manipulation, including stack realignment is not allowed
+     if CFA base register is REG_SP.  */
+  else if (ginsn_get_dst_type (dst) == GINSN_DST_REG
+	   && ginsn_get_dst_reg (dst) == REG_SP
+	   && (((ginsn->type == GINSN_TYPE_ADD || ginsn->type == GINSN_TYPE_SUB)
+		&& ginsn_get_src_type (src2) != GINSN_SRC_IMM)
+	       || ginsn->type == GINSN_TYPE_AND
+	       || ginsn->type == GINSN_TYPE_OTHER))
+    possibly_untraceable = true;
+  /* If a register save operation is seen when REG_SP is untraceable,
+     CFI cannot be synthesized for register saves, hence bail out.  */
+  else if (ginsn_scfi_save_reg_p (ginsn, state) && !state->traceable_p)
+    {
+      sp_untraceable_p = 1;
+      /* If, however, the register save is a REG_FP-based, indirect mov
+	 like: mov reg, disp(%rbp) and CFA base register is REG_FP,
+	 untraceable REG_SP is not a problem.  */
+      if (ginsn->type == GINSN_TYPE_MOV
+	  && ginsn_get_dst_type (dst) == GINSN_DST_INDIRECT
+	  && (ginsn_get_dst_reg (dst) == REG_FP
+	      && state->regs[REG_CFA].base == REG_FP))
+	sp_untraceable_p = 0;
+    }
+  else if (ginsn_scfi_restore_reg_p (ginsn, state) && !state->traceable_p)
+    {
+      if (ginsn->type == GINSN_TYPE_MOV
+	  && ginsn_get_dst_type (dst) == GINSN_DST_INDIRECT
+	  && (ginsn_get_src_reg (src1) == REG_SP
+	      || (ginsn_get_src_reg (src1) == REG_FP
+		  && state->regs[REG_CFA].base != REG_FP)))
+	sp_untraceable_p = 1;
+    }
+
+  if (possibly_untraceable)
+    {
+      /* See Rule 2.  For SP-based CFA, this makes CFA tracking not possible.
+	 Propagate now to caller.  */
+      if (state->regs[REG_CFA].base == REG_SP)
+	sp_untraceable_p = 1;
+      else if (state->traceable_p)
+	{
+	  /* An extension of Rule 2.
+	     For FP-based CFA, this may be a problem *if* certain specific
+	     changes to the SCFI state are seen beyond this point, e.g.,
+	     register save / restore from stack.  */
+	  gas_assert (state->regs[REG_CFA].base == REG_FP);
+	  /* Simply make a note in the SCFI state object for now and
+	     continue.  Indicate an error when register save / restore
+	     for callee-saved registers is seen.  */
+	  sp_untraceable_p = 0;
+	  state->traceable_p = false;
+	}
+    }
+
+  if (sp_untraceable_p)
+    as_bad_where (ginsn->file, ginsn->line,
+		  _("SCFI: unsupported stack manipulation pattern"));
+
+  return sp_untraceable_p;
+}
+
+static int
+verify_heuristic_symmetrical_restore_reg (scfi_stateS *state, ginsnS *ginsn)
+{
+  int sym_restore = true;
+  offsetT expected_offset = 0;
+  struct ginsn_src *src1;
+  struct ginsn_dst *dst;
+  unsigned int reg;
+
+  /* Rule 4: Save and restore of callee-saved registers must be symmetrical.
+     It is expected that the value of the saved register is restored
+     correctly.  E.g.,
+	push  reg1
+	push  reg2
+	...
+	body of func which uses reg1, reg2 as scratch,
+	and may even spill them to the stack.
+	...
+	pop   reg2
+	pop   reg1
+     It is difficult to verify Rule 4 in all cases, because the SCFI
+     machinery cannot easily separate the prologue and epilogue from the
+     body of the function.
+     Hence, at this time the SCFI machinery only warns on an asymmetrical
+     restore.  */
+  src1 = ginsn_get_src1 (ginsn);
+  dst = ginsn_get_dst (ginsn);
+  reg = ginsn_get_dst_reg (dst);
+
+  /* For non callee-saved registers, calling the API is meaningless.  */
+  if (!ginsn_track_reg_p (ginsn_get_dst_reg (dst), GINSN_GEN_SCFI))
+    return sym_restore;
+
+  /* The register must have been saved on stack, for sure.  */
+  gas_assert (state->regs[reg].state == CFI_ON_STACK);
+  gas_assert (state->regs[reg].base == REG_CFA);
+
+  if ((ginsn->type == GINSN_TYPE_MOV
+       || ginsn->type == GINSN_TYPE_LOAD)
+      && ginsn_get_src_type (src1) == GINSN_SRC_INDIRECT
+      && (ginsn_get_src_reg (src1) == REG_SP
+	  || (ginsn_get_src_reg (src1) == REG_FP
+	      && state->regs[REG_CFA].base == REG_FP)))
+    {
+      /* mov disp(%rsp), reg.  */
+      /* mov disp(%rbp), reg.  */
+      expected_offset = (((ginsn_get_src_reg (src1) == REG_SP)
+			  ? -state->stack_size
+			  : state->regs[REG_FP].offset)
+			 + ginsn_get_src_disp (src1));
+    }
+
+  sym_restore = (expected_offset == state->regs[reg].offset);
+
+  return sym_restore;
+}
+
+/* Perform symbolic execution of the GINSN and update its list of scfi_ops.
+   scfi_ops are later used to directly generate the DWARF CFI directives.
+   Also update the SCFI state object STATE for the caller.  */
+
+static int
+gen_scfi_ops (ginsnS *ginsn, scfi_stateS *state)
+{
+  int ret = 0;
+  offsetT offset;
+  struct ginsn_src *src1;
+  struct ginsn_src *src2;
+  struct ginsn_dst *dst;
+
+  if (!ginsn || !state)
+    return 1;
+
+  /* For the first ginsn (of type GINSN_TYPE_SYMBOL) in the gbb, generate
+     the SCFI op with DW_CFA_def_cfa.  Note that the register and offset are
+     target-specific.  */
+  if (GINSN_F_FUNC_BEGIN_P (ginsn))
+    {
+      scfi_op_add_def_cfa (state, ginsn, REG_SP, SCFI_INIT_CFA_OFFSET);
+      state->stack_size += SCFI_INIT_CFA_OFFSET;
+      return ret;
+    }
+
+  src1 = ginsn_get_src1 (ginsn);
+  src2 = ginsn_get_src2 (ginsn);
+  dst = ginsn_get_dst (ginsn);
+
+  ret = verify_heuristic_traceable_stack_manipulation (ginsn, state);
+  if (ret)
+    return ret;
+
+  ret = verify_heuristic_traceable_reg_fp (ginsn, state);
+  if (ret)
+    return ret;
+
+  switch (ginsn->dst.type)
+    {
+    case GINSN_DST_REG:
+      switch (ginsn->type)
+	{
+	case GINSN_TYPE_MOV:
+	  if (ginsn_get_src_type (src1) == GINSN_SRC_REG
+	      && ginsn_get_src_reg (src1) == REG_SP
+	      && ginsn_get_dst_reg (dst) == REG_FP
+	      && state->regs[REG_CFA].base == REG_SP)
+	    {
+	      /* mov %rsp, %rbp.  */
+	      scfi_op_add_def_cfa_reg (state, ginsn, ginsn_get_dst_reg (dst));
+	    }
+	  else if (ginsn_get_src_type (src1) == GINSN_SRC_REG
+		   && ginsn_get_src_reg (src1) == REG_FP
+		   && ginsn_get_dst_reg (dst) == REG_SP
+		   && state->regs[REG_CFA].base == REG_FP)
+	    {
+	      /* mov %rbp, %rsp.  */
+	      state->stack_size = -state->regs[REG_FP].offset;
+	      scfi_op_add_def_cfa_reg (state, ginsn, ginsn_get_dst_reg (dst));
+	      state->traceable_p = true;
+	    }
+	  else if (ginsn_get_src_type (src1) == GINSN_SRC_INDIRECT
+		   && (ginsn_get_src_reg (src1) == REG_SP
+		       || ginsn_get_src_reg (src1) == REG_FP)
+		   && ginsn_track_reg_p (ginsn_get_dst_reg (dst), GINSN_GEN_SCFI))
+	    {
+	      /* mov disp(%rsp), reg.  */
+	      /* mov disp(%rbp), reg.  */
+	      if (verify_heuristic_symmetrical_restore_reg (state, ginsn))
+		{
+		  scfi_state_restore_reg (state, ginsn_get_dst_reg (dst));
+		  scfi_op_add_cfa_restore (ginsn, ginsn_get_dst_reg (dst));
+		}
+	      else
+		as_warn_where (ginsn->file, ginsn->line,
+			       _("SCFI: asymetrical register restore"));
+	    }
+	  else if (ginsn_get_src_type (src1) == GINSN_SRC_REG
+		   && ginsn_get_dst_type (dst) == GINSN_DST_REG
+		   && ginsn_get_src_reg (src1) == REG_SP)
+	    {
+	      /* mov %rsp, %reg.  */
+	      /* The value of rsp is taken directly from state->stack_size.
+		 IMP: The workflow in gen_scfi_ops must keep it updated.
+		 PS: Not taking the value from state->scratch[REG_SP] is
+		 intentional.  */
+	      state->scratch[ginsn_get_dst_reg (dst)].base = REG_CFA;
+	      state->scratch[ginsn_get_dst_reg (dst)].offset = -state->stack_size;
+	      state->scratch[ginsn_get_dst_reg (dst)].state = CFI_IN_REG;
+	    }
+	  else if (ginsn_get_src_type (src1) == GINSN_SRC_REG
+		   && ginsn_get_dst_type (dst) == GINSN_DST_REG
+		   && ginsn_get_dst_reg (dst) == REG_SP)
+	    {
+	      /* mov %reg, %rsp.  */
+	      /* Keep the value of REG_SP updated.  */
+	      if (state->scratch[ginsn_get_src_reg (src1)].state == CFI_IN_REG)
+		{
+		  state->stack_size = -state->scratch[ginsn_get_src_reg (src1)].offset;
+		  state->traceable_p = true;
+		}
+#if 0
+	      scfi_state_update_reg (state, ginsn_get_dst_reg (dst),
+				     state->scratch[ginsn_get_src_reg (src1)].base,
+				     state->scratch[ginsn_get_src_reg (src1)].offset);
+#endif
+
+	    }
+	  break;
+	case GINSN_TYPE_SUB:
+	  if (ginsn_get_src_reg (src1) == REG_SP
+	      && ginsn_get_dst_reg (dst) == REG_SP)
+	    {
+	      /* Stack inc/dec offset, when generated due to stack push and pop is
+		 target-specific.  Use the value encoded in the ginsn.  */
+	      state->stack_size += ginsn_get_src_imm (src2);
+	      if (state->regs[REG_CFA].base == REG_SP)
+		{
+		  /* push reg.  */
+		  scfi_op_add_cfa_offset_dec (state, ginsn, ginsn_get_src_imm (src2));
+		}
+	    }
+	  break;
+	case GINSN_TYPE_ADD:
+	  if (ginsn_get_src_reg (src1) == REG_SP
+	      && ginsn_get_dst_reg (dst) == REG_SP)
+	    {
+	      /* Stack inc/dec offset is target-specific.  Use the value
+		 encoded in the ginsn.  */
+	      state->stack_size -= ginsn_get_src_imm (src2);
+	      /* pop %reg affects CFA offset only if CFA is currently
+		 stack-pointer based.  */
+	      if (state->regs[REG_CFA].base == REG_SP)
+		{
+		  scfi_op_add_cfa_offset_inc (state, ginsn, ginsn_get_src_imm (src2));
+		}
+	    }
+	  else if (ginsn_get_src_reg (src1) == REG_FP
+		   && ginsn_get_dst_reg (dst) == REG_SP
+		   && state->regs[REG_CFA].base == REG_FP)
+	    {
+	      /* FIXME - what is this for ? */
+	      state->stack_size = 0 - (state->regs[REG_FP].offset + ginsn_get_src_imm (src2));
+	    }
+	  break;
+	case GINSN_TYPE_LOAD:
+	  /* If this is a load from stack.  */
+	  if (ginsn_get_src_type (src1) == GINSN_SRC_INDIRECT
+	      && (ginsn_get_src_reg (src1) == REG_SP
+		  || (ginsn_get_src_reg (src1) == REG_FP
+		      && state->regs[REG_CFA].base == REG_FP)))
+	    {
+	      /* pop %rbp when CFA tracking is REG_FP based.  */
+	      if (ginsn_get_dst_reg (dst) == REG_FP
+		  && state->regs[REG_CFA].base == REG_FP)
+		{
+		  scfi_op_add_def_cfa_reg (state, ginsn, REG_SP);
+		  if (state->regs[REG_CFA].offset != state->stack_size)
+		    scfi_op_add_cfa_offset_inc (state, ginsn,
+						(state->regs[REG_CFA].offset - state->stack_size));
+		}
+	      if (ginsn_track_reg_p (ginsn_get_dst_reg (dst), GINSN_GEN_SCFI))
+		{
+		  if (verify_heuristic_symmetrical_restore_reg (state, ginsn))
+		    {
+		      scfi_state_restore_reg (state, ginsn_get_dst_reg (dst));
+		      scfi_op_add_cfa_restore (ginsn, ginsn_get_dst_reg (dst));
+		    }
+		  else
+		    as_warn_where (ginsn->file, ginsn->line,
+				   _("SCFI: asymetrical register restore"));
+		}
+	    }
+	  break;
+	default:
+	  break;
+	}
+      break;
+
+    case GINSN_DST_INDIRECT:
+      /* Some operations with an indirect access to memory (or even to stack)
+	 may still be uninteresting for SCFI purposes (e.g., addl %edx,
+	 -32(%rsp) in x86).  In case of x86_64, these can neither be a
+	 register save / restore, nor can they alter the stack size.
+	 PS: This condition may need to be revisited for other arches.  */
+      if (ginsn->type == GINSN_TYPE_ADD || ginsn->type == GINSN_TYPE_SUB
+	  || ginsn->type == GINSN_TYPE_AND)
+	break;
+      gas_assert (ginsn->type == GINSN_TYPE_MOV
+		  || ginsn->type == GINSN_TYPE_STORE
+		  || ginsn->type == GINSN_TYPE_LOAD);
+      /* mov reg, disp(%rbp) */
+      /* mov reg, disp(%rsp) */
+      if (ginsn_scfi_save_reg_p (ginsn, state))
+	{
+	  if (ginsn_get_dst_reg (dst) == REG_SP)
+	    {
+	      /* mov reg, disp(%rsp) */
+	      offset = 0 - state->stack_size + ginsn_get_dst_disp (dst);
+	      scfi_state_save_reg (state, ginsn_get_src_reg (src1), REG_CFA, offset);
+	      scfi_op_add_cfi_offset (state, ginsn, ginsn_get_src_reg (src1));
+	    }
+	  else if (ginsn_get_dst_reg (dst) == REG_FP)
+	    {
+	      gas_assert (state->regs[REG_CFA].base == REG_FP);
+	      /* mov reg, disp(%rbp) */
+	      offset = 0 - state->regs[REG_CFA].offset + ginsn_get_dst_disp (dst);
+	      scfi_state_save_reg (state, ginsn_get_src_reg (src1), REG_CFA, offset);
+	      scfi_op_add_cfi_offset (state, ginsn, ginsn_get_src_reg (src1));
+	    }
+	}
+      break;
+
+    default:
+      /* Skip GINSN_DST_UNKNOWN and GINSN_DST_MEM as they are uninteresting
+	 currently for SCFI.  */
+      break;
+    }
+
+  return ret;
+}
+
+/* Recursively perform forward flow of the (unwind information) SCFI state
+   starting at basic block GBB.
+
+   The forward flow process propagates the SCFI state at exit of a basic block
+   to the successor basic block.
+
+   Returns error code, if any.  */
+
+static int
+forward_flow_scfi_state (gcfgS *gcfg, gbbS *gbb, scfi_stateS *state)
+{
+  ginsnS *ginsn;
+  gbbS *prev_bb;
+  gedgeS *gedge = NULL;
+  int ret = 0;
+
+  if (gbb->visited)
+    {
+      /* Check that the SCFI state is the same as previous.  */
+      ret = cmp_scfi_state (state, gbb->entry_state);
+      if (ret)
+	as_bad (_("SCFI: Bad CFI propagation perhaps"));
+      return ret;
+    }
+
+  gbb->visited = true;
+
+  gbb->entry_state = XCNEW (scfi_stateS);
+  memcpy (gbb->entry_state, state, sizeof (scfi_stateS));
+
+  /* Perform symbolic execution of each ginsn in the gbb and update the
+     scfi_ops list of each ginsn (and also update the STATE object).  */
+  bb_for_each_insn (gbb, ginsn)
+    {
+      ret = gen_scfi_ops (ginsn, state);
+      if (ret)
+	goto fail;
+    }
+
+  gbb->exit_state = XCNEW (scfi_stateS);
+  memcpy (gbb->exit_state, state, sizeof (scfi_stateS));
+
+  /* Forward flow the SCFI state.  Currently, we process the next basic block
+     in DFS order.  But any forward traversal order should be fine.  */
+  prev_bb = gbb;
+  if (gbb->num_out_gedges)
+    {
+      bb_for_each_edge (gbb, gedge)
+	{
+	  gbb = gedge->dst_bb;
+	  if (gbb->visited)
+	    {
+	      ret = cmp_scfi_state (gbb->entry_state, state);
+	      if (ret)
+		goto fail;
+	    }
+
+	  if (!gedge->visited)
+	    {
+	      gedge->visited = true;
+
+	      /* Entry SCFI state for the destination bb of the edge is the
+		 same as the exit SCFI state of the source bb of the edge.  */
+	      memcpy (state, prev_bb->exit_state, sizeof (scfi_stateS));
+	      ret = forward_flow_scfi_state (gcfg, gbb, state);
+	      if (ret)
+		goto fail;
+	    }
+	}
+    }
+
+  return 0;
+
+fail:
+
+  if (gedge)
+    gedge->visited = true;
+  return 1;
+}
+
+static int
+backward_flow_scfi_state (const symbolS *func ATTRIBUTE_UNUSED, gcfgS *gcfg)
+{
+  gbbS **prog_order_bbs;
+  gbbS **restore_bbs;
+  gbbS *current_bb;
+  gbbS *prev_bb;
+  gbbS *dst_bb;
+  ginsnS *ginsn;
+  gedgeS *gedge = NULL;
+
+  int ret = 0;
+  uint64_t i, j;
+
+  /* Basic blocks in program order; they are traversed in reverse below.  */
+  prog_order_bbs = XCNEWVEC (gbbS *, gcfg->num_gbbs);
+  /* Basic blocks for which CFI remember op needs to be generated.  */
+  restore_bbs = XCNEWVEC (gbbS *, gcfg->num_gbbs);
+
+  gcfg_get_bbs_in_prog_order (gcfg, prog_order_bbs);
+
+  i = gcfg->num_gbbs - 1;
+  /* Traverse in reverse program order.  */
+  while (i > 0)
+    {
+      current_bb = prog_order_bbs[i];
+      prev_bb = prog_order_bbs[i-1];
+      if (cmp_scfi_state (prev_bb->exit_state, current_bb->entry_state))
+	{
+	  /* Candidate for .cfi_restore_state found.  */
+	  ginsn = bb_get_first_ginsn (current_bb);
+	  scfi_op_add_cfi_restore_state (ginsn);
+	  /* Memorize current_bb now to find location for its remember state
+	     later.  */
+	  restore_bbs[i] = current_bb;
+	}
+      else
+	{
+	  bb_for_each_edge (current_bb, gedge)
+	    {
+	      dst_bb = gedge->dst_bb;
+	      for (j = 0; j < gcfg->num_gbbs; j++)
+		if (restore_bbs[j] == dst_bb)
+		  {
+		    ginsn = bb_get_last_ginsn (current_bb);
+		    scfi_op_add_cfi_remember_state (ginsn);
+		    /* Remove the memorised restore_bb from the list.  */
+		    restore_bbs[j] = NULL;
+		    break;
+		  }
+	    }
+	}
+      i--;
+    }
+
+  /* All .cfi_restore_state pseudo-ops must have a corresponding
+     .cfi_remember_state by now.  */
+  for (j = 0; j < gcfg->num_gbbs; j++)
+    if (restore_bbs[j] != NULL)
+      {
+	ret = 1;
+	break;
+      }
+
+  free (restore_bbs);
+  free (prog_order_bbs);
+
+  return ret;
+}
+
+/* Synthesize DWARF CFI for a function.  */
+
+int
+scfi_synthesize_dw2cfi (const symbolS *func, gcfgS *gcfg, gbbS *root_bb)
+{
+  int ret;
+  scfi_stateS *init_state;
+
+  init_state = XCNEW (scfi_stateS);
+  init_state->traceable_p = true;
+
+  /* Traverse the input GCFG and perform forward flow of information.
+     Update the scfi_op(s) per ginsn.  */
+  ret = forward_flow_scfi_state (gcfg, root_bb, init_state);
+  if (ret)
+    {
+      as_bad (_("SCFI: forward pass failed for func '%s'"), S_GET_NAME (func));
+      goto end;
+    }
+
+  ret = backward_flow_scfi_state (func, gcfg);
+  if (ret)
+    {
+      as_bad (_("SCFI: backward pass failed for func '%s'"), S_GET_NAME (func));
+      goto end;
+    }
+
+end:
+  free (init_state);
+  return ret;
+}
+
+static int
+handle_scfi_dot_cfi (ginsnS *ginsn)
+{
+  scfi_opS *op;
+
+  /* Nothing to do.  */
+  if (!ginsn->scfi_ops)
+    return 0;
+
+  op = *ginsn->scfi_ops;
+  if (!op)
+    goto bad;
+
+  while (op)
+    {
+      switch (op->dw2cfi_op)
+	{
+	case DW_CFA_def_cfa_register:
+	  scfi_dot_cfi (DW_CFA_def_cfa_register, op->loc.base, 0, 0, NULL,
+			ginsn->sym);
+	  break;
+	case DW_CFA_def_cfa_offset:
+	  scfi_dot_cfi (DW_CFA_def_cfa_offset, op->loc.base, 0,
+			op->loc.offset, NULL, ginsn->sym);
+	  break;
+	case DW_CFA_def_cfa:
+	  scfi_dot_cfi (DW_CFA_def_cfa, op->loc.base, 0, op->loc.offset,
+			NULL, ginsn->sym);
+	  break;
+	case DW_CFA_offset:
+	  scfi_dot_cfi (DW_CFA_offset, op->reg, 0, op->loc.offset, NULL,
+			ginsn->sym);
+	  break;
+	case DW_CFA_restore:
+	  scfi_dot_cfi (DW_CFA_restore, op->reg, 0, 0, NULL, ginsn->sym);
+	  break;
+	case DW_CFA_remember_state:
+	  scfi_dot_cfi (DW_CFA_remember_state, 0, 0, 0, NULL, ginsn->sym);
+	  break;
+	case DW_CFA_restore_state:
+	  scfi_dot_cfi (DW_CFA_restore_state, 0, 0, 0, NULL, ginsn->sym);
+	  break;
+	case CFI_label:
+	  scfi_dot_cfi (CFI_label, 0, 0, 0, op->op_data->name, ginsn->sym);
+	  break;
+	case CFI_signal_frame:
+	  scfi_dot_cfi (CFI_signal_frame, 0, 0, 0, NULL, ginsn->sym);
+	  break;
+	default:
+	  goto bad;
+	  break;
+	}
+      op = op->next;
+    }
+
+  return 0;
+bad:
+  as_bad (_("SCFI: Invalid DWARF CFI opcode data"));
+  return 1;
+}
+
+/* Emit Synthesized DWARF CFI.  */
+
+int
+scfi_emit_dw2cfi (const symbolS *func)
+{
+  struct frch_ginsn_data *frch_gdata;
+  ginsnS *ginsn = NULL;
+
+  frch_gdata = frchain_now->frch_ginsn_data;
+  ginsn = frch_gdata->gins_rootP;
+
+  while (ginsn)
+    {
+      switch (ginsn->type)
+	{
+	  case GINSN_TYPE_SYMBOL:
+	    /* .cfi_startproc and .cfi_endproc pseudo-ops.  */
+	    if (GINSN_F_FUNC_BEGIN_P (ginsn))
+	      {
+		scfi_dot_cfi_startproc (frch_gdata->start_addr);
+		break;
+	      }
+	    else if (GINSN_F_FUNC_END_P (ginsn))
+	      {
+		scfi_dot_cfi_endproc (ginsn->sym);
+		break;
+	      }
+	    /* Fall through.  */
+	  case GINSN_TYPE_ADD:
+	  case GINSN_TYPE_AND:
+	  case GINSN_TYPE_CALL:
+	  case GINSN_TYPE_JUMP:
+	  case GINSN_TYPE_JUMP_COND:
+	  case GINSN_TYPE_MOV:
+	  case GINSN_TYPE_LOAD:
+	  case GINSN_TYPE_PHANTOM:
+	  case GINSN_TYPE_STORE:
+	  case GINSN_TYPE_SUB:
+	  case GINSN_TYPE_OTHER:
+	  case GINSN_TYPE_RETURN:
+
+	    /* For all other SCFI ops, invoke the handler.  */
+	    if (ginsn->scfi_ops)
+	      handle_scfi_dot_cfi (ginsn);
+	    break;
+
+	  default:
+	    /* No other GINSN_TYPE_* expected.  */
+	    as_bad (_("SCFI: bad ginsn for func '%s'"),
+		    S_GET_NAME (func));
+	    break;
+	}
+      ginsn = ginsn->next;
+    }
+  return 0;
+}
+
+#else
+
+int
+scfi_emit_dw2cfi (const symbolS *func ATTRIBUTE_UNUSED)
+{
+  as_bad (_("SCFI: unsupported for target"));
+  return 1;
+}
+
+int
+scfi_synthesize_dw2cfi (const symbolS *func ATTRIBUTE_UNUSED,
+			gcfgS *gcfg ATTRIBUTE_UNUSED,
+			gbbS *root_bb ATTRIBUTE_UNUSED)
+{
+  as_bad (_("SCFI: unsupported for target"));
+  return 1;
+}
+
+#endif  /* defined (TARGET_USE_SCFI) && defined (TARGET_USE_GINSN).  */
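
The stack-size bookkeeping done by gen_scfi_ops () above can be illustrated with a small standalone model.  This is only a sketch under simplifying assumptions: the toy_* names, the struct layout and the hard-coded initial CFA offset of 8 (x86-64) are hypothetical and are not part of the patch; register saves and the traceability checks are left out.

```c
#include <assert.h>

/* Hypothetical, simplified model; not the scfi_stateS from the patch.  */
enum toy_reg { TOY_REG_SP, TOY_REG_FP };

struct toy_state
{
  enum toy_reg cfa_base;	/* Register the CFA is computed from.  */
  long cfa_offset;		/* CFA = cfa_base + cfa_offset.  */
  long stack_size;		/* Bytes between the CFA and current %rsp.  */
};

/* Function entry (assumed x86-64): the return address is already on the
   stack, so CFA = %rsp + 8.  */
static void
toy_func_begin (struct toy_state *s)
{
  s->cfa_base = TOY_REG_SP;
  s->cfa_offset = 8;
  s->stack_size = 8;
}

/* sub $imm, %rsp; also the stack-adjusting half of a push.  */
static void
toy_sub_sp (struct toy_state *s, long imm)
{
  s->stack_size += imm;
  /* The CFA offset changes only while the CFA is %rsp-based.  */
  if (s->cfa_base == TOY_REG_SP)
    s->cfa_offset += imm;
}

/* add $imm, %rsp; also the stack-adjusting half of a pop.  */
static void
toy_add_sp (struct toy_state *s, long imm)
{
  assert (imm <= s->stack_size);
  s->stack_size -= imm;
  if (s->cfa_base == TOY_REG_SP)
    s->cfa_offset -= imm;
}

/* mov %rsp, %rbp: switch CFA tracking to the frame pointer.  */
static void
toy_mov_sp_to_fp (struct toy_state *s)
{
  if (s->cfa_base == TOY_REG_SP)
    s->cfa_base = TOY_REG_FP;
}

/* The load half of pop %rbp while the CFA is %rbp-based: switch back to
   %rsp-based tracking, using the stack size known at this point.  */
static void
toy_load_fp_from_stack (struct toy_state *s)
{
  if (s->cfa_base == TOY_REG_FP)
    {
      s->cfa_base = TOY_REG_SP;
      s->cfa_offset = s->stack_size;
    }
}
```

For a prologue of push %rbp; mov %rsp, %rbp; sub $32, %rsp, the model would be driven as toy_func_begin, toy_sub_sp (8), toy_mov_sp_to_fp, toy_sub_sp (32).  Once the CFA is %rbp-based, further %rsp adjustments change only stack_size, which is why the patch keeps stack_size separate from the CFA offset.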
diff --git a/gas/scfi.h b/gas/scfi.h
new file mode 100644
index 00000000000..07abe99ab27
--- /dev/null
+++ b/gas/scfi.h
@@ -0,0 +1,38 @@ 
+/* scfi.h - Support for synthesizing CFI for asm.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of GAS, the GNU Assembler.
+
+   GAS is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GAS is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GAS; see the file COPYING.  If not, write to the Free
+   Software Foundation, 51 Franklin Street - Fifth Floor, Boston, MA
+   02110-1301, USA.  */
+
+#ifndef SCFI_H
+#define SCFI_H
+
+#include "as.h"
+#include "ginsn.h"
+
+void scfi_ops_cleanup (scfi_opS **head);
+
+/* Some SCFI ops are not synthesized and are only added externally when parsing
+   the assembler input.  Two examples are CFI_label and CFI_signal_frame.  */
+void scfi_op_add_cfi_label (ginsnS *ginsn, const char *name);
+void scfi_op_add_signal_frame (ginsnS *ginsn);
+
+int scfi_emit_dw2cfi (const symbolS *func);
+
+int scfi_synthesize_dw2cfi (const symbolS *func, gcfgS *gcfg, gbbS *root_bb);
+
+#endif /* SCFI_H.  */
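
The backward pass behind scfi_synthesize_dw2cfi () pairs every .cfi_restore_state with an earlier .cfi_remember_state.  The effect of that pairing on a consumer of the unwind information can be sketched as a small stack of saved rows.  This is an illustrative model only: the toy_* names and the fixed depth are hypothetical, and it assumes that the CFA rule is saved and restored together with the rest of the row.

```c
#include <assert.h>

#define TOY_MAX_DEPTH 8

/* One row of unwind rules, reduced to the CFA rule for brevity.  */
struct toy_row
{
  int cfa_reg;
  long cfa_offset;
};

struct toy_unwinder
{
  struct toy_row cur;			/* Rules currently in effect.  */
  struct toy_row saved[TOY_MAX_DEPTH];	/* Implicit remember stack.  */
  int depth;
};

/* DW_CFA_remember_state: push a copy of the current rules.  */
static void
toy_remember_state (struct toy_unwinder *u)
{
  assert (u->depth < TOY_MAX_DEPTH);
  u->saved[u->depth++] = u->cur;
}

/* DW_CFA_restore_state: pop the most recently remembered rules.  */
static void
toy_restore_state (struct toy_unwinder *u)
{
  assert (u->depth > 0);
  u->cur = u->saved[--u->depth];
}
```

A restore without a preceding remember trips the assertion in toy_restore_state (); the corresponding situation in the patch is reported as a failure of the backward pass.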
diff --git a/gas/scfidw2gen.c b/gas/scfidw2gen.c
index 45fe7787653..651a7885cc6 100644
--- a/gas/scfidw2gen.c
+++ b/gas/scfidw2gen.c
@@ -19,6 +19,8 @@ 
    02110-1301, USA.  */
 
 #include "as.h"
+#include "ginsn.h"
+#include "scfi.h"
 #include "dw2gencfi.h"
 #include "subsegs.h"
 #include "scfidw2gen.h"
@@ -43,15 +45,33 @@  dot_scfi_ignore (int ignored ATTRIBUTE_UNUSED)
 static void
 scfi_process_cfi_label (void)
 {
-  /* To be implemented. */
-  return;
+  char *name;
+  ginsnS *ginsn;
+
+  name = read_symbol_name ();
+  if (name == NULL)
+    return;
+
+  /* Add a new ginsn.  */
+  ginsn = ginsn_new_phantom (symbol_temp_new_now ());
+  frch_ginsn_data_append (ginsn);
+
+  scfi_op_add_cfi_label (ginsn, name);
+  /* TODO: NAME is not freed here, as the scfi_op added above keeps a
+     pointer to it.  */
+
+  demand_empty_rest_of_line ();
 }
 
 static void
 scfi_process_cfi_signal_frame (void)
 {
-  /* To be implemented.  */
-  return;
+  ginsnS *ginsn;
+
+  ginsn = ginsn_new_phantom (symbol_temp_new_now ());
+  frch_ginsn_data_append (ginsn);
+
+  scfi_op_add_signal_frame (ginsn);
 }
 
 static void
diff --git a/gas/subsegs.c b/gas/subsegs.c
index a74db52637a..9e940a72eda 100644
--- a/gas/subsegs.c
+++ b/gas/subsegs.c
@@ -130,6 +130,7 @@  subseg_set_rest (segT seg, subsegT subseg)
       newP->frch_frag_now = frag_alloc (&newP->frch_obstack);
       newP->frch_frag_now->fr_type = rs_fill;
       newP->frch_cfi_data = NULL;
+      newP->frch_ginsn_data = NULL;
 
       newP->frch_root = newP->frch_last = newP->frch_frag_now;
 
diff --git a/gas/subsegs.h b/gas/subsegs.h
index 2bc7adacc56..55e1a2626a2 100644
--- a/gas/subsegs.h
+++ b/gas/subsegs.h
@@ -40,6 +40,7 @@ 
 #include "obstack.h"
 
 struct frch_cfi_data;
+struct frch_ginsn_data;
 
 struct frchain			/* control building of a frag chain */
 {				/* FRCH = FRagment CHain control */
@@ -52,6 +53,7 @@  struct frchain			/* control building of a frag chain */
   struct obstack frch_obstack;	/* for objects in this frag chain */
   fragS *frch_frag_now;		/* frag_now for this subsegment */
   struct frch_cfi_data *frch_cfi_data;
+  struct frch_ginsn_data *frch_ginsn_data;
 };
 
 typedef struct frchain frchainS;
diff --git a/gas/symbols.c b/gas/symbols.c
index b07ed244142..117410fd1a0 100644
--- a/gas/symbols.c
+++ b/gas/symbols.c
@@ -25,6 +25,7 @@ 
 #include "obstack.h"		/* For "symbols.h" */
 #include "subsegs.h"
 #include "write.h"
+#include "scfi.h"
 
 #include <limits.h>
 #ifndef CHAR_BIT
@@ -709,6 +710,8 @@  colon (/* Just seen "x:" - rattle symbols & frags.  */
 #ifdef obj_frob_label
   obj_frob_label (symbolP);
 #endif
+  if (flag_synth_cfi)
+    ginsn_frob_label (symbolP);
 
   return symbolP;
 }