RFC: Prevent disassembly beyond symbolic boundaries

  Hi Guys,

  Currently objdump will disassemble beyond a symbolic boundary if it
  needs extra bytes to decode an instruction.  For example (with x86):

        .file   "foo.c"
        .text
        .globl  foo
        .type   foo, @function
    foo:
        .byte 0x24
        .byte 0x2f
        .byte 0x83
        .size   foo, .-foo

        .globl bar
        .type bar, @function
    bar:
        .byte 0x0f
        .byte 0xba
        .byte 0xe2
        .byte 0x03
        .size   bar, .-bar

  This will disassemble as:

    0000000000000000 <foo>:
       0:   24 2f                   and    $0x2f,%al
       2:   83 0f ba                orl    $0xffffffba,(%rdi)

    0000000000000003 <bar>:
       3:   0f ba e2 03             bt     $0x3,%edx

  Note how the instruction decoded at address 0x2 has stolen two bytes
  from "bar", but these bytes are also decoded (correctly this time) as
  part of the first instruction of bar.

  I have a patch (attached) which changes this behaviour, so that the
  disassembly would be:

       0:  24 2f                   and    $0x2f,%al
       2:  83                      .byte 0x83

    00000003 <bar>:
       3:  0f ba e2 03             bt     $0x3,%edx

  The patch works by adding an extra field to the end of the
  disassemble_info structure, setting it inside objdump's
  disassemble_bytes function and then checking it in the opcode
  library's buffer_read_memory function.  This means that other users of
  the opcodes library will not be affected by the change.
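
  A minimal sketch of the idea, not the attached patch itself: the
  structure below is a cut-down, hypothetical stand-in for
  disassemble_info, and the names sketch_info, sketch_read_memory and
  stop_vma are assumptions made for illustration.  The caller sets
  stop_vma to the address of the next symbol, and the read routine
  refuses any read that starts before that address and ends after it.
  A zero stop_vma means "no limit", which is why callers that never set
  the field see no change.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for disassemble_info.  */
struct sketch_info
{
  const unsigned char *buffer;  /* Bytes of the section being disassembled.  */
  unsigned long buffer_vma;     /* Address of buffer[0].  */
  size_t buffer_length;         /* Number of bytes in buffer.  */
  unsigned long stop_vma;       /* Assumed new field; 0 means "no limit".  */
};

/* Loosely modelled on buffer_read_memory: copy LENGTH bytes starting at
   MEMADDR into OUT.  Returns 0 on success and EIO on failure.  */
static int
sketch_read_memory (unsigned long memaddr, unsigned char *out,
                    size_t length, const struct sketch_info *info)
{
  unsigned long end_addr = memaddr + length;
  unsigned long buf_end = info->buffer_vma + info->buffer_length;

  /* Ordinary bounds check against the buffer.  */
  if (memaddr < info->buffer_vma || end_addr > buf_end)
    return EIO;

  /* The extra check: reject a read that starts before the symbolic
     boundary and would run past it.  */
  if (info->stop_vma != 0
      && memaddr < info->stop_vma
      && end_addr > info->stop_vma)
    return EIO;

  memcpy (out, info->buffer + (memaddr - info->buffer_vma), length);
  return 0;
}
```

  With the seven bytes from the example above and stop_vma set to 3, a
  three-byte read at address 2 fails, while a one-byte read at the same
  address succeeds and returns 0x83.  A decoder that sees the failed
  read can then fall back to emitting the byte as data.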

  The patch only enables this feature when disassembling code
  sections; data sections are unaffected.  I have not added a new
  test for the feature since the gas/i386/x86_64-opcode-inval test
  already covers this.

  What do people think?  To me this seems like a good idea, but I am
  willing to consider alternative suggestions if people have them.

Cheers
  Nick

PS. The patch only affects objdump.  So if you disassemble a binary
  using GDB for example, the old behaviour will still be seen.  Changing
  GDB's behaviour is possible, although it would be quite a big 
  job as there are lots of different places where the disassembler part
  of the opcodes library is called and memory is read.
