[v5] gdb/DAP Fix disassemble bug

Message ID 20230712184519.182460-1-simon.farre.cx@gmail.com
State New
Headers
Series [v5] gdb/DAP Fix disassemble bug |

Commit Message

Simon Farre July 12, 2023, 6:45 p.m. UTC
  v5. Added new tests that does the following

Disassemble 20 instructions, set breakpoint at ins addr of the 11th
continue; disassemble offset=-5, count=10 and compare the first 5, to
the 5 before the 11th. They should match.

v4. Added tests & doc comments to clarify

v3.

This adds the ability to "disassemble backwards" by retrieving "known
sources of truth" by using the gdb.Block data type, Block.start is always a
valid position in executable code, no? If not, then gdb.Block and all
code that relates to gdb.Block and it's underlying type, must also be
invalid, so I assume here that Block.start is always "good", as it were.

It would probably be nice to have a test for this if we don't already,
but from my dog fooding experience with Midas+VSCode+GDB-DAP, it *seems* to be
correct. But looks can be deceiving, I guess.

v1.
Fixes disassembleRequest

The field instructionOffset can be negative. Previous patch made it so
that sometimes the request got calculated to 0 instructions, when it
meant to retrieve disasm for -50 to 0 (current position).
---
 gdb/python/lib/gdb/dap/disassemble.py |  61 ++++++++++++++-
 gdb/testsuite/gdb.dap/disassemble.c   |  59 +++++++++++++++
 gdb/testsuite/gdb.dap/disassemble.exp | 102 ++++++++++++++++++++++++++
 3 files changed, 219 insertions(+), 3 deletions(-)
 create mode 100644 gdb/testsuite/gdb.dap/disassemble.c
 create mode 100644 gdb/testsuite/gdb.dap/disassemble.exp
  

Comments

Tom Tromey July 28, 2023, 6:22 p.m. UTC | #1
>>>>> "Simon" == Simon Farre via Gdb-patches <gdb-patches@sourceware.org> writes:

Thanks again for working on this.
I'm sorry it took so long for me to get back to it.

Simon> +# Disassemble "backwards"
Simon> +def _disasm_backwards(

This should be marked @in_gdb_thread.

Simon> +    # We take an arbitrary jump backwards
Simon> +    # Guessing that an instruction averages at 8 bytes
Simon> +    start = end_pc - 4 * offset

I don't understand the relation between the comment and the code
here... I thought it was just a typo but later 8 is used, so I don't
know.

Simon> +            tmp = []
Simon> +            # fill remaining ins with "invalid values" as per DAP spec
Simon> +            for i in range(0, offset - len(instructions)):
Simon> +                tmp.append({"addr": 0, "asm": "unknown"})

I guess this could be:

    tmp = [{"addr": 0, "asm": "unknown"}] * (offset - len(instructions))

Tom
  

Patch

diff --git a/gdb/python/lib/gdb/dap/disassemble.py b/gdb/python/lib/gdb/dap/disassemble.py
index bc091eb2c89..ff8cbc9e57a 100644
--- a/gdb/python/lib/gdb/dap/disassemble.py
+++ b/gdb/python/lib/gdb/dap/disassemble.py
@@ -18,17 +18,72 @@  import gdb
 from .server import request, capability
 from .startup import send_gdb_with_response, in_gdb_thread
 
+# Disassemble "backwards"
+def _disasm_backwards(
+    arch: gdb.Architecture, end_pc: int, offset: int, requested_count: int
+):
+    """END_PC is the last PC which we should disassemble to
+    OFFSET is the offset (in instructions) backwards we want to get
+    REQUESTED_COUNT is the number of instructions asked for in the DAP request."""
+    ins_at_pc = arch.disassemble(start_pc=end_pc)[0]
+    offset = abs(offset)
+    # We take an arbitrary jump backwards
+    # Guessing that an instruction averages at 8 bytes
+    start = end_pc - 4 * offset
+    instructions = []
+    while len(instructions) < (offset + 1):
+        block = gdb.current_progspace().block_for_pc(start)
+        # Fail fast; if we can't find a block backwards
+        # fill all with "invalid values"
+        if block is None:
+            tmp = []
+            # fill remaining ins with "invalid values" as per DAP spec
+            for i in range(0, offset - len(instructions)):
+                tmp.append({"addr": 0, "asm": "unknown"})
+            instructions = tmp + instructions
+            # we don't have to break out early, as the while loop
+            # predicate will now return false, but we do so for clarity
+            break
+        else:
+            ins = arch.disassemble(start_pc=block.start, end_pc=end_pc)
+            instructions = ins + instructions
+        start = start - 8 * (offset - len(instructions))
+        end_pc = block.start
+
+    # Disassembled instructions count is >= OFFSET+1
+    diff = len(instructions) - offset
+    result = instructions[diff : diff + requested_count]
+    # DAP seem to not want the actual instruction *at* end_pc
+    # when disassembling backwards
+    if result[-1]["addr"] == ins_at_pc["addr"]:
+        result.pop()
+        result = [instructions[diff - 1]] + result
+    return result[:requested_count]
+
 
 @in_gdb_thread
-def _disassemble(pc, skip_insns, count):
+def _disassemble(pc, instructionOffset, count):
     try:
         arch = gdb.selected_frame().architecture()
     except gdb.error:
         # Maybe there was no frame.
         arch = gdb.selected_inferior().architecture()
     result = []
-    total_count = skip_insns + count
-    for elt in arch.disassemble(pc, count=total_count)[skip_insns:]:
+    if instructionOffset < 0:
+        ins = _disasm_backwards(arch, pc, instructionOffset, count)
+        instructionOffset = 0
+        count = count - len(ins)
+        result = [
+            {"address": hex(elt["addr"]), "instruction": elt["asm"]} for elt in ins
+        ]
+    # Examples of different requests, once we get here:
+    # If the request was for offset=-200, count=50, we have those now,
+    # i.e, instructions -200 to -150. (count = 0, so below does nothing)
+    # If the request was for -50, 100, we have 50 more ins to disassemble,
+    # i.e remaining instructions are 0..50
+    for elt in arch.disassemble(pc, count=count + instructionOffset)[
+        instructionOffset:
+    ]:
         result.append(
             {
                 "address": hex(elt["addr"]),
diff --git a/gdb/testsuite/gdb.dap/disassemble.c b/gdb/testsuite/gdb.dap/disassemble.c
new file mode 100644
index 00000000000..fe5795c66b3
--- /dev/null
+++ b/gdb/testsuite/gdb.dap/disassemble.c
@@ -0,0 +1,59 @@ 
+/* Copyright 2022-2023 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <math.h>
+
+/* Declare it here, so we can define it after main
+   and thus get a span of addresses across different functions
+   that we can test disassemble request against.
+   Currently we don't test this, with a better test suite, i.e.
+   anything but Tcl, it would be trivial.  But even if we
+   don't explicitly test it, we need "padding" as far as executable
+   code, so we don't land outside the .text section when disassembling*/
+int stupid_gcd (int n1, int n2);
+
+void
+code_doing_dumb_stuff ()
+{
+  int a = 10;
+  float foo = (float)a;
+}
+
+float pythagoras (double a, double b)
+{
+  double c = a*a + b*b;
+  return sqrt (c);
+}
+
+int
+main (void)
+{
+  float c = pythagoras (19, 32);
+  int foo = stupid_gcd ((int)c, 42);
+  code_doing_dumb_stuff ();
+  return foo * (int)c + 42;
+}
+
+int stupid_gcd (int n1, int n2) {
+  while(n1 != n2) {
+    if(n1 > n2)
+      n1 -= n2;
+    else
+      n2 -= n1;
+  }
+  return n1;
+}
diff --git a/gdb/testsuite/gdb.dap/disassemble.exp b/gdb/testsuite/gdb.dap/disassemble.exp
new file mode 100644
index 00000000000..b6beee611f0
--- /dev/null
+++ b/gdb/testsuite/gdb.dap/disassemble.exp
@@ -0,0 +1,102 @@ 
+# Copyright 2022-2023 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test the stop-at-main extension.
+
+require allow_dap_tests
+
+load_lib dap-support.exp
+
+standard_testfile disassemble.c
+
+if {[build_executable ${testfile}.exp $testfile $srcfile] == -1} {
+    return
+}
+
+if {[dap_launch $testfile {} {} 1] == ""} {
+    return
+}
+
+dap_check_request_and_response "start inferior" configurationDone
+# We didn't explicitly set a breakpoint, so if we hit one, it worked.
+dap_wait_for_event_and_check "stopped at function breakpoint" stopped \
+    "body reason" breakpoint
+
+set bt [lindex [dap_check_request_and_response "backtrace" stackTrace \
+		    {o threadId [i 1]}] \
+	    0]
+set pc [dict get [lindex [dict get $bt body stackFrames] 0] instructionPointerReference]
+
+set da [dap_check_request_and_response "disassemble 10 first, -5 .. 5" disassemble\
+	{o memoryReference [s $pc] instructionOffset [i -5] instructionCount [i 10] }]
+
+set ins_count [ llength [dict get [lindex $da 0] body instructions] ]
+gdb_assert { $ins_count == 10 } "expected 10 instructions"
+
+set da [dap_check_request_and_response "disassemble 11, -10 .. 1" disassemble\
+	{o memoryReference [s $pc] instructionOffset [i -10] instructionCount [i 11] }]
+
+set ins_count [ llength [dict get [lindex $da 0] body instructions] ]
+gdb_assert { $ins_count == 11 } "expected 11 instructions"
+
+set last_should_be_pc [dict get [lindex [dict get [lindex $da 0] body instructions] 10] address]
+gdb_assert { $last_should_be_pc == $pc } "expected 11th ins to be same as pc"
+
+set da20 [dap_check_request_and_response "disassemble 20, 0 .. 20" disassemble\
+	{o memoryReference [s $pc] instructionCount [i 20] }]
+
+# collect 5 disassemblies. We set the breakpoint at the pc after these 5
+# then we can compare with a request of -5, 10, if we can see these 5 instructions
+set da20_5 [dict get [lindex [dict get [lindex $da20 0] body instructions] 5] address]
+set da20_6 [dict get [lindex [dict get [lindex $da20 0] body instructions] 6] address]
+set da20_7 [dict get [lindex [dict get [lindex $da20 0] body instructions] 7] address]
+set da20_8 [dict get [lindex [dict get [lindex $da20 0] body instructions] 8] address]
+set da20_9 [dict get [lindex [dict get [lindex $da20 0] body instructions] 9] address]
+
+set bp_addr [dict get [lindex [dict get [lindex $da20 0] body instructions] 10] address]
+puts "PC IS $last_should_be_pc"
+puts "BP_ADDR IS $bp_addr"
+
+dap_check_request_and_response "set breakpoint by address" \
+	     setInstructionBreakpoints \
+	     {o breakpoints [a [o instructionReference [s $bp_addr] ]]}
+
+dap_check_request_and_response "continue to bp" continue {o threadId [i 1]}
+
+dap_wait_for_event_and_check "stopped after return" stopped \
+    "body reason" breakpoint
+
+
+set bt [lindex [dap_check_request_and_response "backtrace2" stackTrace \
+		    {o threadId [i 1]}] \
+	    0]
+set pc [dict get [lindex [dict get $bt body stackFrames] 0] instructionPointerReference]
+
+set da10 [dap_check_request_and_response "disassemble 10" disassemble\
+	{o memoryReference [s $pc] instructionOffset [i -5] instructionCount [i 10] }]
+
+set da10_0 [dict get [lindex [dict get [lindex $da10 0] body instructions] 0] address]
+set da10_1 [dict get [lindex [dict get [lindex $da10 0] body instructions] 1] address]
+set da10_2 [dict get [lindex [dict get [lindex $da10 0] body instructions] 2] address]
+set da10_3 [dict get [lindex [dict get [lindex $da10 0] body instructions] 3] address]
+set da10_4 [dict get [lindex [dict get [lindex $da10 0] body instructions] 4] address]
+
+gdb_assert { $da10_0 == $da20_5 } "expected $da10_0 == $da20_5"
+gdb_assert { $da10_1 == $da20_6 } "expected $da10_1 == $da20_6"
+gdb_assert { $da10_2 == $da20_7 } "expected $da10_2 == $da20_7"
+gdb_assert { $da10_3 == $da20_8 } "expected $da10_3 == $da20_8"
+gdb_assert { $da10_4 == $da20_9 } "expected $da10_4 == $da20_9"
+
+dap_shutdown