[9/9] gdb/testsuite/dap: fix gdb.dap/basic-dap.exp disassembly test for PIE

Message ID 20230106185729.42372-10-simon.marchi@efficios.com
State Committed
Commit d4c4ea75838091b6bf52d97208bd2fa4021951db
Headers
Series Fix gdb.dap/basic-dap.exp for PIE |

Commit Message

Simon Marchi Jan. 6, 2023, 6:57 p.m. UTC
  Prior to this patch, I get:

    >>> {"seq": 17, "type": "request", "command": "disassemble", "arguments": {"memoryReference": "0x115d", "instructionCount": 1}}
    Content-Length: 147

    {"request_seq": 17, "type": "response", "command": "disassemble", "success": false, "message": "Cannot access memory at address 0x115d", "seq": 41}FAIL: gdb.dap/basic-dap.exp: disassemble one instruction success
    FAIL: gdb.dap/basic-dap.exp: instructions in disassemble output

The problem is that the PC to disassemble is taken from the breakpoint
insertion response, which happens before running.  With a PIE
executable, that PC is unrelocated, but the disassembly request happens
after relocation.

I chose to fix this by watching for a breakpoint changed event giving
the new breakpoint address, and recording the address from there.  I
think this is an interesting way to fix it, because it adds a bit of
test coverage, I don't think these events are checked right now.

Other ways to fix it would be:

 - Get the address by doing a breakpoint insertion after the program is
   started, or some other way.
 - Do the disassembly by symbol instead of by address.
 - Do the disassembly before running the program.

Change-Id: I3c396f796ac4c8b22e7dfd2fa1c5467f7a47e84e
---
 gdb/testsuite/gdb.dap/basic-dap.exp | 33 ++++++++++++++++++++++++++---
 1 file changed, 30 insertions(+), 3 deletions(-)
  

Comments

Tom Tromey Jan. 25, 2023, 10:10 p.m. UTC | #1
>>>>> "Simon" == Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:

>>>> {... {"memoryReference": "0x115d"

Simon> The problem is that the PC to disassemble is taken from the breakpoint
Simon> insertion response, which happens before running.  With a PIE
Simon> executable, that PC is unrelocated, but the disassembly request happens
Simon> after relocation.

This is an odd one.  It isn't super clear when memory references are
invalidated.  (The spec doesn't even define what a memory reference is.)

Simon> I chose to fix this by watching for a breakpoint changed event giving
Simon> the new breakpoint address, and recording the address from there.  I
Simon> think this is an interesting way to fix it, because it adds a bit of
Simon> test coverage, I don't think these events are checked right now.

This seems pretty good to me.

Simon>  - Do the disassembly by symbol instead of by address.

I don't think this is possible in DAP.  One of the many holes.

Tom
  
Simon Marchi Jan. 26, 2023, 3:40 a.m. UTC | #2
On 1/25/23 17:10, Tom Tromey wrote:
>>>>>> "Simon" == Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
>>>>> {... {"memoryReference": "0x115d"
> 
> Simon> The problem is that the PC to disassemble is taken from the breakpoint
> Simon> insertion response, which happens before running.  With a PIE
> Simon> executable, that PC is unrelocated, but the disassembly request happens
> Simon> after relocation.
> 
> This is an odd one.  It isn't super clear when memory references are
> invalidated.  (The spec doesn't even define what a memory reference is.)

I'm not sure what you mean.  The same could happen in a regular GDB
test, if you grabbed a function's address, started the program, and
tried to disassemble at that address.  I don't expect GDB to be able to
do anything with the unrelocated address obtained before running.

Since there isn't a DAP event that says when symbols have changed
(AFAIK), I think DAP clients have to assume that any symbol address may
be invalid after resuming the program.

> Simon> I chose to fix this by watching for a breakpoint changed event giving
> Simon> the new breakpoint address, and recording the address from there.  I
> Simon> think this is an interesting way to fix it, because it adds a bit of
> Simon> test coverage, I don't think these events are checked right now.
> 
> This seems pretty good to me.
> 
> Simon>  - Do the disassembly by symbol instead of by address.
> 
> I don't think this is possible in DAP.  One of the many holes.

When I saw this in the DisassembleArguments DAP structure:

  /**
   * Memory reference to the base location containing the instructions to
   * disassemble.
   */
  memoryReference: string;

I assumed that memoryReference could be pretty much anything that
resolves to an address, essentially the same thing you could pass to
GDB's disassemble command.  But that's just my interpretation of it.

Simon
  
Tom Tromey Jan. 26, 2023, 3:08 p.m. UTC | #3
>> This is an odd one.  It isn't super clear when memory references are
>> invalidated.  (The spec doesn't even define what a memory reference is.)

Simon> I'm not sure what you mean.

Actually I used the wrong word there.  In this case the term is really
"instruction reference"

Anyway, what I mean is that the DAP spec refers to "instruction
reference" without defining it.  So for example the response to a
setBreakpoints request is essentially an array of Breakpoint, which is
documented as:

  /**
   * A memory reference to where the breakpoint is set.
   */
  instructionReference?: string;

However, what are the semantics of this?  The spec does not say.  So, we
don't know what values might be valid (and pragmatically I think a
client has to assume it must treat them as opaque cookies).  When are
they invalidated, or are they assumed to be valid for the lifetime of
the inferior?

Maybe gdb could send an InvalidatedEvent but even this is pretty vague.

Simon> Since there isn't a DAP event that says when symbols have changed
Simon> (AFAIK), I think DAP clients have to assume that any symbol address may
Simon> be invalid after resuming the program.

I agree, though it's unwritten.

>> I don't think this is possible in DAP.  One of the many holes.

Simon> When I saw this in the DisassembleArguments DAP structure:

Simon>   /**
Simon>    * Memory reference to the base location containing the instructions to
Simon>    * disassemble.
Simon>    */
Simon>   memoryReference: string;

Simon> I assumed that memoryReference could be pretty much anything that
Simon> resolves to an address, essentially the same thing you could pass to
Simon> GDB's disassemble command.  But that's just my interpretation of it.

This would be convenient and perhaps it is what gdb ought to do, but I
don't see any text in the spec supporting this approach, so unless a
client specifically knows it will only talk to gdb, it seems risky for
them to use it.  And of course if you're specifically talking only to
gdb, you might as well use MI, which despite its flaws has solved all
the problems that DAP still has.

Probably what would be better for gdb is something that uses the DAP
transport -- JSON-RPC is just amazingly easy to work with -- but with MI
requests as the high-level protocol.  Or, that would have been better in
some alternate universe anyway.

Anyway I imagine the route forward is to file a bunch of DAP bugs and
try to get them resolved; though I haven't seen much progress on, e.g.,
the multiple location bug.

Tom
  
Simon Marchi Jan. 26, 2023, 8:21 p.m. UTC | #4
On 1/26/23 10:08, Tom Tromey wrote:
>>> This is an odd one.  It isn't super clear when memory references are
>>> invalidated.  (The spec doesn't even define what a memory reference is.)
> 
> Simon> I'm not sure what you mean.
> 
> Actually I used the wrong word there.  In this case the term is really
> "instruction reference"
> 
> Anyway, what I mean is that the DAP spec refers to "instruction
> reference" without defining it.  So for example the response to a
> setBreakpoints request is essentially an array of Breakpoint, which is
> documented as:
> 
>   /**
>    * A memory reference to where the breakpoint is set.
>    */
>   instructionReference?: string;
> 
> However, what are the semantics of this?  The spec does not say.  So, we
> don't know what values might be valid (and pragmatically I think a
> client has to assume it must treat them as opaque cookies).  When are
> they invalidated, or are they assumed to be valid for the lifetime of
> the inferior?
> 
> Maybe gdb could send an InvalidatedEvent but even this is pretty vague.

Ok, thanks for the explanation.

Well, the client gets an initial instructionReference for the
breakpoint.  But then GDB sends a "breakpoint changed" event for that
same breakpoint, with a new instructionReference.  That sounds like a
good way to tell the client "your old instructionReference for that
breakpoint is not longer valid".

> 
> Simon> Since there isn't a DAP event that says when symbols have changed
> Simon> (AFAIK), I think DAP clients have to assume that any symbol address may
> Simon> be invalid after resuming the program.
> 
> I agree, though it's unwritten.
> 
>>> I don't think this is possible in DAP.  One of the many holes.
> 
> Simon> When I saw this in the DisassembleArguments DAP structure:
> 
> Simon>   /**
> Simon>    * Memory reference to the base location containing the instructions to
> Simon>    * disassemble.
> Simon>    */
> Simon>   memoryReference: string;
> 
> Simon> I assumed that memoryReference could be pretty much anything that
> Simon> resolves to an address, essentially the same thing you could pass to
> Simon> GDB's disassemble command.  But that's just my interpretation of it.
> 
> This would be convenient and perhaps it is what gdb ought to do, but I
> don't see any text in the spec supporting this approach, so unless a
> client specifically knows it will only talk to gdb, it seems risky for
> them to use it.  And of course if you're specifically talking only to
> gdb, you might as well use MI, which despite its flaws has solved all
> the problems that DAP still has.
> 
> Probably what would be better for gdb is something that uses the DAP
> transport -- JSON-RPC is just amazingly easy to work with -- but with MI
> requests as the high-level protocol.  Or, that would have been better in
> some alternate universe anyway.

I would support changing MI to use JSON-RPC or similar.

At least, if the output could be JSON, rather than the current output
that kinda looks like JSON but isn't, it would be nice.

But even command input would benefit from having a better structure.  I
don't recall them at the moment (because I haven't worked on a frontend
for a long time), but there are some inconsistencies between commands,
how things should be quoted, escaped, etc.

> Anyway I imagine the route forward is to file a bunch of DAP bugs and
> try to get them resolved; though I haven't seen much progress on, e.g.,
> the multiple location bug.

It will happen when somebody at Microsoft needs it, I suppose.

Simon
  

Patch

diff --git a/gdb/testsuite/gdb.dap/basic-dap.exp b/gdb/testsuite/gdb.dap/basic-dap.exp
index 795702a5a8f..5303e943320 100644
--- a/gdb/testsuite/gdb.dap/basic-dap.exp
+++ b/gdb/testsuite/gdb.dap/basic-dap.exp
@@ -90,9 +90,36 @@  set insn_pc [dict get [lindex $bplist 0] instructionReference]
 dap_check_request_and_response "start inferior" configurationDone
 dap_wait_for_event_and_check "inferior started" thread "body reason" started
 
-dap_wait_for_event_and_check "stopped at function breakpoint" stopped \
-    "body reason" breakpoint \
-    "body hitBreakpointIds" $fn_bpno
+# While waiting for the stopped event, we might receive breakpoint changed
+# events indicating some breakpoint addresses were relocated.  Update INSN_PC
+# if necessary.
+lassign [dap_wait_for_event_and_check "stopped at function breakpoint" stopped \
+	    "body reason" breakpoint \
+	    "body hitBreakpointIds" $fn_bpno] unused objs
+foreach obj $objs {
+    if { [dict get $obj "type"] != "event" } {
+	continue
+    }
+
+    if { [dict get $obj "event"] != "breakpoint" } {
+	continue
+    }
+
+    set body [dict get $obj "body"]
+
+    if { [dict get $body "reason"] != "changed" } {
+	continue
+    }
+
+    set breakpoint [dict get $body "breakpoint"]
+    set breakpoint_id [dict get $breakpoint "id"]
+
+    if { $breakpoint_id != $insn_bpno } {
+	continue
+    }
+
+    set insn_pc [dict get $breakpoint "instructionReference"]
+}
 
 set obj [dap_check_request_and_response "evaluate global in function" \
 	     evaluate {o expression [s global_variable]}]