Remove stale breakpoint step-over information

Message ID alpine.LFD.2.21.1910182100030.2438@redsun52.ssa.fujisawa.hgst.com
State Superseded
Headers

Commit Message

Maciej W. Rozycki Oct. 19, 2019, 7:10 p.m. UTC
  Fix an issue with the `step_over_info' structure introduced with commit 
31e77af205cf ("PR breakpoints/7143 - Watchpoint does not trigger when 
first set"), where information stored in there is not invalidated in a 
remote debug scenario where software stepping in the all-stop mode is 
forcibly interrupted and the target connection then closed.  As a result 
a subsequent target connection triggers an assertion failure on 
`ecs->event_thread->control.trap_expected' and cannot be successfully 
established, making the particular instance of GDB unusable.

This was observed with a `riscv64-linux-gnu' cross-GDB and RISC-V/Linux 
`gdbserver' being developed and an attempt to step over the `ret' aka 
`c.jr ra' instruction where the value held in the `ra' aka `x1' register 
and therefore the address of a software-stepping breakpoint to insert is 
0, as follows:

(gdb) target remote 1.2.3.4:56789
Remote debugging using 1.2.3.4:56789
warning: while parsing target description (at line 4): Target description specified unknown architecture "riscv:rv64id"
warning: Could not load XML target description; ignoring
Reading symbols from .../sysroot/lib/ld-linux-riscv64-lp64d.so.1...
0x0000001555556ef0 in _start ()
   from .../sysroot/lib/ld-linux-riscv64-lp64d.so.1
(gdb) break main
Breakpoint 1 at 0x1643c
(gdb) continue
Continuing.
Cannot access memory at address 0x0
(gdb) x /i $pc
=> 0x15555607b8 <__GI__dl_debug_state>: ret
(gdb) print /x $ra
$1 = 0x0
(gdb) stepi
^C^CInterrupted while waiting for the program.
Give up waiting? (y or n) y
Quit
(gdb) kill
Kill the program being debugged? (y or n) y
[Inferior 1 (process 8964) killed]
(gdb) target remote 1.2.3.4:56789
Remote debugging using 1.2.3.4:56789
warning: while parsing target description (at line 4): Target description specified unknown architecture "riscv:rv64id"
warning: Could not load XML target description; ignoring
.../gdb/infrun.c:5299: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) y

This is a bug, please report it.  For instructions, see:
<http://www.gnu.org/software/gdb/bugs/>.

.../gdb/infrun.c:5299: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n) n
Command aborted.
(gdb) 

(the value of `ra' being considered 0 may well have been caused by GDB's 
inability to accept the XML target description supplied by `gdbserver', 
but that is merely one way to trigger the assertion failure due to an 
interrupted single step).

Correct the issue by making a call to clear global breakpoint step-over 
information from `exit_inferior_1', which is where we already do all 
kinds of similar clean-ups, e.g. `delete_thread' called from there 
clears per-thread step-over information.

	gdb/
	* infrun.h (clear_step_over_info): New prototype.
	* infrun.c (clear_step_over_info): Make external.
	* inferior.c (exit_inferior_1): Call `clear_step_over_info'.
---
Hi,

 Here's a record of a remote debug session leading to the issue:

Sending packet: $vCont?#49...Packet received: vCont;c;C;t
Packet vCont (verbose-resume) is supported
Sending packet: $vCont;c:p23cb.-1#08...Packet received: T05swbreak:;02:20da080000000000;20:b807565515000000;thread:p23cb.23cb;core:3;
Sending packet: $qXfer:libraries-svr4:read::0,1000#20...Packet received: l<library-list-svr4 version="1.0" main-lm="0x155556f160"><library name=".../sysroot/lib/ld-linux-riscv64-lp64d.so.1" lm="0x155556ea18" l_addr="0x1555556000" l_ld="0x155556de90"/><library name="linux-vdso.so.1" lm="0x155556f6f0" l_addr="0x1555570000" l_ld="0x1555570350"/></library-list-svr4>
Sending packet: $X15555607b8,2:\202\200#2e...Packet received: OK
Sending packet: $m15555607b8,2#07...Packet received: 8280
Sending packet: $m15555607b8,2#07...Packet received: 8280
Sending packet: $g#67...Packet received: 0000000000000000000000000000000020da08000000000000000000000000000000000000000000d0ae040000000000696e74000000000000000000000000000100000000000000020000000000000080d708000000000000000000000000000c0000000000000000000000000000000000000000000000000000000000000000e408000000000020e908000000000000000000000000000000000000000000d0ae04000000000072697363765f646f00000000000000000100000000000000030000000000000000e408000000000020f208000000000000000000000000000000000000000000d0ae04000000000072697363765f646f0000000000000000[552 bytes omitted]
Sending packet: $m0,40#2d...Packet received: E01
Sending packet: $m0,1#fa...Packet received: E01
Sending packet: $m0,40#2d...Packet received: E01
Sending packet: $m0,1#fa...Packet received: E01
Cannot access memory at address 0x0
(gdb) stepi
Sending packet: $m15555607b8,4#09...Packet received: 82802a87
Sending packet: $m15555607b4,4#05...Packet received: 01452dbd
^Cremote_pass_ctrlc called
remote_interrupt called
^Cremote_pass_ctrlc called
Interrupted while waiting for the program.
Give up waiting? (y or n) y
Quit
(gdb) 

As you can see the explicit `stepi' command does something suspicious and 
does not even try to issue a `vCont' packet once the insertion of a 
software stepping breakpoint has failed, which in turn means the remote 
target remains halted and therefore a stop reply is never issued.  I am 
going to investigate that next.

 Also, of course, I will make GDB accept the architectures produced by our 
own RISC-V XML target description generation code before RISC-V/Linux 
`gdbserver' is submitted.

 For completeness this change was natively regression-tested with the 
`x86_64-linux-gnu' configuration, although I'd consider it obviously 
harmless and only relevant to the affected scenarios, which, in turn, 
cannot be triggered in regression testing without a controlled way to 
induce incorrect behaviour (which could actually be the objective of an 
interesting project).

 OK to apply?

  Maciej
---
 gdb/inferior.c |    3 +++
 gdb/infrun.c   |    5 ++---
 gdb/infrun.h   |    4 ++++
 3 files changed, 9 insertions(+), 3 deletions(-)

gdb-clear-step-over-info.diff
  

Comments

Maciej W. Rozycki Oct. 29, 2019, 9:53 p.m. UTC | #1
On Sat, 19 Oct 2019, Maciej W. Rozycki wrote:

> Fix an issue with the `step_over_info' structure introduced with commit 
> 31e77af205cf ("PR breakpoints/7143 - Watchpoint does not trigger when 
> first set"), where information stored in there is not invalidated in a 
> remote debug scenario where software stepping in the all-stop mode is 
> forcibly interrupted and the target connection then closed.  As a result 
> a subsequent target connection triggers an assertion failure on 
> `ecs->event_thread->control.trap_expected' and cannot be successfully 
> established, making the particular instance of GDB unusable.

 Ping for:

<https://sourceware.org/ml/gdb-patches/2019-10/msg00659.html>

  Maciej
  
Pedro Alves Oct. 31, 2019, 5:46 p.m. UTC | #2
On 10/19/19 8:10 PM, Maciej W. Rozycki wrote:

> Correct the issue by making a call to clear global breakpoint step-over 
> information from `exit_inferior_1', which is where we already do all 
> kinds of similar clean-ups, e.g. `delete_thread' called from there 
> clears per-thread step-over information.
> 

This looks like a fragile place to clear this, considering
multiprocess.  I.e., what if we're calling exit_inferior for some
inferior other than the one that was stepping.

The step over information is associated with a thread.
I think it'd be better to clear the step over information
when the corresponding thread is deleted.

So something like adding a thread_info parameter to
clear_step_over_info, and then calling clear_step_over_info
from clear_thread_inferior_resources.  clear_step_over_info
would only clear the info if the thread matched, or if NULL
is passed.  Would that work?

Thanks,
Pedro Alves
  

Patch

Index: binutils-gdb/gdb/inferior.c
===================================================================
--- binutils-gdb.orig/gdb/inferior.c
+++ binutils-gdb/gdb/inferior.c
@@ -209,6 +209,9 @@  exit_inferior_1 (struct inferior *inftoe
   /* Reset it.  */
   inf->control = inferior_control_state (NO_STOP_QUIETLY);
 
+  /* Remove any stale breakpoint step-over information.  */
+  clear_step_over_info ();
+
   /* Clear the register cache and the frame cache.  */
   registers_changed ();
   reinit_frame_cache ();
Index: binutils-gdb/gdb/infrun.c
===================================================================
--- binutils-gdb.orig/gdb/infrun.c
+++ binutils-gdb/gdb/infrun.c
@@ -1330,10 +1330,9 @@  set_step_over_info (const address_space
   step_over_info.thread = thread;
 }
 
-/* Called when we're not longer stepping over a breakpoint / an
-   instruction, so all breakpoints are free to be (re)inserted.  */
+/* See infrun.h.  */
 
-static void
+void
 clear_step_over_info (void)
 {
   if (debug_infrun)
Index: binutils-gdb/gdb/infrun.h
===================================================================
--- binutils-gdb.orig/gdb/infrun.h
+++ binutils-gdb/gdb/infrun.h
@@ -120,6 +120,10 @@  extern void insert_step_resume_breakpoin
 						  struct symtab_and_line ,
 						  struct frame_id);
 
+/* Called when we're not longer stepping over a breakpoint / an
+   instruction, so all breakpoints are free to be (re)inserted.  */
+void clear_step_over_info (void);
+
 /* Returns true if we're trying to step past the instruction at
    ADDRESS in ASPACE.  */
 extern int stepping_past_instruction_at (struct address_space *aspace,