[RFA,7/8] Avoid crashes when stepping through ravenscar context-switching

Message ID 20190326144404.6670-8-tromey@adacore.com
State New, archived
Headers

Commit Message

Tom Tromey March 26, 2019, 2:44 p.m. UTC
  When stepping through the ravenscar context-switching code, gdb will
try to single-step the current thread, while leaving the other threads
stopped.  This results in an assertion failure in finish_step_over,
because the event thread is not the request thread.

This patch introduces a new target method so that infrun can ask the
target whether this can happen; and then changes infrun to handle this
case.  When the target can have an unexpected thread switch,
single-stepping may also switch threads.

gdb/ChangeLog
2019-03-26  Tom Tromey  <tromey@adacore.com>

	* infrun.c (finish_step_over): Check
	target_can_randomly_thread_switch.
	(handle_signal_stop): Handle random thread switches.
	* ravenscar-thread.c (struct ravenscar_thread_target)
	<can_randomly_thread_switch>: New method.
	* target-delegates.c: Rebuild.
	* target.h (struct target_ops) <can_randomly_thread_switch>:
	New method.
	(target_can_randomly_thread_switch): New define.
---
 gdb/ChangeLog          | 12 ++++++++++++
 gdb/infrun.c           | 39 ++++++++++++++++++++++++++++++++-------
 gdb/ravenscar-thread.c |  5 +++++
 gdb/target-delegates.c | 27 +++++++++++++++++++++++++++
 gdb/target.h           | 12 ++++++++++++
 5 files changed, 88 insertions(+), 7 deletions(-)
  

Comments

John Baldwin March 26, 2019, 3:44 p.m. UTC | #1
On 3/26/19 7:44 AM, Tom Tromey wrote:
> When stepping through the ravenscar context-switching code, gdb will
> try to single-step the current thread, while leaving the other threads
> stopped.  This results in an assertion failure in finish_step_over,
> because the event thread is not the request thread.
> 
> This patch introduces a new target method so that infrun can ask the
> target whether this can happen; and then changes infrun to handle this
> case.  When the target can have an unexpected thread switch,
> single-stepping may also switch threads.

I think single-stepping kernels via debug server stubs in things like
qemu can also do this, so it will be useful in those contexts as well.
I'm trying to think if the target is the right place to do this rather
than gdbarch, but I think you are right to do it in the target.  We
might want a way to forward this request to remote targets (FreeBSD's
gdb server for the kernel is a remote target that would probably want
to say that this is true).
  
Tom Tromey Sept. 27, 2019, 7:17 p.m. UTC | #2
>>>>> "Tom" == Tom Tromey <tromey@adacore.com> writes:

Tom> When stepping through the ravenscar context-switching code, gdb will
Tom> try to single-step the current thread, while leaving the other threads
Tom> stopped.  This results in an assertion failure in finish_step_over,
Tom> because the event thread is not the request thread.

Tom> This patch introduces a new target method so that infrun can ask the
Tom> target whether this can happen; and then changes infrun to handle this
Tom> case.  When the target can have an unexpected thread switch,
Tom> single-stepping may also switch threads.

It's been a while since I wrote this.  I still haven't put it in,
because it touches a tricky area.


I found out today that it can help with a different bug:

    https://sourceware.org/bugzilla/show_bug.cgi?id=22992

That bug has a test case that causes a native Windows gdb to crash.

Debugging it, I found that (I think) there's an oversight in
windows-nat.c -- namely, stepping doesn't suspend other threads.
However, this oversight doesn't exist in gdbserver, and the bug still
exists when using that.

I am not 100% sure, but I think this bug shows a Windows kernel problem
or the like, as even with thread suspension hacked in, gdb can get a
debug event on an ostensibly suspended thread.

Changing windows-nat.c to report that it is a "random thread switch"
target (which is introduced by the patch I'm replying to) at least
causes the crash to go away... so I think that's an improvement, even
though it isn't fantastic.

So, if you have some time and understand infrun, I'd appreciate it if
you could take a look at this patch; or if you understand Windows
debugging, if you could take a look at 22992 and see if I've missed
something.

thanks,
Tom
  

Patch

diff --git a/gdb/infrun.c b/gdb/infrun.c
index ad7892105a4..bfc909415c5 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -5328,7 +5328,8 @@  finish_step_over (struct execution_control_state *ecs)
       /* If we're stepping over a breakpoint with all threads locked,
 	 then only the thread that was stepped should be reporting
 	 back an event.  */
-      gdb_assert (ecs->event_thread->control.trap_expected);
+      gdb_assert (ecs->event_thread->control.trap_expected
+		  || target_can_randomly_thread_switch ());
 
       clear_step_over_info ();
     }
@@ -5567,14 +5568,38 @@  handle_signal_stop (struct execution_control_state *ecs)
 	{
 	  if (single_step_breakpoint_inserted_here_p (aspace, pc))
 	    {
-	      if (debug_infrun)
+	      if (target_can_randomly_thread_switch ())
 		{
-		  fprintf_unfiltered (gdb_stdlog,
-				      "infrun: [%s] hit another thread's "
-				      "single-step breakpoint\n",
-				      target_pid_to_str (ecs->ptid).c_str ());
+		  if (debug_infrun)
+		    fprintf_unfiltered (gdb_stdlog,
+					"infrun: [%s] random thread switch\n",
+					(target_pid_to_str
+					 (ecs->ptid).c_str ()));
+
+		  /* Here the step was intended for some other thread;
+		     but instead we saw a context switch.  This can
+		     happen with "green" threads in Ravenscar, and the
+		     recourse is to pretend we wanted to step the new
+		     thread and got an ordinary stop there.  */
+		  for (thread_info *th : all_threads ())
+		    if (th->control.trap_expected)
+		      {
+			std::swap (th->control, ecs->event_thread->control);
+			break;
+		      }
+		}
+	      else
+		{
+		  if (debug_infrun)
+		    {
+		      fprintf_unfiltered (gdb_stdlog,
+					  "infrun: [%s] hit another thread's "
+					  "single-step breakpoint\n",
+					  (target_pid_to_str
+					   (ecs->ptid).c_str ()));
+		    }
+		  ecs->hit_singlestep_breakpoint = 1;
 		}
-	      ecs->hit_singlestep_breakpoint = 1;
 	    }
 	}
       else
diff --git a/gdb/ravenscar-thread.c b/gdb/ravenscar-thread.c
index f36ee3ea240..4e35026e6e2 100644
--- a/gdb/ravenscar-thread.c
+++ b/gdb/ravenscar-thread.c
@@ -89,6 +89,11 @@  struct ravenscar_thread_target final : public target_ops
   ptid_t wait (ptid_t, struct target_waitstatus *, int) override;
   void resume (ptid_t, int, enum gdb_signal) override;
 
+  bool can_randomly_thread_switch () override
+  {
+    return true;
+  }
+
   void fetch_registers (struct regcache *, int) override;
   void store_registers (struct regcache *, int) override;
 
diff --git a/gdb/target-delegates.c b/gdb/target-delegates.c
index 3654f02e63b..8c0d135d010 100644
--- a/gdb/target-delegates.c
+++ b/gdb/target-delegates.c
@@ -19,6 +19,7 @@  struct dummy_target : public target_ops
   void fetch_registers (struct regcache *arg0, int arg1) override;
   void store_registers (struct regcache *arg0, int arg1) override;
   void prepare_to_store (struct regcache *arg0) override;
+  bool can_randomly_thread_switch () override;
   void files_info () override;
   int insert_breakpoint (struct gdbarch *arg0, struct bp_target_info *arg1) override;
   int remove_breakpoint (struct gdbarch *arg0, struct bp_target_info *arg1, enum remove_bp_reason arg2) override;
@@ -186,6 +187,7 @@  struct debug_target : public target_ops
   void fetch_registers (struct regcache *arg0, int arg1) override;
   void store_registers (struct regcache *arg0, int arg1) override;
   void prepare_to_store (struct regcache *arg0) override;
+  bool can_randomly_thread_switch () override;
   void files_info () override;
   int insert_breakpoint (struct gdbarch *arg0, struct bp_target_info *arg1) override;
   int remove_breakpoint (struct gdbarch *arg0, struct bp_target_info *arg1, enum remove_bp_reason arg2) override;
@@ -551,6 +553,31 @@  debug_target::prepare_to_store (struct regcache *arg0)
   fputs_unfiltered (")\n", gdb_stdlog);
 }
 
+bool
+target_ops::can_randomly_thread_switch ()
+{
+  return this->beneath ()->can_randomly_thread_switch ();
+}
+
+bool
+dummy_target::can_randomly_thread_switch ()
+{
+  return false;
+}
+
+bool
+debug_target::can_randomly_thread_switch ()
+{
+  bool result;
+  fprintf_unfiltered (gdb_stdlog, "-> %s->can_randomly_thread_switch (...)\n", this->beneath ()->shortname ());
+  result = this->beneath ()->can_randomly_thread_switch ();
+  fprintf_unfiltered (gdb_stdlog, "<- %s->can_randomly_thread_switch (", this->beneath ()->shortname ());
+  fputs_unfiltered (") = ", gdb_stdlog);
+  target_debug_print_bool (result);
+  fputs_unfiltered ("\n", gdb_stdlog);
+  return result;
+}
+
 void
 target_ops::files_info ()
 {
diff --git a/gdb/target.h b/gdb/target.h
index 4f7a43e1a8b..ad3a8be4c18 100644
--- a/gdb/target.h
+++ b/gdb/target.h
@@ -491,6 +491,13 @@  struct target_ops
     virtual void prepare_to_store (struct regcache *)
       TARGET_DEFAULT_NORETURN (noprocess ());
 
+    /* True if single-stepping on this target can cause an
+       un-asked-for thread switch.  This is normal on "green" thread
+       targets, like Ravenscar, when single-stepping through the
+       context-switching code.  */
+    virtual bool can_randomly_thread_switch ()
+      TARGET_DEFAULT_RETURN (false);
+
     virtual void files_info ()
       TARGET_DEFAULT_IGNORE ();
     virtual int insert_breakpoint (struct gdbarch *,
@@ -1402,6 +1409,11 @@  extern void target_resume (ptid_t ptid, int step, enum gdb_signal signal);
    coalesce multiple resumption requests in a single vCont packet.  */
 extern void target_commit_resume ();
 
+/* A wrapper for target_ops::can_randomly_thread_switch.  */
+
+#define target_can_randomly_thread_switch()			\
+  (current_top_target ()->can_randomly_thread_switch) ()
+
 /* Setup to defer target_commit_resume calls, and reactivate
    target_commit_resume on destruction, if it was previously
    active.  */