gdb/Windows: Fix detach while running

Message ID 20240411200356.270360-1-pedro@palves.net
State New
Headers
Series gdb/Windows: Fix detach while running |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm fail Testing failed

Commit Message

Pedro Alves April 11, 2024, 8:03 p.m. UTC
  While testing a WIP Cygwin GDB that supports non-stop, I noticed that
gdb.threads/attach-non-stop.exp exposes that this:

 (gdb) attach PID&
 ...
 (gdb) detach

... hangs.

And it turns out that it hangs in all-stop as well.  This commits
fixes that.

After "attach &", the target is set running, we've called
ContinueDebugEvent and the process_thread thread is waiting for
WaitForDebugEvent events.  It is the equivalent of "attach; c&".

In windows_nat_target::detach, the first thing we do is
unconditionally call windows_continue (for ContinueDebugEvent), which
blocks in do_synchronously, until the process_thread sees an event out
of WaitForDebugEvent.  Unless the inferior happens to run into a
breakpoint, etc., then this hangs indefinitely.

If we've already called ContinueDebugEvent earlier, then we shouldn't
be calling it again in ::detach.

Still in windows_nat_target::detach, we have an interesting issue that
ends up being the bulk of the patch -- only the process_thread thread
can call DebugActiveProcessStop, but if it is blocked in
WaitForDebugEvent, we need to somehow force it to break out of it.
The only way to do that, is to force the inferior to do something that
causes WaitForDebugEvent to return some event.

This patch uses CreateRemoteThread to do it, which results in
WaitForDebugEvent reporting CREATE_THREAD_DEBUG_EVENT.  We then
terminate the injected thread before it has a chance to run any
userspace code.

Note that Win32 functions like DebugBreakProcess and
GenerateConsoleCtrlEvent would also inject a new thread in the
inferior.  I first used DebugBreakProcess, but that is actually more
complicated to use, because we'd have to be sure to consume the
breakpoint event before detaching, otherwise the inferior would likely
die due a breakpoint exception being raised with no debugger around to
intercept it.

See the new break_out_process_thread method.

So the fix has two parts:

 - Keep track of whether we've called ContinueDebugEvent and the
   process_thread thread is waiting for events, or whether
   WaitForDebugEvent already returned an event.

 - In windows_nat_target::detach, if the process_thread thread is
   waiting for events, unblock out of its WaitForDebugEvent, before
   proceeding with the actual detach.

New test included.  Passes cleanly on GNU/Linux native and gdbserver,
and also passes cleanly on Cygwin, with the fix.  Before the fix, it
would hang and fail with a timeout.

Change-Id: Ifb91c58c08af1a9bcbafecedc93dfce001040905
---
 .../gdb.threads/detach-while-running.c        |  30 ++++
 .../gdb.threads/detach-while-running.exp      |  77 ++++++++++
 gdb/windows-nat.c                             | 131 ++++++++++++++++--
 3 files changed, 230 insertions(+), 8 deletions(-)
 create mode 100644 gdb/testsuite/gdb.threads/detach-while-running.c
 create mode 100644 gdb/testsuite/gdb.threads/detach-while-running.exp


base-commit: 02d02fc7924992ddd98073b95810b957efdc421a
  

Comments

Hannes Domani April 12, 2024, 10:56 a.m. UTC | #1
Am Donnerstag, 11. April 2024 um 22:04:34 MESZ hat Pedro Alves <pedro@palves.net> Folgendes geschrieben:

> While testing a WIP Cygwin GDB that supports non-stop, I noticed that
> gdb.threads/attach-non-stop.exp exposes that this:
>
> (gdb) attach PID&
> ...
> (gdb) detach
>
> ... hangs.
>
> And it turns out that it hangs in all-stop as well.  This commits
> fixes that.
>
> After "attach &", the target is set running, we've called
> ContinueDebugEvent and the process_thread thread is waiting for
> WaitForDebugEvent events.  It is the equivalent of "attach; c&".
>
> In windows_nat_target::detach, the first thing we do is
> unconditionally call windows_continue (for ContinueDebugEvent), which
> blocks in do_synchronously, until the process_thread sees an event out
> of WaitForDebugEvent.  Unless the inferior happens to run into a
> breakpoint, etc., then this hangs indefinitely.
>
> If we've already called ContinueDebugEvent earlier, then we shouldn't
> be calling it again in ::detach.
>
> Still in windows_nat_target::detach, we have an interesting issue that
> ends up being the bulk of the patch -- only the process_thread thread
> can call DebugActiveProcessStop, but if it is blocked in
> WaitForDebugEvent, we need to somehow force it to break out of it.
> The only way to do that, is to force the inferior to do something that
> causes WaitForDebugEvent to return some event.
>
> This patch uses CreateRemoteThread to do it, which results in
> WaitForDebugEvent reporting CREATE_THREAD_DEBUG_EVENT.  We then
> terminate the injected thread before it has a chance to run any
> userspace code.
>
> Note that Win32 functions like DebugBreakProcess and
> GenerateConsoleCtrlEvent would also inject a new thread in the
> inferior.  I first used DebugBreakProcess, but that is actually more
> complicated to use, because we'd have to be sure to consume the
> breakpoint event before detaching, otherwise the inferior would likely
> die due a breakpoint exception being raised with no debugger around to
> intercept it.
>
> See the new break_out_process_thread method.
>
> So the fix has two parts:
>
> - Keep track of whether we've called ContinueDebugEvent and the
>   process_thread thread is waiting for events, or whether
>   WaitForDebugEvent already returned an event.
>
> - In windows_nat_target::detach, if the process_thread thread is
>   waiting for events, unblock out of its WaitForDebugEvent, before
>   proceeding with the actual detach.
>
> New test included.  Passes cleanly on GNU/Linux native and gdbserver,
> and also passes cleanly on Cygwin, with the fix.  Before the fix, it
> would hang and fail with a timeout.

I tried this, and it works for me too for Windows native.

Tested-By: Hannes Domani <ssbssa@yahoo.de>

I have 2 comments below.


> Change-Id: Ifb91c58c08af1a9bcbafecedc93dfce001040905
> ---
> .../gdb.threads/detach-while-running.c        |  30 ++++
> .../gdb.threads/detach-while-running.exp      |  77 ++++++++++
> gdb/windows-nat.c                            | 131 ++++++++++++++++--
> 3 files changed, 230 insertions(+), 8 deletions(-)
> create mode 100644 gdb/testsuite/gdb.threads/detach-while-running.c
> create mode 100644 gdb/testsuite/gdb.threads/detach-while-running.exp
>
> diff --git a/gdb/testsuite/gdb.threads/detach-while-running.c b/gdb/testsuite/gdb.threads/detach-while-running.c
> new file mode 100644
> index 00000000000..19cc3b5761c
> --- /dev/null
> +++ b/gdb/testsuite/gdb.threads/detach-while-running.c
> @@ -0,0 +1,30 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +  Copyright 2024 Free Software Foundation, Inc.
> +
> +  This program is free software; you can redistribute it and/or modify
> +  it under the terms of the GNU General Public License as published by
> +  the Free Software Foundation; either version 3 of the License, or
> +  (at your option) any later version.
> +
> +  This program is distributed in the hope that it will be useful,
> +  but WITHOUT ANY WARRANTY; without even the implied warranty of
> +  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +  GNU General Public License for more details.
> +
> +  You should have received a copy of the GNU General Public License
> +  along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#include <assert.h>
> +#include <unistd.h>
> +
> +int
> +main (int argc, char **argv)
> +{
> +  alarm (30);
> +
> +  while (1)
> +    sleep (1);
> +
> +  return 0;
> +}

The test passes for me, but I had to #ifdef the alarm() function out,
since it doesn't exist on Windows.


> diff --git a/gdb/windows-nat.c b/gdb/windows-nat.c
> index b123a66ef0f..7e571f281cb 100644
> --- a/gdb/windows-nat.c
> +++ b/gdb/windows-nat.c
> @@ -2072,20 +2089,118 @@ windows_nat_target::attach (const char *args, int from_tty)
>   target_terminal::ours ();
> }
>
> +void
> +windows_nat_target::break_out_process_thread (bool &process_alive)
> +{
> +  /* This is called when the process_thread thread is blocked in
> +    WaitForDebugEvent (unless it already returned some event we
> +    haven't consumed yet), and we need to unblock it so that we can
> +    have it call DebugActiveProcessStop.
> +
> +    To make WaitForDebugEvent return, we need to force some event in
> +    the inferior.  Any method that lets us do that (without
> +    disturbing the other threads), injects a new thread in the
> +    inferior.
> +
> +    We don't use DebugBreakProcess for this, because that injects a
> +    thread that ends up executing a breakpoint instruction.  We can't
> +    let the injected thread hit that breakpoint _after_ we've
> +    detached.  Consuming events until we see a breakpoint trap isn't
> +    100% reliable, because we can't distinguish it from some other
> +    thread itself deciding to call int3 while we're detaching, unless
> +    we temporarily suspend all threads.  It's just a lot of
> +    complication, and there's an easier way.
> +
> +    Important observation: the thread creation event for the newly
> +    injected thread is sufficient to unblock WaitForDebugEvent.
> +
> +    Instead of DebugBreakProcess, we can instead use
> +    CreateRemoteThread to control the code that the injected thread
> +    runs ourselves.  We could consider pointing the injected thread
> +    at some side-effect-free Win32 function as entry point.  However,
> +    finding the address of such a function requires having at least
> +    minimal symbols loaded for ntdll.dll.  Having a way that avoids
> +    that is better, so that detach always works correctly even when
> +    we don't have any symbols loaded.
> +
> +    So what we do is inject a thread that doesn't actually run ANY
> +    userspace code, because we force-terminate it as soon as we see
> +    its corresponding thread creation event.  CreateRemoteThread
> +    gives us the new thread's ID, which we can match with the thread
> +    associated with the CREATE_THREAD_DEBUG_EVENT event.  */
> +
> +  DWORD injected_thread_id = 0;
> +  HANDLE injected_thread_handle
> +    = CreateRemoteThread (windows_process.handle, NULL,
> +              0, (LPTHREAD_START_ROUTINE) 0,
> +              NULL, 0, &injected_thread_id);
> +
> +  process_alive = true;
> +
> +  /* At this point, the user has declared that they want to detach, so
> +    any event that happens from this point on should be forwarded to
> +    the inferior.  */
> +
> +  for (;;)
> +    {
> +      DEBUG_EVENT current_event;
> +      wait_for_debug_event_main_thread (&current_event);
> +
> +      if (current_event.dwDebugEventCode == EXIT_PROCESS_DEBUG_EVENT)
> +    {
> +      DEBUG_EVENTS ("got EXIT_PROCESS_DEBUG_EVENT");
> +      process_alive = false;
> +      break;

Not sure how likely it is to get here, but injected_thread_handle
leaks in this case.


> +    }
> +
> +      if (current_event.dwDebugEventCode == CREATE_THREAD_DEBUG_EVENT
> +      && current_event.dwThreadId == injected_thread_id)
> +    {
> +      DEBUG_EVENTS ("got CREATE_THREAD_DEBUG_EVENT for injected thread");
> +
> +      /* Terminate the injected thread, so it doesn't run any code
> +        at all.  All we wanted was some event, and
> +        CREATE_THREAD_DEBUG_EVENT is sufficient.  */
> +      CHECK (TerminateThread (injected_thread_handle, 0));
> +      CHECK (CloseHandle (injected_thread_handle));
> +      break;
> +    }
> +
> +      DEBUG_EVENTS ("got unrelated event, code %u",
> +            current_event.dwDebugEventCode);
> +      windows_continue (DBG_CONTINUE, -1, 0);
> +    }
> +}


Hannes
  

Patch

diff --git a/gdb/testsuite/gdb.threads/detach-while-running.c b/gdb/testsuite/gdb.threads/detach-while-running.c
new file mode 100644
index 00000000000..19cc3b5761c
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/detach-while-running.c
@@ -0,0 +1,30 @@ 
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <assert.h>
+#include <unistd.h>
+
+int
+main (int argc, char **argv)
+{
+  alarm (30);
+
+  while (1)
+    sleep (1);
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.threads/detach-while-running.exp b/gdb/testsuite/gdb.threads/detach-while-running.exp
new file mode 100644
index 00000000000..71e054f5e14
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/detach-while-running.exp
@@ -0,0 +1,77 @@ 
+# Copyright 2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test detaching while the inferior is running.  Basically:
+#
+#  (gdb) attach PID
+#  (gdb) c&
+#  (gdb) detach
+
+require can_spawn_for_attach
+
+standard_testfile
+
+if {[build_executable "failed to prepare" $testfile $srcfile {debug}] == -1} {
+    return
+}
+
+# The test proper.  See description above.
+
+proc test {} {
+    global binfile gdb_prompt
+
+    # This test requires executing commands while the target is
+    # running, which, when testing with the remote target, requires
+    # non-stop remote protocol.  Until that variant of the RSP is the
+    # default, force target non-stop mode on.
+    set is_remote \
+	[expr {[target_info exists gdb_protocol] \
+		   && ([target_info gdb_protocol] == "remote" \
+			   || [target_info gdb_protocol] == "extended-remote")}]
+
+    save_vars { ::GDBFLAGS } {
+	if {$is_remote} {
+	    append ::GDBFLAGS " -ex \"maint set target-non-stop on\""
+	}
+	clean_restart ${binfile}
+    }
+
+    set test_spawn_id [spawn_wait_for_attach $binfile]
+    set testpid [spawn_id_get_pid $test_spawn_id]
+
+    set attached 0
+    set any "\[^\r\n\]*"
+
+    gdb_test_multiple "attach $testpid" "attach" {
+	-re "Attaching to program:${any}process $testpid\r\n.*$gdb_prompt " {
+	    pass $gdb_test_name
+	    set attached 1
+	}
+    }
+
+    if {$attached} {
+	gdb_test_multiple "continue &" "" {
+	    -re "Continuing\.\r\n$::gdb_prompt " {
+		pass $gdb_test_name
+	    }
+	}
+
+	gdb_test "detach" "Detaching from.*"
+    }
+
+    kill_wait_spawned_process $test_spawn_id
+}
+
+test
diff --git a/gdb/windows-nat.c b/gdb/windows-nat.c
index b123a66ef0f..7e571f281cb 100644
--- a/gdb/windows-nat.c
+++ b/gdb/windows-nat.c
@@ -357,6 +357,13 @@  struct windows_nat_target final : public x86_nat_target<inf_child_target>
      needed.  */
   void wait_for_debug_event_main_thread (DEBUG_EVENT *event);
 
+  /* Force the process_thread thread to return from WaitForDebugEvent.
+     PROCESS_ALIVE is set to false if the inferior process exits while
+     we're trying to break out the process_thread thread.  This can
+     happen because this is called while all threads are running free,
+     while we're trying to detach.  */
+  void break_out_process_thread (bool &process_alive);
+
   /* Queue used to send requests to process_thread.  This is
      implicitly locked.  */
   std::queue<gdb::function_view<bool ()>> m_queue;
@@ -379,6 +386,12 @@  struct windows_nat_target final : public x86_nat_target<inf_child_target>
 
   /* True if currently in async mode.  */
   bool m_is_async = false;
+
+  /* True if we last called ContinueDebugEvent and the process_thread
+     thread is now waiting for events.  False if WaitForDebugEvent
+     already returned an event, and we need to ContinueDebugEvent
+     again to restart the inferior.  */
+  bool m_continued = false;
 };
 
 static void
@@ -498,6 +511,8 @@  windows_nat_target::wait_for_debug_event_main_thread (DEBUG_EVENT *event)
 	wait_for_debug_event (event, INFINITE);
       return false;
     });
+
+  m_continued = false;
 }
 
 /* See nat/windows-nat.h.  */
@@ -1352,6 +1367,8 @@  windows_nat_target::windows_continue (DWORD continue_status, int id,
 				" - ContinueDebugEvent failed"),
 			      *err);
 
+  m_continued = !last_call;
+
   return TRUE;
 }
 
@@ -2072,20 +2089,118 @@  windows_nat_target::attach (const char *args, int from_tty)
   target_terminal::ours ();
 }
 
+void
+windows_nat_target::break_out_process_thread (bool &process_alive)
+{
+  /* This is called when the process_thread thread is blocked in
+     WaitForDebugEvent (unless it already returned some event we
+     haven't consumed yet), and we need to unblock it so that we can
+     have it call DebugActiveProcessStop.
+
+     To make WaitForDebugEvent return, we need to force some event in
+     the inferior.  Any method that lets us do that (without
+     disturbing the other threads), injects a new thread in the
+     inferior.
+
+     We don't use DebugBreakProcess for this, because that injects a
+     thread that ends up executing a breakpoint instruction.  We can't
+     let the injected thread hit that breakpoint _after_ we've
+     detached.  Consuming events until we see a breakpoint trap isn't
+     100% reliable, because we can't distinguish it from some other
+     thread itself deciding to call int3 while we're detaching, unless
+     we temporarily suspend all threads.  It's just a lot of
+     complication, and there's an easier way.
+
+     Important observation: the thread creation event for the newly
+     injected thread is sufficient to unblock WaitForDebugEvent.
+
+     Instead of DebugBreakProcess, we can instead use
+     CreateRemoteThread to control the code that the injected thread
+     runs ourselves.  We could consider pointing the injected thread
+     at some side-effect-free Win32 function as entry point.  However,
+     finding the address of such a function requires having at least
+     minimal symbols loaded for ntdll.dll.  Having a way that avoids
+     that is better, so that detach always works correctly even when
+     we don't have any symbols loaded.
+
+     So what we do is inject a thread that doesn't actually run ANY
+     userspace code, because we force-terminate it as soon as we see
+     its corresponding thread creation event.  CreateRemoteThread
+     gives us the new thread's ID, which we can match with the thread
+     associated with the CREATE_THREAD_DEBUG_EVENT event.  */
+
+  DWORD injected_thread_id = 0;
+  HANDLE injected_thread_handle
+    = CreateRemoteThread (windows_process.handle, NULL,
+			  0, (LPTHREAD_START_ROUTINE) 0,
+			  NULL, 0, &injected_thread_id);
+
+  process_alive = true;
+
+  /* At this point, the user has declared that they want to detach, so
+     any event that happens from this point on should be forwarded to
+     the inferior.  */
+
+  for (;;)
+    {
+      DEBUG_EVENT current_event;
+      wait_for_debug_event_main_thread (&current_event);
+
+      if (current_event.dwDebugEventCode == EXIT_PROCESS_DEBUG_EVENT)
+	{
+	  DEBUG_EVENTS ("got EXIT_PROCESS_DEBUG_EVENT");
+	  process_alive = false;
+	  break;
+	}
+
+      if (current_event.dwDebugEventCode == CREATE_THREAD_DEBUG_EVENT
+	  && current_event.dwThreadId == injected_thread_id)
+	{
+	  DEBUG_EVENTS ("got CREATE_THREAD_DEBUG_EVENT for injected thread");
+
+	  /* Terminate the injected thread, so it doesn't run any code
+	     at all.  All we wanted was some event, and
+	     CREATE_THREAD_DEBUG_EVENT is sufficient.  */
+	  CHECK (TerminateThread (injected_thread_handle, 0));
+	  CHECK (CloseHandle (injected_thread_handle));
+	  break;
+	}
+
+      DEBUG_EVENTS ("got unrelated event, code %u",
+		    current_event.dwDebugEventCode);
+      windows_continue (DBG_CONTINUE, -1, 0);
+    }
+}
+
 void
 windows_nat_target::detach (inferior *inf, int from_tty)
 {
+  /* If we see the process exit while unblocking the process_thread
+     helper thread, then we should skip the actual
+     DebugActiveProcessStop call.  But don't report an error.  Just
+     pretend the process exited shortly after the detach.  */
+  bool process_alive = true;
+
+  /* The process_thread helper thread will be blocked in
+     WaitForDebugEvent waiting for events if we've resumed the target
+     before we get here, e.g., with "attach&" or "c&".  We need to
+     unblock it so that we can have it call DebugActiveProcessStop
+     below, in the do_synchronously block.  */
+  if (m_continued)
+    break_out_process_thread (process_alive);
+
   windows_continue (DBG_CONTINUE, -1, 0, true);
 
   std::optional<unsigned> err;
-  do_synchronously ([&] ()
-    {
-      if (!DebugActiveProcessStop (windows_process.current_event.dwProcessId))
-	err = (unsigned) GetLastError ();
-      else
-	DebugSetProcessKillOnExit (FALSE);
-      return false;
-    });
+  if (process_alive)
+    do_synchronously ([&] ()
+      {
+	if (!DebugActiveProcessStop (windows_process.current_event.dwProcessId))
+	  err = (unsigned) GetLastError ();
+	else
+	  DebugSetProcessKillOnExit (FALSE);
+	return false;
+      });
 
   if (err.has_value ())
     {