[v6,6/6] Remote fork catch

Message ID 1426625788-4469-7-git-send-email-donb@codesourcery.com
State New, archived
Headers

Commit Message

Don Breazeal March 17, 2015, 8:56 p.m. UTC
  Hi Pedro,
This patch has been updated per your review comments here:
https://sourceware.org/ml/gdb-patches/2015-03/msg00086.html.
Explanation follows:

On 3/3/2015 7:00 AM, Pedro Alves wrote:
> On 03/03/2015 06:20 AM, Breazeal, Don wrote:
> 
---snip---
>>> Unfortunately, with your full series applied, I get this:
>>>
>>>  (gdb) PASS: gdb.base/disp-step-syscall.exp: vfork: get hexadecimal valueof "$pc"
>>>  stepi
>>>  Detaching from process 29944
>>>  Killing process(es): 29942 29944
>>>  /home/pedro/gdb/mygit/src/gdb/gdbserver/linux-low.c:998: A problem internal to GDBserver has been detected.
>>>  kill_wait_lwp: Assertion `res > 0' failed.
>>>  /home/pedro/gdb/mygit/src/gdb/thread.c:1182: internal-error: switch_to_thread: Assertion `inf != NULL' failed.
>>>  A problem internal to GDB has been detected,
>>>  further debugging may prove unreliable.
>>>  Quit this debugging session? (y or n) FAIL: gdb.base/disp-step-syscall.exp: vfork: stepi vfork insn (GDB internal error)
>>>  Resyncing due to internal error.
---snip---
>
> Seems like gdb disconnects, and we end up in remote_open again.
> Then probably due to --once (the list descriptor is closed), that
> fails and throws, which runs the "kill or detach everything" cleanup
> (detach_or_kill_for_exit_cleanup).  And that ends up in your new
> code here:
> 
> static int
> linux_kill (int pid)
---snip---
> trying to kill the vfork child.
> 
> Really get_last_target_status is not a good idea.  It's broken
> on the native side already, and adding it to gdbserver too is
> not a good idea.  E.g., consider scheduler-locking or non-stop.
> Other events on other processes/threads can easily happen
> and thus overwrite the last target status, before something
> decides to kill the fork parent.

I've eliminated get_last_target_status from gdbserver in this patch.
Instead of killing the new child in gdbserver,
remote.c:extended_remote_kill uses 'pending_follow' on the host side
to recognize a fork parent.  If it finds one it temporarily switches
to the child, kills it, switches back, and resumes the kill operation
as before.

I've got a WIP patch that does something similar in linux-nat.c:linux_nat_kill,
and I've almost got a test working to make sure that the new child is actually
gone.

Thanks
--Don

This patch implements catchpoints for fork events on extended-remote
Linux targets.

Implementation appeared to be straightforward, requiring four new functions
in remote.c to implement insert/remove of fork/vfork catchpoints.  These
functions are essentially stubs that just return 0 ('success') if the
required features are enabled.  If the fork events are being reported, then
catchpoints are set and hit.

However, there are some extra issues that arise with catchpoints.

1) Thread creation reporting -- fork catchpoints are hit before the
   follow_fork has been completed.  When stopped at a fork catchpoint
   in the native implementation, the new process is not 'reported'
   until after the follow is done.  It doesn't show up in the inferiors
   list or the threads list.  However, in the gdbserver case, an
   'info threads' while stopped at a fork catchpoint will retrieve the
   new thread info from the target and add it to GDB's data structures,
   prior to the follow operations.  Because of this premature report,
   things on the GDB side eventually get very confused.

   So in gdbserver, in server.c:handle_qxfer_threads_worker, we check
   'last_status' and if it shows a FORK event, we know that we are in an
   unfollowed fork and we do not report the new (forked) thread to GDB.

2) Kill process before fork is followed -- on the native side in
   linux-nat.c:linux_nat_kill, there is some code to handle the case where
   a fork has occurred but follow_fork hasn't been called yet.  It does
   this by using the last status to determine if a follow is pending, and
   if it is, to kill the child task.  This patch identifies a fork parent
   in remote.c:extended_remote_kill and if one is found, it makes a
   recursive call to kill the new child as well.

3) One of the tests related to fork catchpoints,
   gdb.threads/fork-thread-pending.exp, depended on the threads being
   reported in a specific order.  GDBserver reported the threads in a
   different order, that is, 'info threads' showed the same threads, but in
   a different order.  The test used a hard-coded thread number to find a
   threads that (a) was not the main thread (blocked in pthread_join), and
   (b) was not the forking thread (stopped in fork).

   I implemented a new proc, find_unforked_thread, that uses a pretty
   brute-force method of finding a thread.  I considered just hard-coding
   another number (the native case used thread 2, which was the forking
   thread in the remote case), but that didn't seem future-proof.
   Suggestions on how to do this better would be welcome.
  
Tested on x64 Ubuntu Lucid, native, remote, extended-remote.  Tested the
case of killing the forking process before the fork has been followed
manually.  It wasn't clear to me how to check that a process had actually
been killed from a dejagnu test.

Thanks
--Don

gdb/gdbserver/
2015-03-17  Don Breazeal  <donb@codesourcery.com>

	* server.c (handle_qxfer_threads_worker): Skip forked child thread
	when between fork and follow_fork.
	(get_last_target_status): New function.
	* server.h (get_last_target_status): Declare new function.

gdb/
2015-03-17  Don Breazeal  <donb@codesourcery.com>
	* remote.c (remote_insert_fork_catchpoint): New function.
	(remote_remove_fork_catchpoint): New function.
	(remote_insert_vfork_catchpoint): New function.
	(remote_remove_vfork_catchpoint): New function.
	(extended_remote_kill): Kill fork child when killing the
	parent before follow_fork completes.
	(init_extended_remote_ops): Initialize target vector with
	new fork catchpoint functions.

gdb/testsuite/
2015-03-17  Don Breazeal  <donb@codesourcery.com>
	* gdb.threads/fork-thread-pending.exp (find_unforked_thread):
	New proc.
	(main program): Call find_unforked_thread.

---
 gdb/NEWS                                          |  7 ++-
 gdb/doc/gdb.texinfo                               |  6 +-
 gdb/gdbserver/server.c                            | 18 ++++++
 gdb/gdbserver/server.h                            |  2 +
 gdb/remote.c                                      | 73 +++++++++++++++++++++++
 gdb/testsuite/gdb.threads/fork-thread-pending.exp | 23 ++++++-
 6 files changed, 123 insertions(+), 6 deletions(-)
  

Comments

Pedro Alves March 24, 2015, 12:47 p.m. UTC | #1
On 03/17/2015 08:56 PM, Don Breazeal wrote:

> diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c
> index 8fa6f8a..346f2c4 100644
> --- a/gdb/gdbserver/server.c
> +++ b/gdb/gdbserver/server.c
> @@ -1356,6 +1356,15 @@ handle_qxfer_threads_worker (struct inferior_list_entry *inf, void *arg)
>    int core = target_core_of_thread (ptid);
>    char core_s[21];
>  
> +  /* Skip new threads created as the result of a fork if we are not done
> +     handling that fork event.  We won't know whether to tell GDB about
> +     the new thread until we are done following the fork.  */
> +  if ((last_status.kind == TARGET_WAITKIND_FORKED
> +       || last_status.kind == TARGET_WAITKIND_VFORKED)
> +      && (ptid_get_pid (last_status.value.related_pid)
> +	  == ptid_get_pid (ptid)))
> +    return;

This use of last_status here is really just as bad as
get_last_target_status, for the same reasons.  What if a thread
forks at the same time another thread hits a breakpoint, and
we end up reporting the breakpoint first, leaving the fork
pending?  Sounds like we'll end up listing the child fork
thread then.

> +
>    write_ptid (ptid_s, ptid);
>  
>    if (core != -1)
> @@ -4144,3 +4153,12 @@ handle_target_event (int err, gdb_client_data client_data)
>  
>    return 0;
>  }
> +
> +/* Retrieve the last waitstatus reported to GDB.  */
> +
> +void
> +get_last_target_status (ptid_t *ptid, struct target_waitstatus *last)
> +{
> +  *ptid = last_ptid;
> +  *last = last_status;
> +}

Looks like you forgot to delete the function.  :-)

> diff --git a/gdb/gdbserver/server.h b/gdb/gdbserver/server.h
> index 09a5624..8c6ec27 100644
> --- a/gdb/gdbserver/server.h
> +++ b/gdb/gdbserver/server.h
> @@ -113,6 +113,8 @@ typedef int gdb_fildes_t;
>  /* Functions from server.c.  */
>  extern int handle_serial_event (int err, gdb_client_data client_data);
>  extern int handle_target_event (int err, gdb_client_data client_data);
> +extern void get_last_target_status (ptid_t *ptid,
> +				    struct target_waitstatus *last);
>  
>  #include "remote-utils.h"
>  
> diff --git a/gdb/remote.c b/gdb/remote.c
> index d1ba62d..44ee89f 100644
> --- a/gdb/remote.c
> +++ b/gdb/remote.c


> @@ -8012,6 +8060,23 @@ extended_remote_kill (struct target_ops *ops)
>    int res;
>    int pid = ptid_get_pid (inferior_ptid);
>    struct remote_state *rs = get_remote_state ();
> +  struct thread_info *tp = find_thread_ptid (inferior_ptid);
> +
> +  /* If we're stopped while forking and we haven't followed yet,
> +     kill the child task.  We need to do this first because the
> +     parent will be sleeping if this is a vfork.  */
> +
> +  if (tp != NULL && (tp->pending_follow.kind == TARGET_WAITKIND_FORKED
> +		     || tp->pending_follow.kind == TARGET_WAITKIND_VFORKED))

Looks like this will miss killing the child if the user switches to
some thread other than the one that forked, in case it was a multi-threaded
program that forked.

> +    {
> +      ptid_t parent_ptid = inferior_ptid;
> +
> +      inferior_ptid = tp->pending_follow.value.related_pid;
> +      set_general_thread (inferior_ptid);
> +      extended_remote_kill (ops);
> +      inferior_ptid = parent_ptid;
> +      set_general_thread (inferior_ptid);

We never want to use the 'k' packet here, so this could
just simply be:

         int child_pid = tp->pending_follow.value.related_pid;
         remote_vkill (child_pid);

> +    }
>  
>    res = remote_vkill (pid, rs);
>    if (res == -1 && !(rs->extended && remote_multi_process_p (rs)))
> @@ -12036,6 +12101,14 @@ Specify the serial device it is connected to (e.g. /dev/ttya).";
>    extended_remote_ops.to_supports_disable_randomization
>      = extended_remote_supports_disable_randomization;
>    extended_remote_ops.to_follow_fork = remote_follow_fork;
> +  extended_remote_ops.to_insert_fork_catchpoint
> +    = remote_insert_fork_catchpoint;
> +  extended_remote_ops.to_remove_fork_catchpoint
> +    = remote_remove_fork_catchpoint;
> +  extended_remote_ops.to_insert_vfork_catchpoint
> +    = remote_insert_vfork_catchpoint;
> +  extended_remote_ops.to_remove_vfork_catchpoint
> +    = remote_remove_vfork_catchpoint;
>  }
>  
>  static int
> diff --git a/gdb/testsuite/gdb.threads/fork-thread-pending.exp b/gdb/testsuite/gdb.threads/fork-thread-pending.exp
> index d229232..594f376 100644
> --- a/gdb/testsuite/gdb.threads/fork-thread-pending.exp
> +++ b/gdb/testsuite/gdb.threads/fork-thread-pending.exp
> @@ -31,6 +31,26 @@ if {[gdb_compile_pthreads "${srcdir}/${subdir}/${srcfile}" "${binfile}" executab
>      return -1
>  }
>  
> +# Find a thread that did not fork and is not the main thread and
> +# return its thread number.  We can't just hard-code the thread
> +# number since we have no guarantee as to the ordering of the threads
> +# in gdb.  

I don't understand this -- the test runs to main first, so the main
thread should always be thread 1, no?


> We know that the main thread is in pthread_join and the
> +# forking thread is in fork, so we use this rather ungainly regexp
> +# to capture an entry from 'info threads' that doesn't show one of
> +# those routines, then extract the thread number.
> +
> +proc find_unforked_thread { } {
> +    gdb_test_multiple "info threads" "find unforked thread" {
> +        -re "(\[^\r]*Thread\[^\r]* in \[^fp]\[^ot]\[^rh]\[^kr]\[^e]\[^a]\[^d]\[^_]\[^j]\[^\r]*\r\n)" {
> +    	    regexp "(\[ 	]*)(\[0-9]*)(\[    ]*Thread\[^\r]*\r\n)" $expect_out(0,string) ignore lead_spc threadnum rest
> +        }
> +        timeout {
> +	    set threadnum -1
> +        }
> +    }
> +    return $threadnum
> +}
> +
>  clean_restart ${binfile}
>  
>  if ![runto_main] then {
> @@ -46,7 +66,8 @@ gdb_test "continue" "Catchpoint.*" "1, get to the fork event"
>  
>  gdb_test "info threads" " Thread .* Thread .* Thread .* Thread .*" "1, multiple threads found"
>  
> -gdb_test "thread 1" ".*" "1, switched away from event thread"
> +set threadnum [find_unforked_thread]
> +gdb_test "thread $threadnum" ".*" "1, switched away from event thread to thread $threadnum"
>  
>  gdb_test "continue" "Not resuming.*" "1, refused to resume"
>  
> 

Thanks,
Pedro Alves
  

Patch

diff --git a/gdb/NEWS b/gdb/NEWS
index 38f5ed7..a22e5ae 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -104,9 +104,10 @@  Itanium running HP-UX         ia64-*-hpux*
   The remote stub now reports support for fork and vfork events to
   GDB's qSupported query.
 
-  GDBserver extended-remote Linux targets now provides basic support
-  for fork and vfork events.  This enables follow-fork-mode and
-  detach-on-fork for those targets with Linux kernels 2.5.60 and later.
+  GDBserver extended-remote Linux targets now provides support for
+  fork and vfork events.  This enables follow-fork-mode, detach-on-fork,
+  catch fork, and catch vfork for extended-remote targets with Linux
+  kernels 2.5.60 and later.
 
 *** Changes in GDB 7.9
 
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 11e9850..17b86bc 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -4426,12 +4426,14 @@  Again, in this case @value{GDBN} would not be able to display syscall's names.
 @item fork
 @kindex catch fork
 A call to @code{fork}.  This is currently only available for HP-UX
-and @sc{gnu}/Linux.
+and @sc{gnu}/Linux, and when connected to @code{gdbserver} running
+on @sc{gnu}/Linux with @kbd{target extended-remote}.
 
 @item vfork
 @kindex catch vfork
 A call to @code{vfork}.  This is currently only available for HP-UX
-and @sc{gnu}/Linux.
+and @sc{gnu}/Linux, and when connected to @code{gdbserver} running
+on @sc{gnu}/Linux with @kbd{target extended-remote}.
 
 @item load @r{[}regexp@r{]}
 @itemx unload @r{[}regexp@r{]}
diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c
index 8fa6f8a..346f2c4 100644
--- a/gdb/gdbserver/server.c
+++ b/gdb/gdbserver/server.c
@@ -1356,6 +1356,15 @@  handle_qxfer_threads_worker (struct inferior_list_entry *inf, void *arg)
   int core = target_core_of_thread (ptid);
   char core_s[21];
 
+  /* Skip new threads created as the result of a fork if we are not done
+     handling that fork event.  We won't know whether to tell GDB about
+     the new thread until we are done following the fork.  */
+  if ((last_status.kind == TARGET_WAITKIND_FORKED
+       || last_status.kind == TARGET_WAITKIND_VFORKED)
+      && (ptid_get_pid (last_status.value.related_pid)
+	  == ptid_get_pid (ptid)))
+    return;
+
   write_ptid (ptid_s, ptid);
 
   if (core != -1)
@@ -4144,3 +4153,12 @@  handle_target_event (int err, gdb_client_data client_data)
 
   return 0;
 }
+
+/* Retrieve the last waitstatus reported to GDB.  */
+
+void
+get_last_target_status (ptid_t *ptid, struct target_waitstatus *last)
+{
+  *ptid = last_ptid;
+  *last = last_status;
+}
diff --git a/gdb/gdbserver/server.h b/gdb/gdbserver/server.h
index 09a5624..8c6ec27 100644
--- a/gdb/gdbserver/server.h
+++ b/gdb/gdbserver/server.h
@@ -113,6 +113,8 @@  typedef int gdb_fildes_t;
 /* Functions from server.c.  */
 extern int handle_serial_event (int err, gdb_client_data client_data);
 extern int handle_target_event (int err, gdb_client_data client_data);
+extern void get_last_target_status (ptid_t *ptid,
+				    struct target_waitstatus *last);
 
 #include "remote-utils.h"
 
diff --git a/gdb/remote.c b/gdb/remote.c
index d1ba62d..44ee89f 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -1478,6 +1478,54 @@  remote_fork_event_p (struct remote_state *rs)
   return packet_support (PACKET_fork_event_feature) == PACKET_ENABLE;
 }
 
+/* Returns true if vfork events are supported.  */
+
+static int
+remote_vfork_event_p (struct remote_state *rs)
+{
+  return packet_support (PACKET_vfork_event_feature) == PACKET_ENABLE;
+}
+
+/* Insert fork catchpoint target routine.  If fork events are enabled
+   then return success, nothing more to do.  */
+
+static int
+remote_insert_fork_catchpoint (struct target_ops *ops, int pid)
+{
+  struct remote_state *rs = get_remote_state ();
+
+  return !remote_fork_event_p (rs);
+}
+
+/* Remove fork catchpoint target routine.  Nothing to do, just
+   return success.  */
+
+static int
+remote_remove_fork_catchpoint (struct target_ops *ops, int pid)
+{
+  return 0;
+}
+
+/* Insert vfork catchpoint target routine.  If vfork events are enabled
+   then return success, nothing more to do.  */
+
+static int
+remote_insert_vfork_catchpoint (struct target_ops *ops, int pid)
+{
+  struct remote_state *rs = get_remote_state ();
+
+  return !remote_vfork_event_p (rs);
+}
+
+/* Remove vfork catchpoint target routine.  Nothing to do, just
+   return success.  */
+
+static int
+remote_remove_vfork_catchpoint (struct target_ops *ops, int pid)
+{
+  return 0;
+}
+
 /* Tokens for use by the asynchronous signal handlers for SIGINT.  */
 static struct async_signal_handler *async_sigint_remote_twice_token;
 static struct async_signal_handler *async_sigint_remote_token;
@@ -8012,6 +8060,23 @@  extended_remote_kill (struct target_ops *ops)
   int res;
   int pid = ptid_get_pid (inferior_ptid);
   struct remote_state *rs = get_remote_state ();
+  struct thread_info *tp = find_thread_ptid (inferior_ptid);
+
+  /* If we're stopped while forking and we haven't followed yet,
+     kill the child task.  We need to do this first because the
+     parent will be sleeping if this is a vfork.  */
+
+  if (tp != NULL && (tp->pending_follow.kind == TARGET_WAITKIND_FORKED
+		     || tp->pending_follow.kind == TARGET_WAITKIND_VFORKED))
+    {
+      ptid_t parent_ptid = inferior_ptid;
+
+      inferior_ptid = tp->pending_follow.value.related_pid;
+      set_general_thread (inferior_ptid);
+      extended_remote_kill (ops);
+      inferior_ptid = parent_ptid;
+      set_general_thread (inferior_ptid);
+    }
 
   res = remote_vkill (pid, rs);
   if (res == -1 && !(rs->extended && remote_multi_process_p (rs)))
@@ -12036,6 +12101,14 @@  Specify the serial device it is connected to (e.g. /dev/ttya).";
   extended_remote_ops.to_supports_disable_randomization
     = extended_remote_supports_disable_randomization;
   extended_remote_ops.to_follow_fork = remote_follow_fork;
+  extended_remote_ops.to_insert_fork_catchpoint
+    = remote_insert_fork_catchpoint;
+  extended_remote_ops.to_remove_fork_catchpoint
+    = remote_remove_fork_catchpoint;
+  extended_remote_ops.to_insert_vfork_catchpoint
+    = remote_insert_vfork_catchpoint;
+  extended_remote_ops.to_remove_vfork_catchpoint
+    = remote_remove_vfork_catchpoint;
 }
 
 static int
diff --git a/gdb/testsuite/gdb.threads/fork-thread-pending.exp b/gdb/testsuite/gdb.threads/fork-thread-pending.exp
index d229232..594f376 100644
--- a/gdb/testsuite/gdb.threads/fork-thread-pending.exp
+++ b/gdb/testsuite/gdb.threads/fork-thread-pending.exp
@@ -31,6 +31,26 @@  if {[gdb_compile_pthreads "${srcdir}/${subdir}/${srcfile}" "${binfile}" executab
     return -1
 }
 
+# Find a thread that did not fork and is not the main thread and
+# return its thread number.  We can't just hard-code the thread
+# number since we have no guarantee as to the ordering of the threads
+# in gdb.  We know that the main thread is in pthread_join and the
+# forking thread is in fork, so we use this rather ungainly regexp
+# to capture an entry from 'info threads' that doesn't show one of
+# those routines, then extract the thread number.
+
+proc find_unforked_thread { } {
+    gdb_test_multiple "info threads" "find unforked thread" {
+        -re "(\[^\r]*Thread\[^\r]* in \[^fp]\[^ot]\[^rh]\[^kr]\[^e]\[^a]\[^d]\[^_]\[^j]\[^\r]*\r\n)" {
+    	    regexp "(\[ 	]*)(\[0-9]*)(\[    ]*Thread\[^\r]*\r\n)" $expect_out(0,string) ignore lead_spc threadnum rest
+        }
+        timeout {
+	    set threadnum -1
+        }
+    }
+    return $threadnum
+}
+
 clean_restart ${binfile}
 
 if ![runto_main] then {
@@ -46,7 +66,8 @@  gdb_test "continue" "Catchpoint.*" "1, get to the fork event"
 
 gdb_test "info threads" " Thread .* Thread .* Thread .* Thread .*" "1, multiple threads found"
 
-gdb_test "thread 1" ".*" "1, switched away from event thread"
+set threadnum [find_unforked_thread]
+gdb_test "thread $threadnum" ".*" "1, switched away from event thread to thread $threadnum"
 
 gdb_test "continue" "Not resuming.*" "1, refused to resume"