Fix failure to detach if threads exit while detaching on linux

Message ID 1464876960-23372-1-git-send-email-antoine.tremblay@ericsson.com
State New, archived
Headers

Commit Message

Antoine Tremblay June 2, 2016, 2:16 p.m. UTC
  This patches fixes detaching on linux when some threads exit while we're
detaching with GDB and GDBServer.

What happened before is that as GDB/GDBserver would be detaching threads
one thread at a time and allowing them to continue, if one of these
detached threads called exit for example and the other threads were
destroyed GDB/GDBserver would still try and detach these exited threads
and fail with a message like: "Can't detach process." as ptrace could not
execute the operation.

This patch uses check_ptrace_stopped_lwp_gone or
linux_proc_pid_is_trace_stopped_nowarn like is used in the resume case to
avoid an error if this function detects that the ptrace failure is normal
since the thread has exited.

This patch adds the gdb.threads/detach-gone-thread.exp test for this case.

Tested on x86-linux with {unix, native-extended-gdbserver}

gdb/gdbserver/ChangeLog:

	* linux-low.c (check_ptrace_stopped_lwp_gone) Move up to be used
	by linux_detach_one_lwp.
	(linux_detach_one_lwp): Report the error only if
	check_ptrace_stopped_lwp_gone is false.

gdb/ChangeLog:

	* inf-ptrace.c: include nat/linux-procfs.h.
	(inf_ptrace_detach): Report an error only if the thread is not gone.
	* linux-nat.c (check_ptrace_stopped_lwp_gone): Move up to be used
	by detach_callback.
	(detach_callback): Report the error only if
	check_ptrace_stopped_lwp_gone is false.

gdb/testsuite/ChangeLog:

	* gdb.threads/detach-gone-thread.c: New file.
	* gdb.threads/detach-gone-thread.ex: New test.
---
 gdb/gdbserver/linux-low.c                        | 73 ++++++++++++------------
 gdb/inf-ptrace.c                                 |  6 +-
 gdb/linux-nat.c                                  | 69 +++++++++++-----------
 gdb/testsuite/gdb.threads/detach-gone-thread.c   | 45 +++++++++++++++
 gdb/testsuite/gdb.threads/detach-gone-thread.exp | 37 ++++++++++++
 5 files changed, 159 insertions(+), 71 deletions(-)
 create mode 100644 gdb/testsuite/gdb.threads/detach-gone-thread.c
 create mode 100644 gdb/testsuite/gdb.threads/detach-gone-thread.exp
  

Comments

Pedro Alves June 3, 2016, 11:43 a.m. UTC | #1
On 06/02/2016 03:16 PM, Antoine Tremblay wrote:

> gdb/ChangeLog:
> 
> 	* inf-ptrace.c: include nat/linux-procfs.h.
> 	(inf_ptrace_detach): Report an error only if the thread is not gone.

Can't do this.  inf-ptrace.c is a shared file used by non-Linux ports too.

> 	* gdb.threads/detach-gone-thread.c: New file.
> 	* gdb.threads/detach-gone-thread.ex: New test.

Typo: ".ex".  Say "New file" for that one too.

> +  _exit(0);

Space before parenthesis.

> +{
> +  pthread_t threads[256];
> +  int res;
> +  int i;
> +
> +  pthread_barrier_init (&barrier, NULL, 257);
> +
> +  for (i = 0; i < 256; i++)

Please add:

#define NTHREADS 256

and thus write:

  pthread_t threads[NTHREADS];
  int res;
  int i;

  pthread_barrier_init (&barrier, NULL, NTHREADS + 1);

  for (i = 0; i < NTHREADS; i++)


> +
> +if $use_gdb_stub {

Do this at the top, and call the new use_gdb_stub procedure instead:

if [use_gdb_stub] {

> +    # This test is about testing detaching from a process,
> +    # so it doesn't make sense to run it if the target is stub-like.
> +    unsupported "This test is not supported for GDB stub targets."

But in any case, I'm not so sure about this.  Stub targets can
detach too.  What problem did you find that needed this?

Thanks,
Pedro Alves
  
Antoine Tremblay June 3, 2016, 11:55 a.m. UTC | #2
Pedro Alves writes:

> On 06/02/2016 03:16 PM, Antoine Tremblay wrote:
>
>> gdb/ChangeLog:
>> 
>> 	* inf-ptrace.c: include nat/linux-procfs.h.
>> 	(inf_ptrace_detach): Report an error only if the thread is not gone.
>
> Can't do this.  inf-ptrace.c is a shared file used by non-Linux ports too.
>

Oops I tought ptrace was linux specific, seems like bsd uses it too..

I think I can replace this with a try catch over  linux_ops->to_detach
(ops, args, from_tty); can call the rest of inf_ptrace_detach from there
I'll try it out...

>> 	* gdb.threads/detach-gone-thread.c: New file.
>> 	* gdb.threads/detach-gone-thread.ex: New test.
>
> Typo: ".ex".  Say "New file" for that one too.
>
>> +  _exit(0);
>

Done.

> Space before parenthesis.
>
>> +{
>> +  pthread_t threads[256];
>> +  int res;
>> +  int i;
>> +
>> +  pthread_barrier_init (&barrier, NULL, 257);
>> +
>> +  for (i = 0; i < 256; i++)
>
> Please add:
>
> #define NTHREADS 256
>
> and thus write:
>
>   pthread_t threads[NTHREADS];
>   int res;
>   int i;
>
>   pthread_barrier_init (&barrier, NULL, NTHREADS + 1);
>
>   for (i = 0; i < NTHREADS; i++)
>
>

Done.

>> +
>> +if $use_gdb_stub {
>
> Do this at the top, and call the new use_gdb_stub procedure instead:
>
> if [use_gdb_stub] {
>
>> +    # This test is about testing detaching from a process,
>> +    # so it doesn't make sense to run it if the target is stub-like.
>> +    unsupported "This test is not supported for GDB stub targets."
>
> But in any case, I'm not so sure about this.  Stub targets can
> detach too.  What problem did you find that needed this?
>

I did that because I though they could not detach...

Any suggestion on how to limit this to targets that can detach ?.. call
detach see if it suceeds or there's a better way ?

Thanks,
Antoine
  
Pedro Alves June 3, 2016, 12:05 p.m. UTC | #3
On 06/03/2016 12:55 PM, Antoine Tremblay wrote:

> I did that because I though they could not detach...
> 
> Any suggestion on how to limit this to targets that can detach ?.. 

I have trouble thinking of a target that wouldn't be able to detach.
And then it'd have to be a target that supports pthreads, otherwise
the test won't even run.

> call detach see if it suceeds or there's a better way ?

Right.  But I'd just start by assuming it works.  Just make
sure that --target_board=native-gdbserver passes.

Thanks,
Pedro Alves
  

Patch

diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index 81134b0..5f02dab 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -1447,6 +1447,39 @@  get_detach_signal (struct thread_info *thread)
     }
 }
 
+/* Called when we try to resume or detach a stopped LWP and that errors
+   out.  If the LWP is no longer in ptrace-stopped state (meaning it's
+   zombie, or about to become), discard the error, clear any pending
+   status the LWP may have, and return true (we'll collect the exit status
+   soon enough).  Otherwise, return false.  */
+
+static int
+check_ptrace_stopped_lwp_gone (struct lwp_info *lp)
+{
+  struct thread_info *thread = get_lwp_thread (lp);
+
+  /* If we get an error after resuming the LWP successfully, we'd
+     confuse !T state for the LWP being gone.  */
+  gdb_assert (lp->stopped);
+
+  /* We can't just check whether the LWP is in 'Z (Zombie)' state,
+     because even if ptrace failed with ESRCH, the tracee may be "not
+     yet fully dead", but already refusing ptrace requests.  In that
+     case the tracee has 'R (Running)' state for a little bit
+     (observed in Linux 3.18).  See also the note on ESRCH in the
+     ptrace(2) man page.  Instead, check whether the LWP has any state
+     other than ptrace-stopped.  */
+
+  /* Don't assume anything if /proc/PID/status can't be read.  */
+  if (linux_proc_pid_is_trace_stopped_nowarn (lwpid_of (thread)) == 0)
+    {
+      lp->stop_reason = TARGET_STOPPED_BY_NO_REASON;
+      lp->status_pending_p = 0;
+      return 1;
+    }
+  return 0;
+}
+
 static int
 linux_detach_one_lwp (struct inferior_list_entry *entry, void *args)
 {
@@ -1480,9 +1513,10 @@  linux_detach_one_lwp (struct inferior_list_entry *entry, void *args)
     the_low_target.prepare_to_resume (lwp);
   if (ptrace (PTRACE_DETACH, lwpid_of (thread), (PTRACE_TYPE_ARG3) 0,
 	      (PTRACE_TYPE_ARG4) (long) sig) < 0)
-    error (_("Can't detach %s: %s"),
-	   target_pid_to_str (ptid_of (thread)),
-	   strerror (errno));
+    if (!check_ptrace_stopped_lwp_gone (lwp))
+      error (_("Can't detach %s: %s"),
+	     target_pid_to_str (ptid_of (thread)),
+	     strerror (errno));
 
   delete_lwp (lwp);
   return 0;
@@ -4331,39 +4365,6 @@  linux_resume_one_lwp_throw (struct lwp_info *lwp,
   lwp->stop_reason = TARGET_STOPPED_BY_NO_REASON;
 }
 
-/* Called when we try to resume a stopped LWP and that errors out.  If
-   the LWP is no longer in ptrace-stopped state (meaning it's zombie,
-   or about to become), discard the error, clear any pending status
-   the LWP may have, and return true (we'll collect the exit status
-   soon enough).  Otherwise, return false.  */
-
-static int
-check_ptrace_stopped_lwp_gone (struct lwp_info *lp)
-{
-  struct thread_info *thread = get_lwp_thread (lp);
-
-  /* If we get an error after resuming the LWP successfully, we'd
-     confuse !T state for the LWP being gone.  */
-  gdb_assert (lp->stopped);
-
-  /* We can't just check whether the LWP is in 'Z (Zombie)' state,
-     because even if ptrace failed with ESRCH, the tracee may be "not
-     yet fully dead", but already refusing ptrace requests.  In that
-     case the tracee has 'R (Running)' state for a little bit
-     (observed in Linux 3.18).  See also the note on ESRCH in the
-     ptrace(2) man page.  Instead, check whether the LWP has any state
-     other than ptrace-stopped.  */
-
-  /* Don't assume anything if /proc/PID/status can't be read.  */
-  if (linux_proc_pid_is_trace_stopped_nowarn (lwpid_of (thread)) == 0)
-    {
-      lp->stop_reason = TARGET_STOPPED_BY_NO_REASON;
-      lp->status_pending_p = 0;
-      return 1;
-    }
-  return 0;
-}
-
 /* Like linux_resume_one_lwp_throw, but no error is thrown if the LWP
    disappears while we try to resume it.  */
 
diff --git a/gdb/inf-ptrace.c b/gdb/inf-ptrace.c
index 329d8fb..a668b7e 100644
--- a/gdb/inf-ptrace.c
+++ b/gdb/inf-ptrace.c
@@ -31,6 +31,7 @@ 
 #include "inf-ptrace.h"
 #include "inf-child.h"
 #include "gdbthread.h"
+#include "nat/linux-procfs.h"
 
 
 
@@ -259,7 +260,10 @@  inf_ptrace_detach (struct target_ops *ops, const char *args, int from_tty)
      started the process ourselves.  */
   errno = 0;
   ptrace (PT_DETACH, pid, (PTRACE_TYPE_ARG3)1, sig);
-  if (errno != 0)
+
+  /* Don't consider the error if /proc/PID/status can't be read.
+     See check_ptrace_stopped_lwp_gone.  */
+  if (errno != 0 && linux_proc_pid_is_trace_stopped_nowarn (pid) != 0)
     perror_with_name (("ptrace"));
 #else
   error (_("This system does not support detaching from a process"));
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index e6d525f..8a517c8 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -1382,6 +1382,38 @@  get_pending_status (struct lwp_info *lp, int *status)
   return 0;
 }
 
+/* Called when we try to resume or detach a stopped LWP and that errors
+   out.  If the LWP is no longer in ptrace-stopped state (meaning it's
+   zombie, or about to become), discard the error, clear any pending
+   status the LWP may have, and return true (we'll collect the exit status
+   soon enough).  Otherwise, return false.  */
+
+static int
+check_ptrace_stopped_lwp_gone (struct lwp_info *lp)
+{
+  /* If we get an error after resuming the LWP successfully, we'd
+     confuse !T state for the LWP being gone.  */
+  gdb_assert (lp->stopped);
+
+  /* We can't just check whether the LWP is in 'Z (Zombie)' state,
+     because even if ptrace failed with ESRCH, the tracee may be "not
+     yet fully dead", but already refusing ptrace requests.  In that
+     case the tracee has 'R (Running)' state for a little bit
+     (observed in Linux 3.18).  See also the note on ESRCH in the
+     ptrace(2) man page.  Instead, check whether the LWP has any state
+     other than ptrace-stopped.  */
+
+  /* Don't assume anything if /proc/PID/status can't be read.  */
+  if (linux_proc_pid_is_trace_stopped_nowarn (ptid_get_lwp (lp->ptid)) == 0)
+    {
+      lp->stop_reason = TARGET_STOPPED_BY_NO_REASON;
+      lp->status = 0;
+      lp->waitstatus.kind = TARGET_WAITKIND_IGNORE;
+      return 1;
+    }
+  return 0;
+}
+
 static int
 detach_callback (struct lwp_info *lp, void *data)
 {
@@ -1418,8 +1450,9 @@  detach_callback (struct lwp_info *lp, void *data)
       errno = 0;
       if (ptrace (PTRACE_DETACH, ptid_get_lwp (lp->ptid), 0,
 		  WSTOPSIG (status)) < 0)
-	error (_("Can't detach %s: %s"), target_pid_to_str (lp->ptid),
-	       safe_strerror (errno));
+	if (!check_ptrace_stopped_lwp_gone (lp))
+	  error (_("Can't detach %s: %s"), target_pid_to_str (lp->ptid),
+		 safe_strerror (errno));
 
       if (debug_linux_nat)
 	fprintf_unfiltered (gdb_stdlog,
@@ -1531,38 +1564,6 @@  linux_resume_one_lwp_throw (struct lwp_info *lp, int step,
   registers_changed_ptid (lp->ptid);
 }
 
-/* Called when we try to resume a stopped LWP and that errors out.  If
-   the LWP is no longer in ptrace-stopped state (meaning it's zombie,
-   or about to become), discard the error, clear any pending status
-   the LWP may have, and return true (we'll collect the exit status
-   soon enough).  Otherwise, return false.  */
-
-static int
-check_ptrace_stopped_lwp_gone (struct lwp_info *lp)
-{
-  /* If we get an error after resuming the LWP successfully, we'd
-     confuse !T state for the LWP being gone.  */
-  gdb_assert (lp->stopped);
-
-  /* We can't just check whether the LWP is in 'Z (Zombie)' state,
-     because even if ptrace failed with ESRCH, the tracee may be "not
-     yet fully dead", but already refusing ptrace requests.  In that
-     case the tracee has 'R (Running)' state for a little bit
-     (observed in Linux 3.18).  See also the note on ESRCH in the
-     ptrace(2) man page.  Instead, check whether the LWP has any state
-     other than ptrace-stopped.  */
-
-  /* Don't assume anything if /proc/PID/status can't be read.  */
-  if (linux_proc_pid_is_trace_stopped_nowarn (ptid_get_lwp (lp->ptid)) == 0)
-    {
-      lp->stop_reason = TARGET_STOPPED_BY_NO_REASON;
-      lp->status = 0;
-      lp->waitstatus.kind = TARGET_WAITKIND_IGNORE;
-      return 1;
-    }
-  return 0;
-}
-
 /* Like linux_resume_one_lwp_throw, but no error is thrown if the LWP
    disappears while we try to resume it.  */
 
diff --git a/gdb/testsuite/gdb.threads/detach-gone-thread.c b/gdb/testsuite/gdb.threads/detach-gone-thread.c
new file mode 100644
index 0000000..08917a0
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/detach-gone-thread.c
@@ -0,0 +1,45 @@ 
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2016 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <pthread.h>
+#include <unistd.h>
+#include <stdlib.h>
+
+pthread_barrier_t barrier;
+
+void *
+child_function (void *arg)
+{
+  pthread_barrier_wait (&barrier);
+  _exit(0);
+}
+
+int
+main ()
+{
+  pthread_t threads[256];
+  int res;
+  int i;
+
+  pthread_barrier_init (&barrier, NULL, 257);
+
+  for (i = 0; i < 256; i++)
+    res = pthread_create (&threads[i], NULL, child_function, NULL);
+
+  pthread_barrier_wait (&barrier);
+  exit(0);
+}
diff --git a/gdb/testsuite/gdb.threads/detach-gone-thread.exp b/gdb/testsuite/gdb.threads/detach-gone-thread.exp
new file mode 100644
index 0000000..41472ec
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/detach-gone-thread.exp
@@ -0,0 +1,37 @@ 
+# Copyright 2015-2016 Free Software Foundation, Inc.
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+standard_testfile
+
+if {[gdb_compile_pthreads "${srcdir}/${subdir}/${srcfile}" "${binfile}" executable {debug}] != "" } {
+    return -1
+}
+
+clean_restart ${testfile}
+
+if ![runto_main] {
+    fail "Can't run to main to check for trace support"
+    return -1
+}
+
+if $use_gdb_stub {
+    # This test is about testing detaching from a process,
+    # so it doesn't make sense to run it if the target is stub-like.
+    unsupported "This test is not supported for GDB stub targets."
+    return -1
+}
+
+gdb_breakpoint "_exit"
+gdb_continue_to_breakpoint "_exit" ".*_exit.*"
+gdb_test "detach" "Detaching from .*, process $decimal"