[v4,5/7] Add thread_db_notice_clone to gdbserver

Message ID 20170816104143.0af5d91a@pinnacle.lan
State New, archived
Headers

Commit Message

Kevin Buettner Aug. 16, 2017, 5:41 p.m. UTC
  While working on a patch for fetching a thread handle in gdbserver, I
ran into a circumstance in which tests in gdb.mi/mi-nsmoribund.exp
would occasionally fail.  Over a large enough number of runs, it would
fail roughly 2% of the time.

That thread handle patch caused find_one_thread() to be called on
every stop.  find_one_thread() calls td_ta_map_lwp2thr() which, in
turn, can cause ps_get_thread_area() to be called.
ps_get_thread_area() makes a call to ptrace() for getting the thread
area address.  If this should happen when the thread is not stopped,
the call to ptrace will return error which in turn propogates back to
find_one_thread().  find_one_thread() calls error() in this instance
which causes the program to die.

This patch causes find_one_thread() to be called upon reciept of a
clone event.  Since the clone is stopped, the circumstances described
above cannot occur.

gdb/gdbserver/ChangeLog:
    
    	* linux-low.c (handle_extended_wait): Call thread_db_notice_clone().
    	* linux-low.h (thread_db_notice_clone): Declare.
    	* thread-db.c (thread_db_notice_clone): New function.
---
 gdb/gdbserver/linux-low.c |  2 ++
 gdb/gdbserver/linux-low.h |  6 ++++++
 gdb/gdbserver/thread-db.c | 16 ++++++++++++++++
 3 files changed, 24 insertions(+)
  

Comments

Pedro Alves Sept. 29, 2017, 12:24 p.m. UTC | #1
On 08/16/2017 06:41 PM, Kevin Buettner wrote:
> While working on a patch for fetching a thread handle in gdbserver, I
> ran into a circumstance in which tests in gdb.mi/mi-nsmoribund.exp
> would occasionally fail.  Over a large enough number of runs, it would
> fail roughly 2% of the time.
> 
> That thread handle patch caused find_one_thread() to be called on
> every stop.  find_one_thread() calls td_ta_map_lwp2thr() which, in
> turn, can cause ps_get_thread_area() to be called.
> ps_get_thread_area() makes a call to ptrace() for getting the thread
> area address.  If this should happen when the thread is not stopped,
> the call to ptrace will return error which in turn propogates back to
> find_one_thread().  find_one_thread() calls error() in this instance
> which causes the program to die.
> 
> This patch causes find_one_thread() to be called upon reciept of a
> clone event.  Since the clone is stopped, the circumstances described
> above cannot occur.
> 

FYI, issues like the described above still happened with the
patch as it was.  While both parent/child are stopped, we
weren't making sure that one of parent/child is the
current thread.  That lead to libthread_db -> proc-service
trying to access memory from a running or non-existing thread.

For me that resulted in occasional
gdb.threads/multi-create-ns-info-thr.exp failures.

The gdbserver buildslaves have been reporting the same
regression too, e.g.:
  https://sourceware.org/ml/gdb-testers/2017-q3/msg05653.html

Should be fixed now, with:
 https://sourceware.org/ml/gdb-patches/2017-09/msg00903.html
 https://sourceware.org/ml/gdb-patches/2017-09/msg00904.html

Thanks,
Pedro Alves
  

Patch

diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index fd46d85..2b43f51 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -656,6 +656,8 @@  handle_extended_wait (struct lwp_info **orig_event_lwp, int wstat)
 	  new_lwp->status_pending = status;
 	}
 
+      thread_db_notice_clone (get_thread_process (event_thr), ptid);
+
       /* Don't report the event.  */
       return 1;
     }
diff --git a/gdb/gdbserver/linux-low.h b/gdb/gdbserver/linux-low.h
index 2bf7e7c..3d5cb66 100644
--- a/gdb/gdbserver/linux-low.h
+++ b/gdb/gdbserver/linux-low.h
@@ -410,4 +410,10 @@  int thread_db_get_tls_address (struct thread_info *thread, CORE_ADDR offset,
 			       CORE_ADDR load_module, CORE_ADDR *address);
 int thread_db_look_up_one_symbol (const char *name, CORE_ADDR *addrp);
 
+/* Called from linux-low.c when a clone event is detected.  Upon entry,
+   both the clone and the parent should be stopped.  This function does
+   whatever is required have the clone under thread_db's control.  */
+
+void thread_db_notice_clone (struct process_info *proc, ptid_t lwp);
+
 extern int have_ptrace_getregset;
diff --git a/gdb/gdbserver/thread-db.c b/gdb/gdbserver/thread-db.c
index 1ffb79d..b4a5188 100644
--- a/gdb/gdbserver/thread-db.c
+++ b/gdb/gdbserver/thread-db.c
@@ -864,3 +864,19 @@  thread_db_handle_monitor_command (char *mon)
   /* Tell server.c to perform default processing.  */
   return 0;
 }
+
+/* See linux-low.h.  */
+
+void
+thread_db_notice_clone (struct process_info *proc, ptid_t ptid)
+{
+  struct thread_db *thread_db = proc->priv->thread_db;
+
+  /* If the thread layer isn't initialized, return.  It may just
+     be that the program uses clone, but does not use libthread_db.  */
+  if (thread_db == NULL || !thread_db->all_symbols_looked_up)
+    return;
+
+  if (!find_one_thread (ptid))
+    warning ("Cannot find thread after clone.\n");
+}