From patchwork Tue Mar 15 15:29:57 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pedro Alves <palves@redhat.com>
X-Patchwork-Id: 11341
Received: (qmail 99518 invoked by alias); 15 Mar 2016 15:30:19 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>,
	<http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Delivered-To: mailing list gdb-patches@sourceware.org
Received: (qmail 99323 invoked by uid 89); 15 Mar 2016 15:30:05 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00,
	RP_MATCHES_RCVD,
	SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=rs, _all_,
	wfi, resumes
X-HELO: mx1.redhat.com
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by
	sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with
	(AES256-GCM-SHA384 encrypted) ESMTPS;
	Tue, 15 Mar 2016 15:30:00 +0000
Received: from int-mx09.intmail.prod.int.phx2.redhat.com
	(int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22])	by
	mx1.redhat.com (Postfix) with ESMTPS id F2DBE7AE96;
	Tue, 15 Mar 2016 15:29:58 +0000 (UTC)
Received: from [127.0.0.1] (ovpn01.gateway.prod.ext.phx2.redhat.com
	[10.5.9.1])	by int-mx09.intmail.prod.int.phx2.redhat.com
	(8.14.4/8.14.4) with ESMTP id u2FFTv7O021052;
	Tue, 15 Mar 2016 11:29:57 -0400
Subject: Re: [PATCH v2 3/3] PR remote/19496,
	timeout in forking-threads-plus-bkpt
To: Don Breazeal <donb@codesourcery.com>, gdb-patches@sourceware.org
References: <1455150506-12760-1-git-send-email-donb@codesourcery.com>
From: Pedro Alves <palves@redhat.com>
Message-ID: <56E82A75.8090103@redhat.com>
Date: Tue, 15 Mar 2016 15:29:57 +0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
	rv:38.0) Gecko/20100101 Thunderbird/38.6.0
MIME-Version: 1.0
In-Reply-To: <1455150506-12760-1-git-send-email-donb@codesourcery.com>

Hi Don,

Sorry for the delay.  I had to find a moment to think this through.

On 02/11/2016 12:28 AM, Don Breazeal wrote:
> On 2/1/2016 12:09 PM, Pedro Alves wrote:
>> On 02/01/2016 07:29 PM, Don Breazeal wrote:
>>> On 2/1/2016 4:05 AM, Pedro Alves wrote:
>>
>>> Hi Pedro,
> ---snip---
>>> A fork event was reported to GDB before GDB knew about the parent thread,
>>> followed immediately by a breakpoint event in a different thread.  The
>>> parent thread was subsequently added via remote_notice_new_inferior in
>>> process_stop_reply, but when the thread was added the thread_info.state
>>> was set to THREAD_STOPPED.  The fork event was then handled correctly,
>>> but when the fork parent was resumed via a call to keep_going, the state
>>> was unchanged.
>>
>> Since this is non-stop, then it sounds to me like the bug is that the
>> thread should have been added in THREAD_RUNNING state.
>>
>> Consider that infrun may be pulling target events out of the target_ops
>> backend into its own event queue, but, not process them immediately.
>>
>> E.g., infrun may be stopping all threads temporarily for a step-over-breakpoint
>> operation for thread A (stop_all_threads).  The waitstatus of all threads
>> is thus left pending in the thread structure (save_status), including the
>> fork event of thread B.  Right at this point, if the user
>> does "info threads", that should show thread B (the fork parent) running,
>> not stopped, even if internally, gdb is holding it paused for a little bit.
>>
> 
> Hi Pedro,
> Here is a new patch that adds the threads with the state set to
> THREAD_RUNNING for fork events.

I really meant it for _all_ kinds of events, not just fork.

I don't think we can do this:

+      if (status->kind == TARGET_WAITKIND_FORKED
+	  || status->kind == TARGET_WAITKIND_VFORKED)
+	remote_notice_new_inferior (ptid, 1);

in all-stop mode, as passing 1 means we'll end up installing a
continuation that immediately resumes the whole process.

+      else
+       remote_notice_new_inferior (ptid, 0);

and with this, if in non-stop, and the event ends up pending
on the infrun side, "info threads" will be confused for that
thread too.

So the fix requires a bit of plumbing to pass down the correct
external and internal thread states.  Something like the patch
below.

WDYT?  Does it fix the timeout you observe?

(We should really come up with better, less confusing names for the
"running" vs "executing" distinction.  Maybe external/internal running
state, or user/internal, or public/private...)

--------------
[PATCH] PR remote/19496, timeout in forking-threads-plus-bkpt

This patch addresses a failure in
gdb.threads/forking-threads-plus-breakpoint.exp:

FAIL: gdb.threads/forking-threads-plus-breakpoint.exp: cond_bp_target=1:
detach_on_fork=on: inferior 1 exited (timeout)

Cause:

A fork event was reported to GDB before GDB knew about the parent
thread, followed immediately by a breakpoint event in a different
thread.  The parent thread was subsequently added via
remote_notice_new_inferior in process_stop_reply, but when the thread
was added the thread_info.state was set to THREAD_STOPPED.  The fork
event was then handled correctly, but when the fork parent was resumed
via a call to keep_going, the state was unchanged.

The breakpoint event was then handled, which caused all the
non-breakpoint threads to be stopped.  When the breakpoint thread was
resumed, all the non-breakpoint threads were resumed via
infrun.c:restart_threads.  Our old fork parent wasn't restarted,
because it still had thread_info.state set to THREAD_STOPPED.
Ultimately the program under debug hung waiting for a pthread_join
while the old fork parent was stopped forever by GDB.

Fix:

Since this is non-stop, then the bug is that the thread should have
been added in THREAD_RUNNING state.  Consider that infrun may be
pulling target events out of the target_ops backend into its own event
queue, but, not process them immediately.  E.g., infrun may be
stopping all threads temporarily for a step-over-breakpoint operation
for thread A (stop_all_threads).  The waitstatus of all threads is
thus left pending in the thread structure (save_status), including the
fork event of thread B.  Right at this point, if the user does "info
threads", that should show thread B (the fork parent) running, not
stopped, even if internally, gdb is holding it paused for a little
bit.

Thus if in non-stop mode, always add new threads in the external
user-visible THREAD_RUNNING state.  Change remote_notice_new_inferior
to accept the internal executing state of the thread instead, with
EXECUTING set to 1 when we discover a thread that is running on the
target (such as through remote_update_thread_list), and 0 when the
thread is really paused (such as when we see a stop reply).

Tested on x86_64 Linux and Nios II Linux target with x86 Linux host.

gdb/ChangeLog:
2016-03-15  Pedro Alves  <palves@redhat.com>
	    Don Breazeal  <donb@codesourcery.com>

	* infcmd.c (notice_new_inferior): Use the 'leave_running' argument
	instead of checking the 'non_stop' global.

	* remote.c (remote_add_thread): New parameter 'executing'.  Use it
	to set the new thread's executing state.
	(remote_notice_new_inferior): Rename parameter 'running' to
	'executing'.  Always set the thread state to THREAD_RUNNING in
	non-stop mode, and to THREAD_STOPPED in all-stop mode.  Pass
	EXECUTING to remote_add_thread and notice_new_inferior.
	(remote_update_thread_list): Update to pass executing state, not
	running state.
---
 gdb/infcmd.c |  7 +------
 gdb/remote.c | 30 ++++++++++++++++++------------
 2 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/gdb/infcmd.c b/gdb/infcmd.c
index a80bf0a..d687116 100644
--- a/gdb/infcmd.c
+++ b/gdb/infcmd.c
@@ -2903,11 +2903,7 @@ notice_new_inferior (ptid_t ptid, int leave_running, int from_tty)
 
   old_chain = make_cleanup (null_cleanup, NULL);
 
-  /* If in non-stop, leave threads as running as they were.  If
-     they're stopped for some reason other than us telling it to, the
-     target reports a signal != GDB_SIGNAL_0.  We don't try to
-     resume threads with such a stop signal.  */
-  mode = non_stop ? ATTACH_POST_WAIT_RESUME : ATTACH_POST_WAIT_NOTHING;
+  mode = leave_running ? ATTACH_POST_WAIT_RESUME : ATTACH_POST_WAIT_NOTHING;
 
   if (!ptid_equal (inferior_ptid, null_ptid))
     make_cleanup_restore_current_thread ();
@@ -2943,7 +2939,6 @@ notice_new_inferior (ptid_t ptid, int leave_running, int from_tty)
       return;
     }
 
-  mode = leave_running ? ATTACH_POST_WAIT_RESUME : ATTACH_POST_WAIT_NOTHING;
   attach_post_wait ("" /* args */, from_tty, mode);
 
   do_cleanups (old_chain);
diff --git a/gdb/remote.c b/gdb/remote.c
index f09a06e..af0a08a 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -1801,7 +1801,7 @@ remote_add_inferior (int fake_pid_p, int pid, int attached,
    according to RUNNING.  */
 
 static void
-remote_add_thread (ptid_t ptid, int running)
+remote_add_thread (ptid_t ptid, int running, int executing)
 {
   struct remote_state *rs = get_remote_state ();
 
@@ -1816,7 +1816,7 @@ remote_add_thread (ptid_t ptid, int running)
   else
     add_thread (ptid);
 
-  set_executing (ptid, running);
+  set_executing (ptid, executing);
   set_running (ptid, running);
 }
 
@@ -1824,11 +1824,17 @@ remote_add_thread (ptid_t ptid, int running)
    It may be the first time we hear about such thread, so take the
    opportunity to add it to GDB's thread list.  In case this is the
    first time we're noticing its corresponding inferior, add it to
-   GDB's inferior list as well.  */
+   GDB's inferior list as well.  EXECUTING indicates whether the
+   thread is (internally) executing or stopped.  */
 
 static void
-remote_notice_new_inferior (ptid_t currthread, int running)
+remote_notice_new_inferior (ptid_t currthread, int executing)
 {
+  /* In non-stop mode, we assume new found threads are (externally)
+     running until proven otherwise with a stop reply.  In all-stop,
+     we can only get here if all threads are stopped.  */
+  int running = target_is_non_stop_p () ? 1 : 0;
+
   /* If this is a new thread, add it to GDB's thread list.
      If we leave it up to WFI to do this, bad things will happen.  */
 
@@ -1836,7 +1842,7 @@ remote_notice_new_inferior (ptid_t currthread, int running)
     {
       /* We're seeing an event on a thread id we knew had exited.
 	 This has to be a new thread reusing the old id.  Add it.  */
-      remote_add_thread (currthread, running);
+      remote_add_thread (currthread, running, executing);
       return;
     }
 
@@ -1857,7 +1863,7 @@ remote_notice_new_inferior (ptid_t currthread, int running)
 	    thread_change_ptid (inferior_ptid, currthread);
 	  else
 	    {
-	      remote_add_thread (currthread, running);
+	      remote_add_thread (currthread, running, executing);
 	      inferior_ptid = currthread;
 	    }
 	  return;
@@ -1888,7 +1894,7 @@ remote_notice_new_inferior (ptid_t currthread, int running)
 	}
 
       /* This is really a new thread.  Add it.  */
-      remote_add_thread (currthread, running);
+      remote_add_thread (currthread, running, executing);
 
       /* If we found a new inferior, let the common code do whatever
 	 it needs to with it (e.g., read shared libraries, insert
@@ -1899,7 +1905,7 @@ remote_notice_new_inferior (ptid_t currthread, int running)
 	  struct remote_state *rs = get_remote_state ();
 
 	  if (!rs->starting_up)
-	    notice_new_inferior (currthread, running, 0);
+	    notice_new_inferior (currthread, executing, 0);
 	}
     }
 }
@@ -3259,12 +3265,12 @@ remote_update_thread_list (struct target_ops *ops)
 	    {
 	      struct private_thread_info *info;
 	      /* In non-stop mode, we assume new found threads are
-		 running until proven otherwise with a stop reply.  In
-		 all-stop, we can only get here if all threads are
+		 executing until proven otherwise with a stop reply.
+		 In all-stop, we can only get here if all threads are
 		 stopped.  */
-	      int running = target_is_non_stop_p () ? 1 : 0;
+	      int executing = target_is_non_stop_p () ? 1 : 0;
 
-	      remote_notice_new_inferior (item->ptid, running);
+	      remote_notice_new_inferior (item->ptid, executing);
 
 	      info = demand_private_info (item->ptid);
 	      info->core = item->core;