[3/3] PR remote/19496, timeout in forking-threads-plus-bkpt

Message ID 1453942111-1215-4-git-send-email-donb@codesourcery.com
State New, archived
Headers

Commit Message

Don Breazeal Jan. 28, 2016, 12:48 a.m. UTC
  This patch addresses a failure in
gdb.threads/forking-threads-plus-breakpoint.exp:

FAIL: gdb.threads/forking-threads-plus-breakpoint.exp: cond_bp_target=1:
detach_on_fork=on: inferior 1 exited (timeout)

Cause:
A fork event was reported to GDB before GDB knew about the parent thread,
followed immediately by a breakpoint event in a different thread.  The
parent thread was subsequently added via remote_notice_new_inferior in
process_stop_reply, but when the thread was added the thread_info.state
was set to THREAD_STOPPED.  The fork event was then handled correctly,
but when the fork parent was resumed via a call to keep_going, the state
was unchanged.

The breakpoint event was then handled, which caused all the non-breakpoint
threads to be stopped.  When the breakpoint thread was resumed, all the
non-breakpoint threads were resumed via infrun.c:restart_threads.  Our old
fork parent wasn't restarted, because it still had thread_info.state set to
THREAD_STOPPED.  Ultimately the program under debug hung waiting for a
pthread_join while the old fork parent was stopped forever by GDB.

Fix:
Add a call to set_running just before resuming the fork parent in
infrun.c:handle_inferior_event_1.  The call could be put elsewhere
(e.g. in keep_going) but this approach makes the change for the case
of resuming after a fork event, without potential side-effects for
other cases where keep_going is called.

Tested on x86_64 Linux and Nios II Linux target with x86 Linux host.

Thanks,
--Don

gdb/ChangeLog:
2016-01-27  Don Breazeal  <donb@codesourcery.com>

	PR remote/19496
	* infrun.c (handle_inferior_event_1): Call set_running before
	resuming.

---
 gdb/infrun.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
  

Comments

Pedro Alves Feb. 1, 2016, 12:05 p.m. UTC | #1
On 01/28/2016 12:48 AM, Don Breazeal wrote:

> 
> ---
>  gdb/infrun.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index 15210c9..e324864 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -5232,7 +5232,13 @@ Cannot fill $_exitsignal with the correct signal number.\n"));
>  	  ecs->ptid = inferior_ptid;
>  
>  	  if (should_resume)
> -	    keep_going (ecs);
> +	    {
> +	      /* Make sure the thread is marked as running.  If the event
> +		 occurred in the thread before we added it, it may have
> +		 never been marked running.  */
> +	      set_running (inferior_ptid, 1);
> +	      keep_going (ecs);
> +	    }
>  	  else
>  	    stop_waiting (ecs);
>  	  return;
> 

Can you check whether this is still needed in current master?

Specifically, whether a2077e254098 ([PATCH] Fix PR 19461: strange "info thread"
behavior in non-stop) already fixed this?

See:
 https://sourceware.org/ml/gdb-patches/2016-01/msg00446.html
 https://sourceware.org/bugzilla/show_bug.cgi?id=19461

Thanks,
Pedro Alves
  
Don Breazeal Feb. 1, 2016, 7:29 p.m. UTC | #2
On 2/1/2016 4:05 AM, Pedro Alves wrote:
> On 01/28/2016 12:48 AM, Don Breazeal wrote:
> 
>>
>> ---
>>  gdb/infrun.c | 8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/gdb/infrun.c b/gdb/infrun.c
>> index 15210c9..e324864 100644
>> --- a/gdb/infrun.c
>> +++ b/gdb/infrun.c
>> @@ -5232,7 +5232,13 @@ Cannot fill $_exitsignal with the correct signal number.\n"));
>>  	  ecs->ptid = inferior_ptid;
>>  
>>  	  if (should_resume)
>> -	    keep_going (ecs);
>> +	    {
>> +	      /* Make sure the thread is marked as running.  If the event
>> +		 occurred in the thread before we added it, it may have
>> +		 never been marked running.  */
>> +	      set_running (inferior_ptid, 1);
>> +	      keep_going (ecs);
>> +	    }
>>  	  else
>>  	    stop_waiting (ecs);
>>  	  return;
>>
> 
> Can you check whether this is still needed in current master?
> 
> Specifically, whether a2077e254098 ([PATCH] Fix PR 19461: strange "info thread"
> behavior in non-stop) already fixed this?
> 
> See:
>  https://sourceware.org/ml/gdb-patches/2016-01/msg00446.html
>  https://sourceware.org/bugzilla/show_bug.cgi?id=19461
> 
> Thanks,
> Pedro Alves
> 
Hi Pedro,
It looks like my patch is still needed.  FYI, the values of the
variables used to determine whether to call set_running in your patch
are this, in my failing test case:

follow_child 0
detach_fork 1
non_stop 1
sched_multi 0
target_is_non_stop_p () 1

Thanks,
--Don
  
Pedro Alves Feb. 1, 2016, 8:09 p.m. UTC | #3
On 02/01/2016 07:29 PM, Don Breazeal wrote:
> On 2/1/2016 4:05 AM, Pedro Alves wrote:

> Hi Pedro,
> It looks like my patch is still needed.  FYI, the values of the
> variables used to determine whether to call set_running in your patch
> are this, in my failing test case:
> 
> follow_child 0
> detach_fork 1
> non_stop 1
> sched_multi 0
> target_is_non_stop_p () 1

Hmm.

> A fork event was reported to GDB before GDB knew about the parent thread,
> followed immediately by a breakpoint event in a different thread.  The
> parent thread was subsequently added via remote_notice_new_inferior in
> process_stop_reply, but when the thread was added the thread_info.state
> was set to THREAD_STOPPED.  The fork event was then handled correctly,
> but when the fork parent was resumed via a call to keep_going, the state
> was unchanged.

Since this is non-stop, then it sounds to me like the bug is that the
thread should have been added in THREAD_RUNNING state.

Consider that infrun may be pulling target events out of the target_ops
backend into its own event queue, but, not process them immediately.

E.g., infrun may be stopping all threads temporarily for a step-over-breakpoint
operation for thread A (stop_all_threads).  The waitstatus of all threads
is thus left pending in the thread structure (save_status), including the
fork event of thread B.  Right at this point, if the user
does "info threads", that should show thread B (the fork parent) running,
not stopped, even if internally, gdb is holding it paused for a little bit.

Thanks,
Pedro Alves
  

Patch

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 15210c9..e324864 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -5232,7 +5232,13 @@  Cannot fill $_exitsignal with the correct signal number.\n"));
 	  ecs->ptid = inferior_ptid;
 
 	  if (should_resume)
-	    keep_going (ecs);
+	    {
+	      /* Make sure the thread is marked as running.  If the event
+		 occurred in the thread before we added it, it may have
+		 never been marked running.  */
+	      set_running (inferior_ptid, 1);
+	      keep_going (ecs);
+	    }
 	  else
 	    stop_waiting (ecs);
 	  return;