[2/3] Initialize last_resume_kind for remote fork child

Message ID 1432320931-1550-3-git-send-email-donb@codesourcery.com
State New, archived
Headers

Commit Message

Don Breazeal May 22, 2015, 6:55 p.m. UTC
  This patch fixes some intermittent test failures in
gdb.base/foll-vfork.exp where a vfork child could sometimes be
(incorrectly) resumed when handling the vfork event.  In this
case the result was a subsequent event reported to the client
side as a SIGTRAP delivered to the as-yet-unknown child thread.

The fix is to initialize the threadinfo.last_resume_kind field
for new fork children in gdbserver/linux-low.c:handle_extended_wait.
This prevents the event handler from incorrectly resuming the child
if the stop event that is generated when it starts is reported
before the vfork event that initiated it.  The same thing is done
for the native case in linux-nat.c:linux_child_follow_fork.

In addition, it seemed prudent to initialize lwp_info.status_pending_p
for the new fork child.  I also rearranged the initialization code
so that all of the lwp_info initialization was together, rather than 
intermixed with thread_info and process_info initialization.

Tested native, native-gdbserver, native-extended-gdbserver on
x86_64 GNU/Linux.

OK?

Thanks
--Don

gdb/gdbserver/
2015-05-22  Don Breazeal  <donb@codesourcery.com>

	* linux-low.c (handle_extended_wait):

---
 gdb/gdbserver/linux-low.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
  

Comments

Pedro Alves May 23, 2015, 12:05 p.m. UTC | #1
On 05/22/2015 07:55 PM, Don Breazeal wrote:
> This patch fixes some intermittent test failures in
> gdb.base/foll-vfork.exp where a vfork child could sometimes be
> (incorrectly) resumed when handling the vfork event.  In this
> case the result was a subsequent event reported to the client
> side as a SIGTRAP delivered to the as-yet-unknown child thread.
> 
> The fix is to initialize the threadinfo.last_resume_kind field
> for new fork children in gdbserver/linux-low.c:handle_extended_wait.
> This prevents the event handler from incorrectly resuming the child
> if the stop event that is generated when it starts is reported
> before the vfork event that initiated it.  

I'm not seeing how what comes first makes a difference, actually.

> The same thing is done
> for the native case in linux-nat.c:linux_child_follow_fork.

The code that resumes the child by mistake in question must be
resume_stopped_resumed_lwps, and that is called similarly by
both backends after pulling events out of the kernel, and right
after linux_*_filter_event (which is what calls into handle_extended_wait):

      /* Now that we've pulled all events out of the kernel, resume
	 LWPs that don't have an interesting event to report.  */
      iterate_over_lwps (minus_one_ptid,
			 resume_stopped_resumed_lwps, &minus_one_ptid);

...

     /* Now that we've pulled all events out of the kernel, resume
	 LWPs that don't have an interesting event to report.  */
      if (stopping_threads == NOT_STOPPING_THREADS)
	for_each_inferior (&all_threads, resume_stopped_resumed_lwps);

That happens before linux_child_follow_fork is reached.

The first difference is that gdb's version does not add the fork
child to the lwp list in handle_extended_wait.  But even if it did,
the way linux-nat.c tags lwps as "not stopped-resumed" is different
in gdb and gdbserver.  gdb's has:

static int
resume_stopped_resumed_lwps (struct lwp_info *lp, void *data)
{
  ptid_t *wait_ptid_p = data;

  if (lp->stopped
      && lp->resumed
      && !lwp_status_pending_p (lp))
    {

while gdbserver's:

static void
resume_stopped_resumed_lwps (struct inferior_list_entry *entry)
{
  struct thread_info *thread = (struct thread_info *) entry;
  struct lwp_info *lp = get_thread_lwp (thread);

  if (lp->stopped
      && !lp->status_pending_p
      && thread->last_resume_kind != resume_stop
      && thread->last_status.kind == TARGET_WAITKIND_IGNORE)
    {

So the main difference here is that linux-nat.c checks that "resumed" flag
which doesn't exist in the gdbserver version.  Checking lwp->resumed in
gdb is equivalent to checking:

      (thread->last_resume_kind != resume_stop
      && thread->last_status.kind == TARGET_WAITKIND_IGNORE)

in gdbserver/linux-low.c.  gdbserver/linux-low.c's add_thread function
creates threads with last_resume_kind == resume_continue by default.
So the fix is to make sure to override that to resume_stop.  gdb's
linux_child_follow_fork version does the equivalent, but that's not
what exactly prevents a spurious resume (it's the fact that add_lwp
creates lwps with the 'resume' flag cleared).

> 
> In addition, it seemed prudent to initialize lwp_info.status_pending_p
> for the new fork child.  

That's fine.  I think the child can well report a signal other than
SIGSTOP first (or even be killed/disappear), in which case we should
leave it pending (thus set status_pending_p), but that's a separate,
preexisting issue.

> I also rearranged the initialization code
> so that all of the lwp_info initialization was together, rather than 
> intermixed with thread_info and process_info initialization.
> 
> Tested native, native-gdbserver, native-extended-gdbserver on
> x86_64 GNU/Linux.
> 
> OK?

So the fix is OK, though I think the commit log could be
clarified.

> 
> Thanks
> --Don
> 
> gdb/gdbserver/
> 2015-05-22  Don Breazeal  <donb@codesourcery.com>
> 
> 	* linux-low.c (handle_extended_wait):

Seems incomplete.  :-)

> 
> ---
>  gdb/gdbserver/linux-low.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
> index 9f3ea48..d763c66 100644
> --- a/gdb/gdbserver/linux-low.c
> +++ b/gdb/gdbserver/linux-low.c
> @@ -457,6 +457,7 @@ handle_extended_wait (struct lwp_info *event_lwp, int wstat)
>  	  struct process_info *parent_proc;
>  	  struct process_info *child_proc;
>  	  struct lwp_info *child_lwp;
> +	  struct thread_info *child_thr;
>  	  struct target_desc *tdesc;
>  
>  	  ptid = ptid_build (new_pid, new_pid, 0);
> @@ -479,6 +480,10 @@ handle_extended_wait (struct lwp_info *event_lwp, int wstat)
>  	  child_lwp = add_lwp (ptid);
>  	  gdb_assert (child_lwp != NULL);
>  	  child_lwp->stopped = 1;
> +	  child_lwp->must_set_ptrace_flags = 1;
> +	  child_lwp->status_pending_p = 0;
> +	  child_thr = get_lwp_thread (child_lwp);
> +	  child_thr->last_resume_kind = resume_stop;
>  	  parent_proc = get_thread_process (event_thr);
>  	  child_proc->attached = parent_proc->attached;
>  	  clone_all_breakpoints (&child_proc->breakpoints,
> @@ -488,7 +493,6 @@ handle_extended_wait (struct lwp_info *event_lwp, int wstat)
>  	  tdesc = xmalloc (sizeof (struct target_desc));
>  	  copy_target_description (tdesc, parent_proc->tdesc);
>  	  child_proc->tdesc = tdesc;
> -	  child_lwp->must_set_ptrace_flags = 1;
>  
>  	  /* Clone arch-specific process data.  */
>  	  if (the_low_target.new_fork != NULL)
> 

Thanks,
Pedro Alves
  

Patch

diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index 9f3ea48..d763c66 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -457,6 +457,7 @@  handle_extended_wait (struct lwp_info *event_lwp, int wstat)
 	  struct process_info *parent_proc;
 	  struct process_info *child_proc;
 	  struct lwp_info *child_lwp;
+	  struct thread_info *child_thr;
 	  struct target_desc *tdesc;
 
 	  ptid = ptid_build (new_pid, new_pid, 0);
@@ -479,6 +480,10 @@  handle_extended_wait (struct lwp_info *event_lwp, int wstat)
 	  child_lwp = add_lwp (ptid);
 	  gdb_assert (child_lwp != NULL);
 	  child_lwp->stopped = 1;
+	  child_lwp->must_set_ptrace_flags = 1;
+	  child_lwp->status_pending_p = 0;
+	  child_thr = get_lwp_thread (child_lwp);
+	  child_thr->last_resume_kind = resume_stop;
 	  parent_proc = get_thread_process (event_thr);
 	  child_proc->attached = parent_proc->attached;
 	  clone_all_breakpoints (&child_proc->breakpoints,
@@ -488,7 +493,6 @@  handle_extended_wait (struct lwp_info *event_lwp, int wstat)
 	  tdesc = xmalloc (sizeof (struct target_desc));
 	  copy_target_description (tdesc, parent_proc->tdesc);
 	  child_proc->tdesc = tdesc;
-	  child_lwp->must_set_ptrace_flags = 1;
 
 	  /* Clone arch-specific process data.  */
 	  if (the_low_target.new_fork != NULL)