[5/9] fbsd-nat: Defer any ineligible events reported by wait.

Message ID 20230228181845.99936-6-jhb@FreeBSD.org
State New
Headers
Series Fixes for multiprocess for FreeBSD's native target |

Commit Message

John Baldwin Feb. 28, 2023, 6:18 p.m. UTC
  If wait_1 finds an event for a thread or process that does not match
the set of threads and processes previously resumed, defer the event.
If the event is for a specific thread, suspend the thread and continue
the associated process before waiting for another event.

One specific example of such an event is if a thread is created while
another thread in the same process hits a breakpoint.  If the second
thread's event is reported first, the target resume method does not
yet "know" about the new thread and will not suspend it via
PT_SUSPEND.  When wait is called, it will probably return the event
from the first thread before the result of the step from second
thread.  This is the case reported in PR 21497.
---
 gdb/fbsd-nat.c | 34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)
  

Comments

Simon Marchi March 20, 2023, 7:09 p.m. UTC | #1
On 2/28/23 13:18, John Baldwin wrote:
> If wait_1 finds an event for a thread or process that does not match
> the set of threads and processes previously resumed, defer the event.
> If the event is for a specific thread, suspend the thread and continue
> the associated process before waiting for another event.

I do not understand this, because it doesn't match my mental model of
how things should work (which is derived from working with linux-nat).
In that model, a thread with a pending event should never be
ptrace-resumed.  So the last sentence doesn't make sense, it implies
that a thread has a pending and is ptrace-resumed (since we suspend it).

> One specific example of such an event is if a thread is created while
> another thread in the same process hits a breakpoint.  If the second
> thread's event is reported first, the target resume method does not
> yet "know" about the new thread and will not suspend it via
> PT_SUSPEND.

I was surprised to read that the resume method suspends some threads.  I
don't think it should, but I do see the current code.  The current code
appears to interpret the resume request as: "ensure the state of thread
resumption looks exactly like this", so it suspends threads as needed,
to match the requested resumption state.  I don't think it's supposed to
work like that.  The resume method should only resume some threads, and
if the core of GDB wants some threads stopped, it will call the stop
method.

Anyhow, just to be sure I understand the problem you describe: when
fbsd-nat reports the breakpoint hit from "another thread", is the
"thread created" event still in the kernel, not consumed by GDB?

> When wait is called, it will probably return the event
> from the first thread before the result of the step from second
> thread.  This is the case reported in PR 21497.

> ---
>  gdb/fbsd-nat.c | 34 +++++++++++++++++++++++++++++++++-
>  1 file changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/gdb/fbsd-nat.c b/gdb/fbsd-nat.c
> index ca278b871ef..3f7278c6ea0 100644
> --- a/gdb/fbsd-nat.c
> +++ b/gdb/fbsd-nat.c
> @@ -1514,7 +1514,39 @@ fbsd_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
>    if (is_async_p ())
>      async_file_flush ();
>  
> -  wptid = wait_1 (ptid, ourstatus, target_options);
> +  while (1)
> +    {
> +      wptid = wait_1 (ptid, ourstatus, target_options);
> +
> +      /* If no event was found, just return.  */
> +      if (ourstatus->kind () == TARGET_WAITKIND_IGNORE
> +	  || ourstatus->kind () == TARGET_WAITKIND_NO_RESUMED)
> +	break;
> +
> +      /* If an event is reported for a thread or process while
> +	 stepping some other thread, suspend the thread reporting the
> +	 event and defer the event until it can be reported to the
> +	 core.  */
> +      if (!wptid.matches (resume_ptid))
> +	{
> +	  add_pending_event (wptid, *ourstatus);
> +	  fbsd_nat_debug_printf ("deferring event [%s], [%s]",
> +				 target_pid_to_str (wptid).c_str (),
> +				 ourstatus->to_string ().c_str ());
> +	  if (wptid.pid () == resume_ptid.pid ())
> +	    {
> +	      fbsd_nat_debug_printf ("suspending thread [%s]",
> +				     target_pid_to_str (wptid).c_str ());
> +	      if (ptrace (PT_SUSPEND, wptid.lwp (), NULL, 0) == -1)
> +		perror_with_name (("ptrace (PT_SUSPEND)"));


This is the part I don't understand.  After wait_1 returned an event for
a specific thread, I would expect this thread to be ptrace-stopped.  So,
I don't see the need to suspend it.  You'd just leave it in the "resumed
from the
until.

> +	      if (ptrace (PT_CONTINUE, wptid.pid (), (caddr_t) 1, 0) == -1)
> +		perror_with_name (("ptrace (PT_CONTINUE)"));

I don't get why you continue `wptid.pid ()`.  What assures you that this
particular thread was stopped previously?  Perhaps I don't understand
because I assume somethings about pids/lwps and ptrace that are only
true on Linux.

Simon
  
John Baldwin March 27, 2023, 8:49 p.m. UTC | #2
On 3/20/23 12:09 PM, Simon Marchi wrote:
> On 2/28/23 13:18, John Baldwin wrote:
>> If wait_1 finds an event for a thread or process that does not match
>> the set of threads and processes previously resumed, defer the event.
>> If the event is for a specific thread, suspend the thread and continue
>> the associated process before waiting for another event.
> 
> I do not understand this, because it doesn't match my mental model of
> how things should work (which is derived from working with linux-nat).
> In that model, a thread with a pending event should never be
> ptrace-resumed.  So the last sentence doesn't make sense, it implies
> that a thread has a pending and is ptrace-resumed (since we suspend it).

The difference here might be that on FreeBSD ptrace can only PT_CONTINUE
and wait() for entire processes.  When a thread contains multiple threads,
PT_CONTINUE resumes all threads belonging to that process.  Individual
threads can be controlled by using PT_SUSPEND on a specific LWP.  Those
LWPs will stay asleep after PT_CONTINUE.  If any thread encounters a
stop event (breakpoint, fork event, etc.) the entire process stops (the
kernel signals threads in userland if needed to get them back into the
kernel to stop).  Once it is stopped it reports an event via wait().

Currently on FreeBSD wait() only reports a single event at once.  So
if multiple threads hit a breakpoint, wait() returns for the first one,
PT_LWPINFO is used on the pid to find out what specific event happened
(and which LWP it happened for).  The other thread will keep its event
pending in the kernel, and after PT_CONTINUE it will immediately trigger
another stop and an event reported from wait().

>> One specific example of such an event is if a thread is created while
>> another thread in the same process hits a breakpoint.  If the second
>> thread's event is reported first, the target resume method does not
>> yet "know" about the new thread and will not suspend it via
>> PT_SUSPEND.
> 
> I was surprised to read that the resume method suspends some threads.  I
> don't think it should, but I do see the current code.  The current code
> appears to interpret the resume request as: "ensure the state of thread
> resumption looks exactly like this", so it suspends threads as needed,
> to match the requested resumption state.  I don't think it's supposed to
> work like that.  The resume method should only resume some threads, and
> if the core of GDB wants some threads stopped, it will call the stop
> method.

See above.  I can't individually suspend/resume threads.  I can only
resume processes, but decide which set of threads within a process
to keep resumed when resuming the entire process.

> Anyhow, just to be sure I understand the problem you describe: when
> fbsd-nat reports the breakpoint hit from "another thread", is the
> "thread created" event still in the kernel, not consumed by GDB?

Yes.  What happens is that GDB tries to single-step a thread over a
breakpoint, so it does a resume() of a single ptid_t.  To implement
this, fbsd-nat's resume has to explicitly PT_SUSPEND all the other
LWPs in the process before doing a PT_CONTINUE so that only the
stepped thread will execute.  However, this logic doesn't yet know
about the new thread (and it hasn't started executing in userland
yet).  However, once the new thread starts executing, it immediately
reports a ptrace event.  This confuses GDB currently becuase that
ptrace event can be reported right after the PT_CONTINUE.  I think
in part this is because unlike on Linux, when thread A creates thread
B, thread A doesn't report anything.  Instead, only thread B reports
an event when it starts.  On Linux thread creation is more like fork
where thread A reports an event saying it created B.
  
>> When wait is called, it will probably return the event
>> from the first thread before the result of the step from second
>> thread.  This is the case reported in PR 21497.
> 
>> ---
>>   gdb/fbsd-nat.c | 34 +++++++++++++++++++++++++++++++++-
>>   1 file changed, 33 insertions(+), 1 deletion(-)
>>
>> diff --git a/gdb/fbsd-nat.c b/gdb/fbsd-nat.c
>> index ca278b871ef..3f7278c6ea0 100644
>> --- a/gdb/fbsd-nat.c
>> +++ b/gdb/fbsd-nat.c
>> @@ -1514,7 +1514,39 @@ fbsd_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
>>     if (is_async_p ())
>>       async_file_flush ();
>>   
>> -  wptid = wait_1 (ptid, ourstatus, target_options);
>> +  while (1)
>> +    {
>> +      wptid = wait_1 (ptid, ourstatus, target_options);
>> +
>> +      /* If no event was found, just return.  */
>> +      if (ourstatus->kind () == TARGET_WAITKIND_IGNORE
>> +	  || ourstatus->kind () == TARGET_WAITKIND_NO_RESUMED)
>> +	break;
>> +
>> +      /* If an event is reported for a thread or process while
>> +	 stepping some other thread, suspend the thread reporting the
>> +	 event and defer the event until it can be reported to the
>> +	 core.  */
>> +      if (!wptid.matches (resume_ptid))
>> +	{
>> +	  add_pending_event (wptid, *ourstatus);
>> +	  fbsd_nat_debug_printf ("deferring event [%s], [%s]",
>> +				 target_pid_to_str (wptid).c_str (),
>> +				 ourstatus->to_string ().c_str ());
>> +	  if (wptid.pid () == resume_ptid.pid ())
>> +	    {
>> +	      fbsd_nat_debug_printf ("suspending thread [%s]",
>> +				     target_pid_to_str (wptid).c_str ());
>> +	      if (ptrace (PT_SUSPEND, wptid.lwp (), NULL, 0) == -1)
>> +		perror_with_name (("ptrace (PT_SUSPEND)"));
> 
> 
> This is the part I don't understand.  After wait_1 returned an event for
> a specific thread, I would expect this thread to be ptrace-stopped.  So,
> I don't see the need to suspend it.  You'd just leave it in the "resumed
> from the
> until.
> 
>> +	      if (ptrace (PT_CONTINUE, wptid.pid (), (caddr_t) 1, 0) == -1)
>> +		perror_with_name (("ptrace (PT_CONTINUE)"));
> 
> I don't get why you continue `wptid.pid ()`.  What assures you that this
> particular thread was stopped previously?  Perhaps I don't understand
> because I assume somethings about pids/lwps and ptrace that are only
> true on Linux.

In this specific case, here is the sequence of events:

- process P1 contains two threads, T1, and T2
- T1 creates a new thread T3, but T3 isn't yet scheduled to execute
- T2 hits a breakpoint so the process stops
- wait() returns P1
- PT_LWPINFO reports the breakpoint event for T2
- GDB undoes the breakpoint and tries to single-step T2 over the
   breakpoint
- GDB calls resume with step=1 and scope_ptid=ptid(P1, T2)
- fbsd_nat::resume iterates over the threads it knows about, but
   it only knows about T1 and T2
   - PT_SUSPEND T1
   - PT_RESUME T2
- the P1 process is now resumed via PT_CONTINUE
- T3 starts executing and reports its "born" event
- wait() returns P1
- PT_LWPINFO reports the thread birth event for T3

Previously at this point fbsd_nat::wait() returned an event for T3
which triggers the assertion failure in the PR.  With this patch
what happens now is:

- fbsd_nat::wait() realizes T3 isn't "resumed" from the core's
   perspective so explicitly suspends it via PT_SUSPEND
- the P1 process is continued via PT_CONTINUE
- T2 completes its step and reports the event
- wait() returns P1
- PT_LWPINFO reports the step-complete event for P2
- the event for T2 is returned from fbsd_nat::wait() to the core
  
Simon Marchi March 27, 2023, 9:04 p.m. UTC | #3
On 3/27/23 16:49, John Baldwin wrote:
> On 3/20/23 12:09 PM, Simon Marchi wrote:
>> On 2/28/23 13:18, John Baldwin wrote:
>>> If wait_1 finds an event for a thread or process that does not match
>>> the set of threads and processes previously resumed, defer the event.
>>> If the event is for a specific thread, suspend the thread and continue
>>> the associated process before waiting for another event.
>>
>> I do not understand this, because it doesn't match my mental model of
>> how things should work (which is derived from working with linux-nat).
>> In that model, a thread with a pending event should never be
>> ptrace-resumed.  So the last sentence doesn't make sense, it implies
>> that a thread has a pending and is ptrace-resumed (since we suspend it).
> 
> The difference here might be that on FreeBSD ptrace can only PT_CONTINUE
> and wait() for entire processes.  When a thread contains multiple threads,
> PT_CONTINUE resumes all threads belonging to that process.  Individual
> threads can be controlled by using PT_SUSPEND on a specific LWP.  Those
> LWPs will stay asleep after PT_CONTINUE.  If any thread encounters a
> stop event (breakpoint, fork event, etc.) the entire process stops (the
> kernel signals threads in userland if needed to get them back into the
> kernel to stop).  Once it is stopped it reports an event via wait().
> 
> Currently on FreeBSD wait() only reports a single event at once.  So
> if multiple threads hit a breakpoint, wait() returns for the first one,
> PT_LWPINFO is used on the pid to find out what specific event happened
> (and which LWP it happened for).  The other thread will keep its event
> pending in the kernel, and after PT_CONTINUE it will immediately trigger
> another stop and an event reported from wait().

Thanks for the explanation.  I'll keep that in mind when re-reading
(perhaps a v2, if you want to rebase and submit a fresh version).

>>> One specific example of such an event is if a thread is created while
>>> another thread in the same process hits a breakpoint.  If the second
>>> thread's event is reported first, the target resume method does not
>>> yet "know" about the new thread and will not suspend it via
>>> PT_SUSPEND.
>>
>> I was surprised to read that the resume method suspends some threads.  I
>> don't think it should, but I do see the current code.  The current code
>> appears to interpret the resume request as: "ensure the state of thread
>> resumption looks exactly like this", so it suspends threads as needed,
>> to match the requested resumption state.  I don't think it's supposed to
>> work like that.  The resume method should only resume some threads, and
>> if the core of GDB wants some threads stopped, it will call the stop
>> method.
> 
> See above.  I can't individually suspend/resume threads.  I can only
> resume processes, but decide which set of threads within a process
> to keep resumed when resuming the entire process.

Ack.

> 
>> Anyhow, just to be sure I understand the problem you describe: when
>> fbsd-nat reports the breakpoint hit from "another thread", is the
>> "thread created" event still in the kernel, not consumed by GDB?
> 
> Yes.  What happens is that GDB tries to single-step a thread over a
> breakpoint, so it does a resume() of a single ptid_t.  To implement
> this, fbsd-nat's resume has to explicitly PT_SUSPEND all the other
> LWPs in the process before doing a PT_CONTINUE so that only the
> stepped thread will execute.  However, this logic doesn't yet know
> about the new thread (and it hasn't started executing in userland
> yet).  However, once the new thread starts executing, it immediately
> reports a ptrace event.  This confuses GDB currently becuase that
> ptrace event can be reported right after the PT_CONTINUE.  I think
> in part this is because unlike on Linux, when thread A creates thread
> B, thread A doesn't report anything.  Instead, only thread B reports
> an event when it starts.  On Linux thread creation is more like fork
> where thread A reports an event saying it created B.

Ack.

>  
>>> When wait is called, it will probably return the event
>>> from the first thread before the result of the step from second
>>> thread.  This is the case reported in PR 21497.
>>
>>> ---
>>>   gdb/fbsd-nat.c | 34 +++++++++++++++++++++++++++++++++-
>>>   1 file changed, 33 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/gdb/fbsd-nat.c b/gdb/fbsd-nat.c
>>> index ca278b871ef..3f7278c6ea0 100644
>>> --- a/gdb/fbsd-nat.c
>>> +++ b/gdb/fbsd-nat.c
>>> @@ -1514,7 +1514,39 @@ fbsd_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
>>>     if (is_async_p ())
>>>       async_file_flush ();
>>>   -  wptid = wait_1 (ptid, ourstatus, target_options);
>>> +  while (1)
>>> +    {
>>> +      wptid = wait_1 (ptid, ourstatus, target_options);
>>> +
>>> +      /* If no event was found, just return.  */
>>> +      if (ourstatus->kind () == TARGET_WAITKIND_IGNORE
>>> +      || ourstatus->kind () == TARGET_WAITKIND_NO_RESUMED)
>>> +    break;
>>> +
>>> +      /* If an event is reported for a thread or process while
>>> +     stepping some other thread, suspend the thread reporting the
>>> +     event and defer the event until it can be reported to the
>>> +     core.  */
>>> +      if (!wptid.matches (resume_ptid))
>>> +    {
>>> +      add_pending_event (wptid, *ourstatus);
>>> +      fbsd_nat_debug_printf ("deferring event [%s], [%s]",
>>> +                 target_pid_to_str (wptid).c_str (),
>>> +                 ourstatus->to_string ().c_str ());
>>> +      if (wptid.pid () == resume_ptid.pid ())
>>> +        {
>>> +          fbsd_nat_debug_printf ("suspending thread [%s]",
>>> +                     target_pid_to_str (wptid).c_str ());
>>> +          if (ptrace (PT_SUSPEND, wptid.lwp (), NULL, 0) == -1)
>>> +        perror_with_name (("ptrace (PT_SUSPEND)"));
>>
>>
>> This is the part I don't understand.  After wait_1 returned an event for
>> a specific thread, I would expect this thread to be ptrace-stopped.  So,
>> I don't see the need to suspend it.  You'd just leave it in the "resumed
>> from the
>> until.
>>
>>> +          if (ptrace (PT_CONTINUE, wptid.pid (), (caddr_t) 1, 0) == -1)
>>> +        perror_with_name (("ptrace (PT_CONTINUE)"));
>>
>> I don't get why you continue `wptid.pid ()`.  What assures you that this
>> particular thread was stopped previously?  Perhaps I don't understand
>> because I assume somethings about pids/lwps and ptrace that are only
>> true on Linux.
> 
> In this specific case, here is the sequence of events:
> 
> - process P1 contains two threads, T1, and T2
> - T1 creates a new thread T3, but T3 isn't yet scheduled to execute
> - T2 hits a breakpoint so the process stops
> - wait() returns P1
> - PT_LWPINFO reports the breakpoint event for T2
> - GDB undoes the breakpoint and tries to single-step T2 over the
>   breakpoint
> - GDB calls resume with step=1 and scope_ptid=ptid(P1, T2)
> - fbsd_nat::resume iterates over the threads it knows about, but
>   it only knows about T1 and T2
>   - PT_SUSPEND T1
>   - PT_RESUME T2
> - the P1 process is now resumed via PT_CONTINUE
> - T3 starts executing and reports its "born" event
> - wait() returns P1
> - PT_LWPINFO reports the thread birth event for T3
> 
> Previously at this point fbsd_nat::wait() returned an event for T3
> which triggers the assertion failure in the PR.  With this patch
> what happens now is:
> 
> - fbsd_nat::wait() realizes T3 isn't "resumed" from the core's
>   perspective so explicitly suspends it via PT_SUSPEND
> - the P1 process is continued via PT_CONTINUE
> - T2 completes its step and reports the event
> - wait() returns P1
> - PT_LWPINFO reports the step-complete event for P2
> - the event for T2 is returned from fbsd_nat::wait() to the core

Ok, thanks, that makes sense.

Simon
  

Patch

diff --git a/gdb/fbsd-nat.c b/gdb/fbsd-nat.c
index ca278b871ef..3f7278c6ea0 100644
--- a/gdb/fbsd-nat.c
+++ b/gdb/fbsd-nat.c
@@ -1514,7 +1514,39 @@  fbsd_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
   if (is_async_p ())
     async_file_flush ();
 
-  wptid = wait_1 (ptid, ourstatus, target_options);
+  while (1)
+    {
+      wptid = wait_1 (ptid, ourstatus, target_options);
+
+      /* If no event was found, just return.  */
+      if (ourstatus->kind () == TARGET_WAITKIND_IGNORE
+	  || ourstatus->kind () == TARGET_WAITKIND_NO_RESUMED)
+	break;
+
+      /* If an event is reported for a thread or process while
+	 stepping some other thread, suspend the thread reporting the
+	 event and defer the event until it can be reported to the
+	 core.  */
+      if (!wptid.matches (resume_ptid))
+	{
+	  add_pending_event (wptid, *ourstatus);
+	  fbsd_nat_debug_printf ("deferring event [%s], [%s]",
+				 target_pid_to_str (wptid).c_str (),
+				 ourstatus->to_string ().c_str ());
+	  if (wptid.pid () == resume_ptid.pid ())
+	    {
+	      fbsd_nat_debug_printf ("suspending thread [%s]",
+				     target_pid_to_str (wptid).c_str ());
+	      if (ptrace (PT_SUSPEND, wptid.lwp (), NULL, 0) == -1)
+		perror_with_name (("ptrace (PT_SUSPEND)"));
+	      if (ptrace (PT_CONTINUE, wptid.pid (), (caddr_t) 1, 0) == -1)
+		perror_with_name (("ptrace (PT_CONTINUE)"));
+	    }
+	  continue;
+	}
+
+      break;
+    }
 
   /* If we are in async mode and found an event, there may still be
      another event pending.  Trigger the event pipe so that that the