Patchwork [14/23] Tweak handling of remote errors in response to resumption packet

login
register
mail settings
Submitter Pedro Alves
Date Sept. 6, 2019, 11:27 p.m.
Message ID <20190906232807.6191-15-palves@redhat.com>
Download mbox | patch
Permalink /patch/34436/
State New
Headers show

Comments

Pedro Alves - Sept. 6, 2019, 11:27 p.m.
With current master, on a Fedora 27 machine with a kernel with buggy
watchpoint support, I see:

  (gdb) PASS: gdb.threads/watchpoint-fork.exp: parent: singlethreaded: hardware breakpoints work
  continue
  Continuing.
  warning: Remote failure reply: E01
  Remote communication error.  Target disconnected.: Connection reset by peer.
  (gdb) FAIL: gdb.threads/watchpoint-fork.exp: parent: singlethreaded: watchpoints work
  continue
  The program is not being run.
  (gdb) FAIL: gdb.threads/watchpoint-fork.exp: parent: singlethreaded: breakpoint after the first fork (the program is no longer running)

The FAILs themselves aren't what's interesting here.  What is
interesting is that with the main multi-target patch applied, I was getting this:

  (gdb) PASS: gdb.threads/watchpoint-fork.exp: parent: singlethreaded: hardware breakpoints work
  continue
  Continuing.
  warning: Remote failure reply: E01
  /home/pedro/brno/pedro/gdb/binutils-gdb-2/build/../src/gdb/inferior.c:285: internal-error: inferior* find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0' failed.
  A problem internal to GDB has been detected,
  further debugging may prove unreliable.
  Quit this debugging session? (y or n) FAIL: gdb.threads/watchpoint-fork.exp: parent: singlethreaded: watchpoints work (GDB internal error)

The problem is that in remote_target::wait_as, we're hitting this:

  switch (buf[0])
    {
    case 'E':		/* Error of some sort.	*/
      /* We're out of sync with the target now.  Did it continue or
	 not?  Not is more likely, so report a stop.  */
      rs->waiting_for_stop_reply = 0;

      warning (_("Remote failure reply: %s"), buf);
      status->kind = TARGET_WAITKIND_STOPPED;
      status->value.sig = GDB_SIGNAL_0;
      break;

which leaves event_ptid as null_ptid.  At the end of the function, we then reach:

  else if (status->kind != TARGET_WAITKIND_EXITED
	   && status->kind != TARGET_WAITKIND_SIGNALLED)
    {
      if (event_ptid != null_ptid)
	record_currthread (rs, event_ptid);
      else
	event_ptid = inferior_ptid;                 <<<<< here
    }

and the trouble is that with the multi-target patch, we'll get here
with inferior_ptid as null_ptid too.  That is done exactly to find
these implicit assumptions that inferior_ptid is a good choice for
default thread, which isn't generaly true.

I first thought of fixing this in the "case 'E'" path, but, given that
this "event_ptid = inferior_ptid" path is also taken when the remote
target does not support threads at all, no thread-related packets or
extensions, it's better to fix it in latter path, to handle all
scenarios that miss reporting a thread.

That's what this patch does.

gdb/ChangeLog:
yyyy-mm-dd  Pedro Alves  <palves@redhat.com>

	* remote.c (first_remote_resumed_thread): New.
	(remote_target::wait_as): Use it as default event_ptid instead of
	inferior_ptid.
---
 gdb/remote.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)
Tankut Baris Aktemur - Oct. 9, 2019, 1:34 p.m.
* On September 7, 2019 1:28 AM, Pedro Alves wrote:
> 

> With current master, on a Fedora 27 machine with a kernel with buggy

> watchpoint support, I see:

> 

>   (gdb) PASS: gdb.threads/watchpoint-fork.exp: parent: singlethreaded: hardware breakpoints work

>   continue

>   Continuing.

>   warning: Remote failure reply: E01

>   Remote communication error.  Target disconnected.: Connection reset by peer.

>   (gdb) FAIL: gdb.threads/watchpoint-fork.exp: parent: singlethreaded: watchpoints work

>   continue

>   The program is not being run.

>   (gdb) FAIL: gdb.threads/watchpoint-fork.exp: parent: singlethreaded: breakpoint after the first fork (the program

> is no longer running)

> 

> The FAILs themselves aren't what's interesting here.  What is

> interesting is that with the main multi-target patch applied, I was getting this:

> 

>   (gdb) PASS: gdb.threads/watchpoint-fork.exp: parent: singlethreaded: hardware breakpoints work

>   continue

>   Continuing.

>   warning: Remote failure reply: E01

>   /home/pedro/brno/pedro/gdb/binutils-gdb-2/build/../src/gdb/inferior.c:285: internal-error: inferior*

> find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0' failed.

>   A problem internal to GDB has been detected,

>   further debugging may prove unreliable.

>   Quit this debugging session? (y or n) FAIL: gdb.threads/watchpoint-fork.exp: parent: singlethreaded: watchpoints

> work (GDB internal error)

> 

> The problem is that in remote_target::wait_as, we're hitting this:

> 

>   switch (buf[0])

>     {

>     case 'E':		/* Error of some sort.	*/

>       /* We're out of sync with the target now.  Did it continue or

> 	 not?  Not is more likely, so report a stop.  */

>       rs->waiting_for_stop_reply = 0;

> 

>       warning (_("Remote failure reply: %s"), buf);

>       status->kind = TARGET_WAITKIND_STOPPED;

>       status->value.sig = GDB_SIGNAL_0;

>       break;

> 

> which leaves event_ptid as null_ptid.  At the end of the function, we then reach:

> 

>   else if (status->kind != TARGET_WAITKIND_EXITED

> 	   && status->kind != TARGET_WAITKIND_SIGNALLED)

>     {

>       if (event_ptid != null_ptid)

> 	record_currthread (rs, event_ptid);

>       else

> 	event_ptid = inferior_ptid;                 <<<<< here

>     }

> 

> and the trouble is that with the multi-target patch, we'll get here

> with inferior_ptid as null_ptid too.  That is done exactly to find

> these implicit assumptions that inferior_ptid is a good choice for

> default thread, which isn't generaly true.

> 

> I first thought of fixing this in the "case 'E'" path, but, given that

> this "event_ptid = inferior_ptid" path is also taken when the remote

> target does not support threads at all, no thread-related packets or

> extensions, it's better to fix it in latter path, to handle all

> scenarios that miss reporting a thread.

> 

> That's what this patch does.

>


A similar problem could occur in case of an exit event, too -- for instance,
if the remote target does not support the multi-process packet or if the
packet is disabled.

This can be checked by modifying the test

  testsuite/gdb.server/connect-without-multi-process.exp

to continue the inferior to termination.

To also catch the process exit case, the coverage of the patch can be
expanded.  Instead of 

   else
     /* A process exit.  Invalidate our notion of current thread.  */
     record_currthread (rs, minus_one_ptid);

the following can be done:

   else
     {
       /* A process exit.  Invalidate our notion of current thread.  */
       record_currthread (rs, minus_one_ptid);
       /* It's possible that the packet did not include a pid.  */
       if (event_ptid == null_ptid)
	event_ptid = first_remote_resumed_thread (this);
       /* EVENT_PTID could still be NULL_PTID.  Double-check.  */
       if (event_ptid == null_ptid)
	event_ptid = magic_null_ptid;
     } 

I'm attaching these as a patch in case one wants to test.  It should be applied
after the main multi-target patch.

Regards,
-Baris


> gdb/ChangeLog:

> yyyy-mm-dd  Pedro Alves  <palves@redhat.com>

> 

> 	* remote.c (first_remote_resumed_thread): New.

> 	(remote_target::wait_as): Use it as default event_ptid instead of

> 	inferior_ptid.

> ---

>  gdb/remote.c | 13 ++++++++++++-

>  1 file changed, 12 insertions(+), 1 deletion(-)

> 

> diff --git a/gdb/remote.c b/gdb/remote.c

> index bed861c65d..eacaf11976 100644

> --- a/gdb/remote.c

> +++ b/gdb/remote.c

> @@ -7703,6 +7703,17 @@ remote_target::wait_ns (ptid_t ptid, struct target_waitstatus *status, int optio

>      }

>  }

> 

> +/* Return the first resumed thread.  */

> +

> +static ptid_t

> +first_remote_resumed_thread ()

> +{

> +  for (thread_info *tp : all_non_exited_threads (minus_one_ptid))

> +    if (tp->resumed)

> +      return tp->ptid;

> +  return null_ptid;

> +}

> +

>  /* Wait until the remote machine stops, then return, storing status in

>     STATUS just as `wait' would.  */

> 

> @@ -7839,7 +7850,7 @@ remote_target::wait_as (ptid_t ptid, target_waitstatus *status, int options)

>        if (event_ptid != null_ptid)

>  	record_currthread (rs, event_ptid);

>        else

> -	event_ptid = inferior_ptid;

> +	event_ptid = first_remote_resumed_thread ();

>      }

>    else

>      /* A process exit.  Invalidate our notion of current thread.  */

> --

> 2.14.5


Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Gary Kershaw
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928

Patch

diff --git a/gdb/remote.c b/gdb/remote.c
index bed861c65d..eacaf11976 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -7703,6 +7703,17 @@  remote_target::wait_ns (ptid_t ptid, struct target_waitstatus *status, int optio
     }
 }
 
+/* Return the first resumed thread.  */
+
+static ptid_t
+first_remote_resumed_thread ()
+{
+  for (thread_info *tp : all_non_exited_threads (minus_one_ptid))
+    if (tp->resumed)
+      return tp->ptid;
+  return null_ptid;
+}
+
 /* Wait until the remote machine stops, then return, storing status in
    STATUS just as `wait' would.  */
 
@@ -7839,7 +7850,7 @@  remote_target::wait_as (ptid_t ptid, target_waitstatus *status, int options)
       if (event_ptid != null_ptid)
 	record_currthread (rs, event_ptid);
       else
-	event_ptid = inferior_ptid;
+	event_ptid = first_remote_resumed_thread ();
     }
   else
     /* A process exit.  Invalidate our notion of current thread.  */