[00/12] Fix attach/run failure handling - gdbserver & Windows, document "E.MESSAGE" RSP errors, more

Message ID 20240419151342.1592474-1-pedro@palves.net
Headers
Series Fix attach/run failure handling - gdbserver & Windows, document "E.MESSAGE" RSP errors, more |

Message

Pedro Alves April 19, 2024, 3:13 p.m. UTC
  This is a "down the rabbit hole" series.

It started out as wanting to fix the hangs that you see on native
Windows when "attach" or "run" fails.  A subsequent "run" or "attach"
hangs/deadlocks forever.

Then I wrote tests for those.  And... they hit GDBserver bugs...

... so fixed the GDBserver bugs.

... but I want to be able to report textual errors to GDB using
E.MESSAGE responses.

... but GDB doesn't print the textual error in the case of "run"
failing.  But wait, textual errors in the RSP aren't documented...

... so I'm fixing that too.

... and then I notice gdb.base/attach.exp has a couple FAILs that
shouldn't be there, the tests should be skipped.  There is already a
similar test in the same file that is skipped, I should just copy the
predicate it uses to skip the test.  Oh wait, that is unnecessarily
heavy.  Let's fix that.

... but better fix that throughout the whole testsuite.

On the textual errors in RSP issue.  I am talking about the E.MESSAGE
responses.  I've been aware of those since forever, and all along I
thought they were documented, up until more recently, the fact that
most other people weren't familiar when them surprised me, at some
Cauldron GDB-related talk.  Recently, Sandra handed me CodeSourcery's
local GDB patches, to see if I could upstream any that relates to
Windows debugging, and there I saw the local patch that CS had been
carrying for many years that documents the "E.NAME.MESSAGE" response,
like so:

 +@item E.@var{name}@r{[}.@var{message}@r{]}
 +An error has occurred; @var{name} is the name of the error.  The name
 +may contain letters, numbers, and @samp{-} characters.  If present,
 +@var{message} is an error message, encoded using the escaped eight-bit
 +conventions for binary data described above.

... and also implements the GDB-side of parsing that.

Ahhh!  That _must_ have been were I first became aware of this "E."
response, back when I was at CS.  And this explains why we have this
comment upstream, too:

	  /* Always treat "E." as an error.  This will be used for
	     more verbose error messages, such as E.memtypes.  */

The CS patch goes on to say:

  +The protocol uses the following error names:
  +
  +@table @samp
  +
  +@item fatal
  +A fatal error has occurred; the stub will be unable to interact
  +further with @value{GDBN}.  Fatal errors should always include a
  +message explaining their cause.
  +
  +Any command may return this error.
  +
  +@item memtype
  +The memory addressed is of the wrong type for the given command.  For
  +example, a @samp{vFlashWrite} command applied to non-flash memory
  +elicits an @samp{E.memtype} error response.
  +
  +@end table
  +@end table

Ah, yeah, that matches.  vFlashWrite was contributed upstream by CS
(by Dan) back in 2006, in commit a76d924dffcb, here:

  https://sourceware.org/pipermail/gdb-patches/2006-September/047286.html

And that is exactly the patch that first made GDB understand "E.",
including that "E.memtypes" comment mentioned above.

OK, so seems like Dan/CS intended to upstream the "E.name.message"
format completely at some point, but never did.  And then, because GDB
prints whatever follows the "E." as the error message, GDBserver
started responding to errors with "E.MESSAGE" throughout and nowadays
it's kind of pervasive.

I think the ship for "E.NAME.MESSAGE" with binary encoded MESSAGE has
sailed though.  The format would work because "NAME" in E.NAME.MESSAGE
can't have a period.  So to find the message part, you'd look for the
second period.  But GDBserver has many instances of error messages
with two periods (and no "NAME" component), like e.g.:

 server.cc:      strcpy (own_buf, "E.Must select a single thread.");
 server.cc:      strcpy (own_buf, "E.No such thread.");
 server.cc:      sprintf (own_buf, "E.%s", exception.what ());
 server.cc:      strcpy (own_buf, "E.Must select a single thread.");
 server.cc:      strcpy (own_buf, "E.No such thread.");

Also, I can't find anywhere in GDB upstream or in CS's local patches
that handled "memtype" or "fatal" specially.  It seems like really the
NAME part in E.NAME.MESSAGE in CS-style error responses never found a
real use case.

So seems to me that best we can do is just document what we've been
using in practice for probably a decade and a half already, which is
just "E.ASCII_MESSAGE".  That is what this series does (among other
things).

Tested on x86_64 GNU/Linux {native, gdbserver, extended-gdbserver} and
Cygwin.

Pedro Alves (12):
  Document conventions for describing packet syntax
  Centralize documentation of error and empty RSP responses
  Document "E.MESSAGE" RSP errors
  Windows: Fix run/attach hang after bad run/attach
  Fix "run" failure handling with GDBserver
  Improve vRun error reporting
  Fix "attach" failure handling with GDBserver
  gdbserver: Fix vAttach response when attaching is not supported
  gdb_target_is_native -> gdb_protocol_is_native
  gdb_target_is_remote -> gdb_protocol_is_remote
  Eliminate gdb_is_target_remote / gdb_is_target_native & friends
  Fix gdb.base/attach.exp --pid test skipping on
    native-extended-gdbserver

 gdb/doc/gdb.texinfo                           | 264 ++++--------------
 gdb/remote.c                                  |  67 ++++-
 .../gdb.arch/aarch64-sme-core.exp.tcl         |  12 +-
 .../aarch64-sme-regs-available.exp.tcl        |  25 +-
 .../aarch64-sme-regs-sigframe.exp.tcl         |  25 +-
 .../aarch64-sme-regs-unavailable.exp.tcl      |  12 +-
 gdb/testsuite/gdb.arch/aarch64-sme-sanity.exp |  16 +-
 gdb/testsuite/gdb.base/attach-fail-twice.c    |  27 ++
 gdb/testsuite/gdb.base/attach-fail-twice.exp  |  94 +++++++
 gdb/testsuite/gdb.base/attach.exp             |  23 +-
 gdb/testsuite/gdb.base/cond-eval-mode.exp     |   2 +-
 gdb/testsuite/gdb.base/dprintf.exp            |   2 +-
 gdb/testsuite/gdb.base/foll-exec-mode.exp     |   4 +-
 .../gdb.base/hbreak-in-shr-unsupported.exp    |   6 +-
 gdb/testsuite/gdb.base/load-command.exp       |  11 +-
 gdb/testsuite/gdb.base/run-fail-twice.c       |  20 ++
 gdb/testsuite/gdb.base/run-fail-twice.exp     |  63 +++++
 gdb/testsuite/gdb.mi/mi-nonstop.exp           |   2 +-
 gdb/testsuite/gdb.multi/stop-all-on-exit.exp  |  16 +-
 gdb/testsuite/gdb.python/py-evsignal.exp      |   3 +-
 gdb/testsuite/gdb.python/py-inferior.exp      |   2 +-
 .../gdb.reverse/finish-reverse-next.exp       |   1 -
 .../gdb.threads/break-while-running.exp       |   2 +-
 .../main-thread-exit-during-detach.exp        |   2 +-
 .../process-dies-while-handling-bp.exp        |   2 +-
 .../gdb.threads/threads-after-exec.exp        |   2 +-
 gdb/testsuite/gdb.trace/change-loc.exp        |  10 +-
 gdb/testsuite/gdb.trace/ftrace.exp            |   2 +-
 gdb/testsuite/gdb.trace/qtro.exp              |  11 +-
 gdb/testsuite/lib/gdb.exp                     |  62 +---
 gdb/testsuite/lib/mi-support.exp              |   9 -
 gdb/windows-nat.c                             |  35 ++-
 gdbserver/server.cc                           |  54 ++--
 33 files changed, 468 insertions(+), 420 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/attach-fail-twice.c
 create mode 100644 gdb/testsuite/gdb.base/attach-fail-twice.exp
 create mode 100644 gdb/testsuite/gdb.base/run-fail-twice.c
 create mode 100644 gdb/testsuite/gdb.base/run-fail-twice.exp


base-commit: 0e6747d2a638693ad2f20e7929c8364913c87279
  

Comments

Pedro Alves April 26, 2024, 8:25 p.m. UTC | #1
On 2024-04-19 16:13, Pedro Alves wrote:

> Pedro Alves (12):
>   Document conventions for describing packet syntax
>   Centralize documentation of error and empty RSP responses
>   Document "E.MESSAGE" RSP errors
>   Windows: Fix run/attach hang after bad run/attach
>   Fix "run" failure handling with GDBserver
>   Improve vRun error reporting
>   Fix "attach" failure handling with GDBserver
>   gdbserver: Fix vAttach response when attaching is not supported
>   gdb_target_is_native -> gdb_protocol_is_native
>   gdb_target_is_remote -> gdb_protocol_is_remote
>   Eliminate gdb_is_target_remote / gdb_is_target_native & friends
>   Fix gdb.base/attach.exp --pid test skipping on
>     native-extended-gdbserver
> 

FYI, I've now merged this.