[3/3] gdb, gdbserver: introduce the 'x' RSP packet for binary memory read

Message ID 990be8b42f1f6ca33ffed7a8ae7ead327009d847.1710343840.git.tankut.baris.aktemur@intel.com
State New
Headers
Series Introduce the 'x' RSP packet |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Testing passed

Commit Message

Aktemur, Tankut Baris March 13, 2024, 3:35 p.m. UTC
  Introduce an RSP packet, 'x', for reading from the remote server
memory in binary format.  The binary write packet, 'X' already exists.
The 'x' packet is essentially the same as 'm', except that the
returned data is in binary format.  For transferring relatively large
data (e.g.  shared library files), the 'x' packet can reduce the
transfer costs.

For example, without this patch, fetching ~100MB of data from a remote
target takes

  (gdb) dump binary memory temp.o 0x00007f3ba4c576c0 0x00007f3bab709400
  2024-03-13 16:17:42.626 - command started
  2024-03-13 16:18:24.151 - command finished
  Command execution time: 32.136785 (cpu), 41.525515 (wall)
  (gdb)

whereas with this patch, we obtain

  (gdb) dump binary memory temp.o 0x00007fec39fce6c0 0x00007fec40a80400
  2024-03-13 16:20:48.609 - command started
  2024-03-13 16:21:16.873 - command finished
  Command execution time: 20.447970 (cpu), 28.264202 (wall)
  (gdb)

We see improvements not only when reading bulk data as above, but also
when making a large number of small memory access requests.

For example, without this patch:

  (gdb) pipe x/100000xw $pc | wc -l
  2024-03-13 16:04:57.112 - command started
  25000
  2024-03-13 16:05:10.798 - command finished
  Command execution time: 9.952364 (cpu), 13.686581 (wall)

With this patch:

  (gdb) pipe x/100000xw $pc | wc -l
  2024-03-13 16:06:48.160 - command started
  25000
  2024-03-13 16:06:57.750 - command finished
  Command execution time: 6.541425 (cpu), 9.589839 (wall)
  (gdb)

Regression-tested on X86-64 using the unix (default) and
native-extended-gdbserver board files.
---
 gdb/NEWS                  |  7 ++++++
 gdb/doc/gdb.texinfo       | 29 +++++++++++++++++++++++++
 gdb/remote.c              | 45 +++++++++++++++++++++++++++++++++------
 gdbserver/remote-utils.cc | 39 ++++++++++++++++++++++++++++-----
 gdbserver/remote-utils.h  |  2 ++
 gdbserver/server.cc       | 21 ++++++++++++++++++
 6 files changed, 132 insertions(+), 11 deletions(-)
  

Comments

Luis Machado March 13, 2024, 3:50 p.m. UTC | #1
On 3/13/24 15:35, Tankut Baris Aktemur wrote:
> Introduce an RSP packet, 'x', for reading from the remote server
> memory in binary format.  The binary write packet, 'X' already exists.
> The 'x' packet is essentially the same as 'm', except that the
> returned data is in binary format.  For transferring relatively large
> data (e.g.  shared library files), the 'x' packet can reduce the
> transfer costs.

I think file transfers are handled through vFile instead, and that uses binary data.

Would the x packet play any role in this? Or would it be just for memory reads?
  
Eli Zaretskii March 13, 2024, 3:58 p.m. UTC | #2
> From: Tankut Baris Aktemur <tankut.baris.aktemur@intel.com>
> Date: Wed, 13 Mar 2024 16:35:45 +0100
> 
>  gdb/NEWS                  |  7 ++++++
>  gdb/doc/gdb.texinfo       | 29 +++++++++++++++++++++++++
>  gdb/remote.c              | 45 +++++++++++++++++++++++++++++++++------
>  gdbserver/remote-utils.cc | 39 ++++++++++++++++++++++++++++-----
>  gdbserver/remote-utils.h  |  2 ++
>  gdbserver/server.cc       | 21 ++++++++++++++++++
>  6 files changed, 132 insertions(+), 11 deletions(-)

Thanks.

> --- a/gdb/NEWS
> +++ b/gdb/NEWS
> @@ -152,6 +152,13 @@ QThreadOptions in qSupported
>    QThreadOptions packet, and the qSupported response can contain the
>    set of thread options the remote stub supports.
>  
> +x
> +
> +  Given ADDR and LENGTH, fetch LENGTH units from the memory at address
> +  ADDR and send the fetched data in binary format.  This packet is
> +  equivalent to 'm', except that the data in the response are in
> +  binary format.

This part is OK, but I wonder whether it would have been better to
mention the packet parameters on the line that now says just "x"?

> +Read @var{length} addressable memory units starting at address @var{addr}
> +(@pxref{addressable memory unit}).  Note that @var{addr} may not be aligned
> +to any particular boundary.

That "may not be aligned" could be misinterpreted to mean "must not be
aligned".  I suggest to rephrase:

  Note that @var{addr} does not have to be aligned to any particular
  boundary.

> +The stub need not use any particular size or alignment when gathering
> +data from memory for the response; even if @var{addr} is word-aligned
> +and @var{length} is a multiple of the word size, the stub is free to
> +use byte accesses, or not.  For this reason, this packet may not be
> +suitable for accessing memory-mapped I/O devices.
> +@cindex alignment of remote memory accesses
> +@cindex size of remote memory accesses
> +@cindex memory, alignment and size of remote accesses

These @cindex entries should be before the text they index, not after
it.

Reviewed-By: Eli Zaretskii <eliz@gnu.org>
  
Aktemur, Tankut Baris March 13, 2024, 6:21 p.m. UTC | #3
On Wednesday, March 13, 2024 4:50 PM, Luis Machado wrote:
> On 3/13/24 15:35, Tankut Baris Aktemur wrote:
> > Introduce an RSP packet, 'x', for reading from the remote server
> > memory in binary format.  The binary write packet, 'X' already exists.
> > The 'x' packet is essentially the same as 'm', except that the
> > returned data is in binary format.  For transferring relatively large
> > data (e.g.  shared library files), the 'x' packet can reduce the
> > transfer costs.
> 
> I think file transfers are handled through vFile instead, and that uses binary data.
> 
> Would the x packet play any role in this? Or would it be just for memory reads?

It would be about memory reads.

Sorry, the phrase "(e.g.  shared library files)" was misleading.  In our (Intel)
downstream debugger, in-memory solib files are JIT'ed and are accessed via the 'm'
package.  That was the background why I wrote "e.g. shared library files".  I will
remove that phrase in the next version to avoid confusion.

Thanks.
-Baris


Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
  
Tom Tromey March 13, 2024, 7:27 p.m. UTC | #4
>>>>> Tankut Baris Aktemur <tankut.baris.aktemur@intel.com> writes:

> Introduce an RSP packet, 'x', for reading from the remote server
> memory in binary format.  The binary write packet, 'X' already exists.
> The 'x' packet is essentially the same as 'm', except that the
> returned data is in binary format.  For transferring relatively large
> data (e.g.  shared library files), the 'x' packet can reduce the
> transfer costs.

Thanks for the patch.  I agree with the overall idea and the choice of
'x' as the packet.

> +@item x @var{addr},@var{length}
> +@anchor{x packet}
> +@cindex @samp{x} packet
> +Read @var{length} addressable memory units starting at address @var{addr}
> +(@pxref{addressable memory unit}).  Note that @var{addr} may not be aligned
> +to any particular boundary.

I think it would be good to address the "short read" request here:

    https://sourceware.org/bugzilla/show_bug.cgi?id=24751

that is, document what ought to happen in this case.


Now that 'E' must be quoted in binary packets, I think the "binary data"
section of the "Overview" node must be updated to mention that certain
replies -- and in particular this one -- must handle this properly.

> +  /* Determine which packet format to use.  */
> +  char packet_format = 'm';
> +DIAGNOSTIC_PUSH
> +DIAGNOSTIC_ERROR_SWITCH
> +  switch (m_features.packet_support (PACKET_x))
> +    {
> +    case PACKET_ENABLE:
> +      packet_format = 'x';
> +      break;
> +    case PACKET_DISABLE:
> +      packet_format = 'm';
> +      break;
> +    case PACKET_SUPPORT_UNKNOWN:
> +      internal_error (_("remote_read_bytes_1: bad internal state"));
> +    }
> +DIAGNOSTIC_POP

Can this possibly be done without the diagnostic stuff.

> +      /* Binary memory read support.  */
> +      strcat (own_buf, ";x+");

One thing I don't really know about the remote protocol is whether we
prefer to add advertised features (like this) or just have gdb probe.

> +	    suppress_next_putpkt_log ();

Ok, I guess I see why you did it this way now.  You can ignore that
earlier comment I guess.

Tom
  
Luis Machado March 14, 2024, 10:01 a.m. UTC | #5
On 3/13/24 18:21, Aktemur, Tankut Baris wrote:
> On Wednesday, March 13, 2024 4:50 PM, Luis Machado wrote:
>> On 3/13/24 15:35, Tankut Baris Aktemur wrote:
>>> Introduce an RSP packet, 'x', for reading from the remote server
>>> memory in binary format.  The binary write packet, 'X' already exists.
>>> The 'x' packet is essentially the same as 'm', except that the
>>> returned data is in binary format.  For transferring relatively large
>>> data (e.g.  shared library files), the 'x' packet can reduce the
>>> transfer costs.
>>
>> I think file transfers are handled through vFile instead, and that uses binary data.
>>
>> Would the x packet play any role in this? Or would it be just for memory reads?
> 
> It would be about memory reads.
> 
> Sorry, the phrase "(e.g.  shared library files)" was misleading.  In our (Intel)
> downstream debugger, in-memory solib files are JIT'ed and are accessed via the 'm'
> package.  That was the background why I wrote "e.g. shared library files".  I will
> remove that phrase in the next version to avoid confusion.

Ah, got it. No worries. Thanks for clarifying it.

It is a bit funny why we have never used a binary mode to read memory, after all these years.

Sounds like a good change.
  
Aktemur, Tankut Baris March 14, 2024, 10:36 a.m. UTC | #6
Thank you for your review.  I'll submit the next revision soon.

On Wednesday, March 13, 2024 8:27 PM, Tom Tromey wrote:
> >>>>> Tankut Baris Aktemur <tankut.baris.aktemur@intel.com> writes:
> 
> > Introduce an RSP packet, 'x', for reading from the remote server
> > memory in binary format.  The binary write packet, 'X' already exists.
> > The 'x' packet is essentially the same as 'm', except that the
> > returned data is in binary format.  For transferring relatively large
> > data (e.g.  shared library files), the 'x' packet can reduce the
> > transfer costs.
...
> > +      /* Binary memory read support.  */
> > +      strcat (own_buf, ";x+");
> 
> One thing I don't really know about the remote protocol is whether we
> prefer to add advertised features (like this) or just have gdb probe.

For the 'X' packet, GDB does probing in `check_binary_download`.  (It is, at
least to me, unintuitive that the function is named "download" while it's
checking for 'X', which is for writing; maybe "download" is meant from the target's
perspective.)

I also don't know exactly which approach should be taken, but I opted for announcing
the support at the beginning because it's easier and cleaner, in my opinion.

Probing needs to sends an address: Which address shall we use? The same address we
are given in the first encounter?  And the length is 0.  Then, what if the address
is bad, e.g.  in an inaccessible region?  But because the length is 0, should the target
send a success result or an error?  From GDB's PoV, for probing, it doesn't matter, but 
from target's perspective, I think it creates an odd situation.  If it's success for
a request of length 0, the target would trivially send an empty response as the binary
data.  But if the packet is not recognized, the target responds with an empty reply
packet, as well.  How do we distinguish then success from unsupported?  Shall we have
a delimiter at the beginning of the sent binary data?  For these reasons, announcing
the support upfront had sounded better to me.

Regards
-Baris


Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
  
Aktemur, Tankut Baris March 14, 2024, 12:34 p.m. UTC | #7
On Wednesday, March 13, 2024 8:27 PM, Tom Tromey wrote:
> >>>>> Tankut Baris Aktemur <tankut.baris.aktemur@intel.com> writes:
> 
> > Introduce an RSP packet, 'x', for reading from the remote server
> > memory in binary format.  The binary write packet, 'X' already exists.
> > The 'x' packet is essentially the same as 'm', except that the
> > returned data is in binary format.  For transferring relatively large
> > data (e.g.  shared library files), the 'x' packet can reduce the
> > transfer costs.
> 
> Thanks for the patch.  I agree with the overall idea and the choice of
> 'x' as the packet.
> 
> > +@item x @var{addr},@var{length}
> > +@anchor{x packet}
> > +@cindex @samp{x} packet
> > +Read @var{length} addressable memory units starting at address @var{addr}
> > +(@pxref{addressable memory unit}).  Note that @var{addr} may not be aligned
> > +to any particular boundary.
> 
> I think it would be good to address the "short read" request here:
> 
>     https://sourceware.org/bugzilla/show_bug.cgi?id=24751
> 
> that is, document what ought to happen in this case.

I submitted this comment to Bugzilla:
https://sourceware.org/bugzilla/show_bug.cgi?id=24751#c1

Thanks
-Baris


Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
  
Tom Tromey March 14, 2024, 12:46 p.m. UTC | #8
>>>>> Aktemur, Tankut Baris <tankut.baris.aktemur@intel.com> writes:

> Probing needs to sends an address: Which address shall we use? The same address we
> are given in the first encounter?  And the length is 0.  Then, what if the address
> is bad, e.g.  in an inaccessible region?  But because the length is 0, should the target
> send a success result or an error?  From GDB's PoV, for probing, it doesn't matter, but 
> from target's perspective, I think it creates an odd situation.

I think the probing idea is that you simply try the 'x' command the
first time it is needed.  If remote sends an empty response (not an 'E'
response), then the packet isn't supported, so you disable it and retry
with 'm'.

This is documented in the Overview node:

       For any COMMAND not supported by the stub, an empty response (‘$#00’)
    should be returned.  That way it is possible to extend the protocol.  A
    newer GDB can tell if a packet is supported based on that response.

Tom
  
Aktemur, Tankut Baris March 15, 2024, 9:59 a.m. UTC | #9
On Thursday, March 14, 2024 1:47 PM, Tom Tromey wrote:
> >>>>> Aktemur, Tankut Baris <tankut.baris.aktemur@intel.com> writes:
> 
> > Probing needs to sends an address: Which address shall we use? The same address we
> > are given in the first encounter?  And the length is 0.  Then, what if the address
> > is bad, e.g.  in an inaccessible region?  But because the length is 0, should the target
> > send a success result or an error?  From GDB's PoV, for probing, it doesn't matter, but
> > from target's perspective, I think it creates an odd situation.
> 
> I think the probing idea is that you simply try the 'x' command the
> first time it is needed.  If remote sends an empty response (not an 'E'
> response), then the packet isn't supported, so you disable it and retry
> with 'm'.
> 
> This is documented in the Overview node:
> 
>        For any COMMAND not supported by the stub, an empty response (‘$#00’)
>     should be returned.  That way it is possible to extend the protocol.  A
>     newer GDB can tell if a packet is supported based on that response.
> 
> Tom

Yes, I get this idea, and the 'X' packet actually uses it.  My concern there is
that (1) sending this packet with an arbitrary address and length 0 seems too
artificial; (2) using the proposed reply format, the target would send an empty reply
both when it does not recognize the packet and when it recognizes because the length
argument in the probe is 0 (the same problem would exist with the 'm', 'p', 'g' packets
if we were to probe them).  This would not be an issue if we decide to force the reply
to always start with a marker character, as suggested in Ciaran's email.

In general, there are two options:

* Send the binary data verbatim in the reply to the 'x' packet:
  (+) The definition of the packet is consistent with other similar packets.
  (-) We have to escape the 'E' character; we cannot use probing.

* Always start the reply to the 'x' packet with a marker character:
  (+) We wouldn't need to escape 'E'; we can use probing if we want to.
  (-) The packet definition kind of diverges from the other similar packets.

I opted for the first option when submitting the patches, but the second option
is also doable.

-Baris


Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
  
Tom Tromey March 19, 2024, 4:01 p.m. UTC | #10
>>>>> Aktemur, Tankut Baris <tankut.baris.aktemur@intel.com> writes:

>> I think the probing idea is that you simply try the 'x' command the
>> first time it is needed.  If remote sends an empty response (not an 'E'
>> response), then the packet isn't supported, so you disable it and retry
>> with 'm'.
>> 
>> This is documented in the Overview node:

> Yes, I get this idea, and the 'X' packet actually uses it.  My concern there is
> that (1) sending this packet with an arbitrary address and length 0 seems too
> artificial; (2) using the proposed reply format, the target would send an empty reply
> both when it does not recognize the packet and when it recognizes because the length
> argument in the probe is 0 (the same problem would exist with the 'm', 'p', 'g' packets
> if we were to probe them).  This would not be an issue if we decide to force the reply
> to always start with a marker character, as suggested in Ciaran's email.

I definitely think that gdb should not send an 'x' packet with an
arbitrary address and a length of 0.

Instead the idea is that if the packet is not explicitly forbidden
(either via a gdb command or by previous probe failure), just attempt to
use this packet for the first memory read.

If the remote reports the packet as unsupported, fall back to the 'm'
packet and carry on.

I think the docs should be explicit about what a 0-length read means.
I tend to think this should just be an error.

I also prefer Ciaran's idea for the response packet.  I didn't realize
other packets already did this or perhaps I would have suggested it.

thanks,
Tom
  
Aktemur, Tankut Baris March 20, 2024, 5:32 p.m. UTC | #11
Hi Tom,

On Tuesday, March 19, 2024 5:01 PM, Tom Tromey wrote:
> >>>>> Aktemur, Tankut Baris <tankut.baris.aktemur@intel.com> writes:
> 
> >> I think the probing idea is that you simply try the 'x' command the
> >> first time it is needed.  If remote sends an empty response (not an 'E'
> >> response), then the packet isn't supported, so you disable it and retry
> >> with 'm'.
> >>
> >> This is documented in the Overview node:
> 
> > Yes, I get this idea, and the 'X' packet actually uses it.  My concern there is
> > that (1) sending this packet with an arbitrary address and length 0 seems too
> > artificial; (2) using the proposed reply format, the target would send an empty reply
> > both when it does not recognize the packet and when it recognizes because the length
> > argument in the probe is 0 (the same problem would exist with the 'm', 'p', 'g' packets
> > if we were to probe them).  This would not be an issue if we decide to force the reply
> > to always start with a marker character, as suggested in Ciaran's email.
> 
> I definitely think that gdb should not send an 'x' packet with an
> arbitrary address and a length of 0.
> 
> Instead the idea is that if the packet is not explicitly forbidden
> (either via a gdb command or by previous probe failure), just attempt to
> use this packet for the first memory read.

You mean, just as "x\0", without any arguments?  (Please also see the end
of the email about this)

> If the remote reports the packet as unsupported, fall back to the 'm'
> packet and carry on.
> 
> I think the docs should be explicit about what a 0-length read means.

I agree that the document should be clear for this.

> I tend to think this should just be an error.

My first reaction would be that it depends on the given address.  If the
address is accessible, it seems viable that a zero-length read is trivially
a success, akin to defining an empty string.  If the address is not
accessible, however, it would be an error.  So, a zero-length read attempt
could also be considered as a probe to check if the address is accessible
by the context.

Checking the code, I see in `target_write_memory`:

  /* GDB may send X packets with LEN==0, for probing packet support.
     If we let such a request go through, then buffer.data() below may
     return NULL, which may confuse target implementations.  Handle it
     here to avoid lower levels having to care about this case.  */
  if (len == 0)
    return 0;

(A return value of 0 denotes success, by the way.)
And then in `read_inferior_memory`:

  /* At the time of writing, GDB only sends write packets with LEN==0,
     not read packets (see comment in target_write_memory), but it
     doesn't hurt to prevent problems if it ever does, or we're
     connected to some client other than GDB that does.  */
  if (len == 0)
    return 0;

I had mentioned `check_binary_download` in a previous email.  In that function
GDB probes for 'X' using the first memory address it receives with a length
of 0.  From that perspective, the address is arbitrary, and I, too, find that
unpleasant.

Because of these reasons, I had preferred not probing and chose announcing
the support instead.  But if probing is still the desired approach, how about this:

We can send a plain "x\0" packet, with no arguments.  Because this doesn't conform
to the format, it would be an error, but that's sufficient to indicate that
the server supports the packet.  Would that be ok?

> I also prefer Ciaran's idea for the response packet.  I didn't realize
> other packets already did this or perhaps I would have suggested it.

Ok, thank you.  I'll make the reply start with a marker character in the next
revision.

> thanks,
> Tom

Regards
-Baris


Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
  
Tom Tromey March 20, 2024, 8:12 p.m. UTC | #12
>> Instead the idea is that if the packet is not explicitly forbidden
>> (either via a gdb command or by previous probe failure), just attempt to
>> use this packet for the first memory read.

> You mean, just as "x\0", without any arguments?  (Please also see the end
> of the email about this)

What I mean is refactor remote_read_bytes_1 to try the 'x' packet the
first time it is called (assuming the 'x' packet is not already banned).
If the remote replies that 'x' is unsupported, disable it and continue
to the 'm' packet.

> My first reaction would be that it depends on the given address.  If the
> address is accessible, it seems viable that a zero-length read is trivially
> a success, akin to defining an empty string.  If the address is not
> accessible, however, it would be an error.

I suspect not all remotes can make this determination.
So, if this is the semantics of a 0-length read, then maybe the packet
can't be used in some situations.

However I tend to think gdb should simply not do 0-length reads.

Tom
  
Andrew Burgess April 5, 2024, 1:05 p.m. UTC | #13
Tom Tromey <tom@tromey.com> writes:

>>>>>> Aktemur, Tankut Baris <tankut.baris.aktemur@intel.com> writes:
>
>>> I think the probing idea is that you simply try the 'x' command the
>>> first time it is needed.  If remote sends an empty response (not an 'E'
>>> response), then the packet isn't supported, so you disable it and retry
>>> with 'm'.
>>> 
>>> This is documented in the Overview node:
>
>> Yes, I get this idea, and the 'X' packet actually uses it.  My concern there is
>> that (1) sending this packet with an arbitrary address and length 0 seems too
>> artificial; (2) using the proposed reply format, the target would send an empty reply
>> both when it does not recognize the packet and when it recognizes because the length
>> argument in the probe is 0 (the same problem would exist with the 'm', 'p', 'g' packets
>> if we were to probe them).  This would not be an issue if we decide to force the reply
>> to always start with a marker character, as suggested in Ciaran's email.
>
> I definitely think that gdb should not send an 'x' packet with an
> arbitrary address and a length of 0.
>
> Instead the idea is that if the packet is not explicitly forbidden
> (either via a gdb command or by previous probe failure), just attempt to
> use this packet for the first memory read.
>
> If the remote reports the packet as unsupported, fall back to the 'm'
> packet and carry on.

The alternative to this would be to add a qSupported feature for the 'x'
packet and choose m/x based on the qSupported reply.

I mention this only for completeness really.  I've replied in another
thread that I think we should adopt a prefix character to avoid needing
to escape 'E', but this would also mean we never get an empty reply, so
probing would work fine.

> I think the docs should be explicit about what a 0-length read means.
> I tend to think this should just be an error.

I don't have a firm opinion about whether this _new_ packet should give
a success or failure for zero length access.  And I'm not convinced it
really matters, so long as it's documented!!  (I haven't checked your
patch, so you might have already done so).

Success would have the benefit of matching existing behaviour, so I'd be
slightly more inclined to take that approach.

Thanks,
Andrew

>
> I also prefer Ciaran's idea for the response packet.  I didn't realize
> other packets already did this or perhaps I would have suggested it.
>
> thanks,
> Tom
  
Tom Tromey April 5, 2024, 2:09 p.m. UTC | #14
>>>>> "Andrew" == Andrew Burgess <aburgess@redhat.com> writes:

Andrew> The alternative to this would be to add a qSupported feature for the 'x'
Andrew> packet and choose m/x based on the qSupported reply.

That's what his original patch did.  If you think that's best then it
already exists.  My view was that there didn't seem to be a need for
this to be in qSupported and the qSupported responses are already
getting pretty long.

Tom
  
Andrew Burgess April 8, 2024, 9:36 p.m. UTC | #15
Tom Tromey <tom@tromey.com> writes:

>>>>>> "Andrew" == Andrew Burgess <aburgess@redhat.com> writes:
>
> Andrew> The alternative to this would be to add a qSupported feature for the 'x'
> Andrew> packet and choose m/x based on the qSupported reply.
>
> That's what his original patch did.  If you think that's best then it
> already exists.  My view was that there didn't seem to be a need for
> this to be in qSupported and the qSupported responses are already
> getting pretty long.

No, I joined this party a little late, so I missed that this was the
original approach taken.

I think the best solution would be to add a prefix character to the
reply so that we never send back an empty packet (when this feature is
supported), then use the empty packet to represent the unsupported case.

Thanks,
Andrew
  

Patch

diff --git a/gdb/NEWS b/gdb/NEWS
index d8ac0bb06a7..378ad66b284 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -152,6 +152,13 @@  QThreadOptions in qSupported
   QThreadOptions packet, and the qSupported response can contain the
   set of thread options the remote stub supports.
 
+x
+
+  Given ADDR and LENGTH, fetch LENGTH units from the memory at address
+  ADDR and send the fetched data in binary format.  This packet is
+  equivalent to 'm', except that the data in the response are in
+  binary format.
+
 *** Changes in GDB 14
 
 * GDB now supports the AArch64 Scalable Matrix Extension 2 (SME2), which
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 6099d125a60..42119778ad9 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -43119,6 +43119,35 @@  for success (@pxref{Stop Reply Packets})
 @cindex @samp{vStopped} packet
 @xref{Notification Packets}.
 
+@item x @var{addr},@var{length}
+@anchor{x packet}
+@cindex @samp{x} packet
+Read @var{length} addressable memory units starting at address @var{addr}
+(@pxref{addressable memory unit}).  Note that @var{addr} may not be aligned
+to any particular boundary.
+
+The stub need not use any particular size or alignment when gathering
+data from memory for the response; even if @var{addr} is word-aligned
+and @var{length} is a multiple of the word size, the stub is free to
+use byte accesses, or not.  For this reason, this packet may not be
+suitable for accessing memory-mapped I/O devices.
+@cindex alignment of remote memory accesses
+@cindex size of remote memory accesses
+@cindex memory, alignment and size of remote accesses
+
+This packet is used only if the stub announces support for it using
+@samp{qSupported}.
+
+Reply:
+@table @samp
+@item @var{XX@dots{}}
+Memory contents as binary data (@pxref{Binary Data}).
+The reply may contain fewer addressable memory units than requested if the
+server was able to read only part of the region of memory.
+@item E @var{NN}
+for an error
+@end table
+
 @item X @var{addr},@var{length}:@var{XX@dots{}}
 @anchor{X packet}
 @cindex @samp{X} packet
diff --git a/gdb/remote.c b/gdb/remote.c
index 14c8b020b1e..406df0265eb 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -194,6 +194,7 @@  struct packet_result
 enum {
   PACKET_vCont = 0,
   PACKET_X,
+  PACKET_x,
   PACKET_qSymbol,
   PACKET_P,
   PACKET_p,
@@ -5763,6 +5764,7 @@  static const struct protocol_feature remote_protocol_features[] = {
   { "no-resumed", PACKET_DISABLE, remote_supported_packet, PACKET_no_resumed },
   { "memory-tagging", PACKET_DISABLE, remote_supported_packet,
     PACKET_memory_tagging_feature },
+  { "x", PACKET_DISABLE, remote_supported_packet, PACKET_x },
 };
 
 static char *remote_support_xml;
@@ -9582,24 +9584,53 @@  remote_target::remote_read_bytes_1 (CORE_ADDR memaddr, gdb_byte *myaddr,
   todo_units = std::min (len_units,
 			 (ULONGEST) (buf_size_bytes / unit_size) / 2);
 
-  /* Construct "m"<memaddr>","<len>".  */
+  /* Determine which packet format to use.  */
+  char packet_format = 'm';
+DIAGNOSTIC_PUSH
+DIAGNOSTIC_ERROR_SWITCH
+  switch (m_features.packet_support (PACKET_x))
+    {
+    case PACKET_ENABLE:
+      packet_format = 'x';
+      break;
+    case PACKET_DISABLE:
+      packet_format = 'm';
+      break;
+    case PACKET_SUPPORT_UNKNOWN:
+      internal_error (_("remote_read_bytes_1: bad internal state"));
+    }
+DIAGNOSTIC_POP
+
+  /* Construct "m/x"<memaddr>","<len>".  */
   memaddr = remote_address_masked (memaddr);
   p = rs->buf.data ();
-  *p++ = 'm';
+  *p++ = packet_format;
   p += hexnumstr (p, (ULONGEST) memaddr);
   *p++ = ',';
   p += hexnumstr (p, (ULONGEST) todo_units);
   *p = '\0';
   putpkt (rs->buf);
-  getpkt (&rs->buf);
+  int packet_len = getpkt (&rs->buf);
+  if (packet_len < 0)
+    return TARGET_XFER_E_IO;
+
   if (rs->buf[0] == 'E'
       && isxdigit (rs->buf[1]) && isxdigit (rs->buf[2])
       && rs->buf[3] == '\0')
     return TARGET_XFER_E_IO;
-  /* Reply describes memory byte by byte, each byte encoded as two hex
-     characters.  */
+
   p = rs->buf.data ();
-  decoded_bytes = hex2bin (p, myaddr, todo_units * unit_size);
+  if (packet_format == 'x')
+    decoded_bytes = remote_unescape_input ((const gdb_byte *) p,
+					   packet_len, myaddr,
+					   todo_units * unit_size);
+  else
+    {
+      /* Reply describes memory byte by byte, each byte encoded as two hex
+	 characters.  */
+      decoded_bytes = hex2bin (p, myaddr, todo_units * unit_size);
+    }
+
   /* Return what we have.  Let higher layers handle partial reads.  */
   *xfered_len_units = (ULONGEST) (decoded_bytes / unit_size);
   return (*xfered_len_units != 0) ? TARGET_XFER_OK : TARGET_XFER_EOF;
@@ -15839,6 +15870,8 @@  Show the maximum size of the address (in bits) in a memory packet."), NULL,
 
   add_packet_config_cmd (PACKET_X, "X", "binary-download", 1);
 
+  add_packet_config_cmd (PACKET_x, "x", "binary-read", 0);
+
   add_packet_config_cmd (PACKET_vCont, "vCont", "verbose-resume", 0);
 
   add_packet_config_cmd (PACKET_QPassSignals, "QPassSignals", "pass-signals",
diff --git a/gdbserver/remote-utils.cc b/gdbserver/remote-utils.cc
index 5e7c38056d0..beccde1a9fe 100644
--- a/gdbserver/remote-utils.cc
+++ b/gdbserver/remote-utils.cc
@@ -1334,6 +1334,13 @@  decode_M_packet (const char *from, CORE_ADDR *mem_addr_ptr,
   hex2bin (from, *to_p, *len_ptr);
 }
 
+void
+decode_x_packet (const char *from, CORE_ADDR *mem_addr_ptr,
+		 unsigned int *len_ptr)
+{
+  decode_m_packet_params (from, mem_addr_ptr, len_ptr, '\0');
+}
+
 int
 decode_X_packet (char *from, int packet_len, CORE_ADDR *mem_addr_ptr,
 		 unsigned int *len_ptr, unsigned char **to_p)
@@ -1571,18 +1578,36 @@  relocate_instruction (CORE_ADDR *to, CORE_ADDR oldloc)
      wait for the qRelocInsn "response".  That requires re-entering
      the main loop.  For now, this is an adequate approximation; allow
      GDB to access memory.  */
-  while (cs.own_buf[0] == 'm' || cs.own_buf[0] == 'M' || cs.own_buf[0] == 'X')
+  while (cs.own_buf[0] == 'm' || cs.own_buf[0] == 'M'
+	 || cs.own_buf[0] == 'X' || cs.own_buf[0] == 'x')
     {
       CORE_ADDR mem_addr;
       unsigned char *mem_buf = NULL;
       unsigned int mem_len;
+      int new_packet_len = -1;
 
-      if (cs.own_buf[0] == 'm')
+      if (cs.own_buf[0] == 'm' || cs.own_buf[0] == 'x')
 	{
-	  decode_m_packet (&cs.own_buf[1], &mem_addr, &mem_len);
+	  if (cs.own_buf[0] == 'm')
+	    decode_m_packet (&cs.own_buf[1], &mem_addr, &mem_len);
+	  else
+	    decode_x_packet (&cs.own_buf[1], &mem_addr, &mem_len);
+
 	  mem_buf = (unsigned char *) xmalloc (mem_len);
 	  if (read_inferior_memory (mem_addr, mem_buf, mem_len) == 0)
-	    bin2hex (mem_buf, cs.own_buf, mem_len);
+	    {
+	      if (cs.own_buf[0] == 'm')
+		bin2hex (mem_buf, cs.own_buf, mem_len);
+	      else
+		{
+		  int out_len_units;
+		  new_packet_len
+		    = remote_escape_output (mem_buf, mem_len, 1,
+					    (gdb_byte *) cs.own_buf,
+					    &out_len_units, PBUFSIZ);
+		  suppress_next_putpkt_log ();
+		}
+	    }
 	  else
 	    write_enn (cs.own_buf);
 	}
@@ -1604,7 +1629,11 @@  relocate_instruction (CORE_ADDR *to, CORE_ADDR oldloc)
 	    write_enn (cs.own_buf);
 	}
       free (mem_buf);
-      if (putpkt (cs.own_buf) < 0)
+
+      int res = ((new_packet_len == -1)
+		 ? putpkt (cs.own_buf)
+		 : putpkt_binary (cs.own_buf, new_packet_len));
+      if (res < 0)
 	return -1;
       len = getpkt (cs.own_buf);
       if (len < 0)
diff --git a/gdbserver/remote-utils.h b/gdbserver/remote-utils.h
index 4681ca12f55..65ce604f9dc 100644
--- a/gdbserver/remote-utils.h
+++ b/gdbserver/remote-utils.h
@@ -57,6 +57,8 @@  void decode_m_packet (const char *from, CORE_ADDR * mem_addr_ptr,
 		      unsigned int *len_ptr);
 void decode_M_packet (const char *from, CORE_ADDR * mem_addr_ptr,
 		      unsigned int *len_ptr, unsigned char **to_p);
+void decode_x_packet (const char *from, CORE_ADDR *mem_addr_ptr,
+		      unsigned int *len_ptr);
 int decode_X_packet (char *from, int packet_len, CORE_ADDR * mem_addr_ptr,
 		     unsigned int *len_ptr, unsigned char **to_p);
 int decode_xfer_write (char *buf, int packet_len,
diff --git a/gdbserver/server.cc b/gdbserver/server.cc
index 74c7763d777..96e6e551784 100644
--- a/gdbserver/server.cc
+++ b/gdbserver/server.cc
@@ -2840,6 +2840,9 @@  handle_query (char *own_buf, int packet_len, int *new_packet_len_p)
       if (target_supports_memory_tagging ())
 	strcat (own_buf, ";memory-tagging+");
 
+      /* Binary memory read support.  */
+      strcat (own_buf, ";x+");
+
       /* Reinitialize components as needed for the new connection.  */
       hostio_handle_new_gdb_connection ();
       target_handle_new_gdb_connection ();
@@ -4690,6 +4693,24 @@  process_serial_event (void)
       else
 	write_enn (cs.own_buf);
       break;
+    case 'x':
+      {
+	require_running_or_break (cs.own_buf);
+	decode_x_packet (&cs.own_buf[1], &mem_addr, &len);
+	int res = gdb_read_memory (mem_addr, mem_buf, len);
+	if (res < 0)
+	  write_enn (cs.own_buf);
+	else
+	  {
+	    int out_len_units;
+	    new_packet_len
+	      = remote_escape_output (mem_buf, res, 1,
+				      (gdb_byte *) cs.own_buf,
+				      &out_len_units, PBUFSIZ);
+	    suppress_next_putpkt_log ();
+	  }
+      }
+      break;
     case 'X':
       require_running_or_break (cs.own_buf);
       if (decode_X_packet (&cs.own_buf[1], packet_len - 1,