[v3,5/8] gdbserver: Transmit target description ID in thread list and stop reply

Message ID 20230130044518.3322695-6-thiago.bauermann@linaro.org
State New
Headers
Series gdbserver improvements for AArch64 SVE support |

Commit Message

Thiago Jung Bauermann Jan. 30, 2023, 4:45 a.m. UTC
  Now that an inferior thread can have a different target description than
its process, there needs to be a way to communicate this target
description to GDB.  So add the concept of a target description ID to the
remote protocol, which is used to reference them and allows them to be
transferred only once over the wire.

The ID is an unsigned integer, and is sent in the 'T AA n1:r1;n2:r2;...'
stop reply packet as a new 'n:r' pair, where n is "tdesc" and "r" is an
unsigned integer containing the ID.

It is also sent in the threads list XML in the response of a
qXfer:threads:read request.  The ID is sent as a new "tdesc" attribute of
the <thread> node.

To request the target description XML of a given ID, GDB sends the
qXfer:features:read request with "target-id-%u.xml" as the annex, where %u
is the target description ID.

Suggested-By: Simon Marchi <simon.marchi@efficios.com>
---
 gdb/doc/gdb.texinfo       | 27 ++++++++++++++++---
 gdbserver/remote-utils.cc | 57 +++++++++++++++++++++++++++++++++++++++
 gdbserver/remote-utils.h  |  9 +++++++
 gdbserver/server.cc       | 17 ++++++++++--
 gdbserver/server.h        |  4 +++
 5 files changed, 109 insertions(+), 5 deletions(-)
  

Comments

Eli Zaretskii Jan. 30, 2023, 12:52 p.m. UTC | #1
> Cc: Thiago Jung Bauermann <thiago.bauermann@linaro.org>,
>  Simon Marchi <simon.marchi@efficios.com>
> Date: Mon, 30 Jan 2023 04:45:15 +0000
> From: Thiago Jung Bauermann via Gdb-patches <gdb-patches@sourceware.org>
> 
> Now that an inferior thread can have a different target description than
> its process, there needs to be a way to communicate this target
> description to GDB.  So add the concept of a target description ID to the
> remote protocol, which is used to reference them and allows them to be
> transferred only once over the wire.
> 
> The ID is an unsigned integer, and is sent in the 'T AA n1:r1;n2:r2;...'
> stop reply packet as a new 'n:r' pair, where n is "tdesc" and "r" is an
> unsigned integer containing the ID.
> 
> It is also sent in the threads list XML in the response of a
> qXfer:threads:read request.  The ID is sent as a new "tdesc" attribute of
> the <thread> node.
> 
> To request the target description XML of a given ID, GDB sends the
> qXfer:features:read request with "target-id-%u.xml" as the annex, where %u
> is the target description ID.
> 
> Suggested-By: Simon Marchi <simon.marchi@efficios.com>
> ---
>  gdb/doc/gdb.texinfo       | 27 ++++++++++++++++---
>  gdbserver/remote-utils.cc | 57 +++++++++++++++++++++++++++++++++++++++
>  gdbserver/remote-utils.h  |  9 +++++++
>  gdbserver/server.cc       | 17 ++++++++++--
>  gdbserver/server.h        |  4 +++
>  5 files changed, 109 insertions(+), 5 deletions(-)

Thanks.

> +If target description IDs are being used (@pxref{Target Description ID}),
> +@value{GDBN} can retrieve a target description with a given ID by using
> +@samp{target-id-ID.xml} as the @var{annex}, where @var{ID} is the
   ^^^^^^^^^^^^^^^^^^^^^^^                           ^^^^^^^^
@file{target-id-@var{id}.xml}                        @var{id}

The arguments of @var should be in lower-case, as that looks better in
the printed and HTML versions of the manual.

The documentation part is OK with these problems fixed.
  
Thiago Jung Bauermann Jan. 30, 2023, 2:05 p.m. UTC | #2
Hello Eli,

Eli Zaretskii <eliz@gnu.org> writes:

>> Now that an inferior thread can have a different target description than
>> its process, there needs to be a way to communicate this target
>> description to GDB.  So add the concept of a target description ID to the
>> remote protocol, which is used to reference them and allows them to be
>> transferred only once over the wire.
>> 
>> The ID is an unsigned integer, and is sent in the 'T AA n1:r1;n2:r2;...'
>> stop reply packet as a new 'n:r' pair, where n is "tdesc" and "r" is an
>> unsigned integer containing the ID.
>> 
>> It is also sent in the threads list XML in the response of a
>> qXfer:threads:read request.  The ID is sent as a new "tdesc" attribute of
>> the <thread> node.
>> 
>> To request the target description XML of a given ID, GDB sends the
>> qXfer:features:read request with "target-id-%u.xml" as the annex, where %u
>> is the target description ID.
>> 
>> Suggested-By: Simon Marchi <simon.marchi@efficios.com>
>> ---
>>  gdb/doc/gdb.texinfo       | 27 ++++++++++++++++---
>>  gdbserver/remote-utils.cc | 57 +++++++++++++++++++++++++++++++++++++++
>>  gdbserver/remote-utils.h  |  9 +++++++
>>  gdbserver/server.cc       | 17 ++++++++++--
>>  gdbserver/server.h        |  4 +++
>>  5 files changed, 109 insertions(+), 5 deletions(-)
>
> Thanks.

Thank you for the quick review!

>> +If target description IDs are being used (@pxref{Target Description ID}),
>> +@value{GDBN} can retrieve a target description with a given ID by using
>> +@samp{target-id-ID.xml} as the @var{annex}, where @var{ID} is the
>    ^^^^^^^^^^^^^^^^^^^^^^^                           ^^^^^^^^
> @file{target-id-@var{id}.xml}                        @var{id}
>
> The arguments of @var should be in lower-case, as that looks better in
> the printed and HTML versions of the manual.

Ah, nice to know that I can nest tags like that. Thank you for the
explanation as well.

> The documentation part is OK with these problems fixed.

Fixed locally.
  
Luis Machado Feb. 1, 2023, 9:39 a.m. UTC | #3
This looks good. Thanks. Just a nit below.

On 1/30/23 04:45, Thiago Jung Bauermann via Gdb-patches wrote:
> Now that an inferior thread can have a different target description than
> its process, there needs to be a way to communicate this target
> description to GDB.  So add the concept of a target description ID to the
> remote protocol, which is used to reference them and allows them to be
> transferred only once over the wire.
> 
> The ID is an unsigned integer, and is sent in the 'T AA n1:r1;n2:r2;...'
> stop reply packet as a new 'n:r' pair, where n is "tdesc" and "r" is an
> unsigned integer containing the ID.
> 
> It is also sent in the threads list XML in the response of a
> qXfer:threads:read request.  The ID is sent as a new "tdesc" attribute of
> the <thread> node.
> 
> To request the target description XML of a given ID, GDB sends the
> qXfer:features:read request with "target-id-%u.xml" as the annex, where %u
> is the target description ID.
> 
> Suggested-By: Simon Marchi <simon.marchi@efficios.com>
> ---
>   gdb/doc/gdb.texinfo       | 27 ++++++++++++++++---
>   gdbserver/remote-utils.cc | 57 +++++++++++++++++++++++++++++++++++++++
>   gdbserver/remote-utils.h  |  9 +++++++
>   gdbserver/server.cc       | 17 ++++++++++--
>   gdbserver/server.h        |  4 +++
>   5 files changed, 109 insertions(+), 5 deletions(-)
> 
> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
> index 9c0018ea5c14..fbf7e59853b5 100644
> --- a/gdb/doc/gdb.texinfo
> +++ b/gdb/doc/gdb.texinfo
> @@ -42030,6 +42030,10 @@ the stopped thread, as specified in @ref{thread-id syntax}.
>   If @var{n} is @samp{core}, then @var{r} is the hexadecimal number of
>   the core on which the stop event was detected.
>   
> +@item
> +If @var{n} is @samp{tdesc}, then @var{r} is the hexadecimal number of
> +the target description ID (@pxref{Target Description ID}).
> +

Above we mention that it is hexadecimal, but ...

>   @item
>   If @var{n} is a recognized @dfn{stop reason}, it describes a more
>   specific event that stopped the target.  The currently defined stop
> @@ -46546,7 +46550,7 @@ the following structure:
>   @smallexample
>   <?xml version="1.0"?>
>   <threads>
> -    <thread id="id" core="0" name="name">
> +    <thread id="id" core="0" name="name" tdesc="0">
>       ... description ...
>       </thread>
>   </threads>
> @@ -46556,8 +46560,10 @@ Each @samp{thread} element must have the @samp{id} attribute that
>   identifies the thread (@pxref{thread-id syntax}).  The
>   @samp{core} attribute, if present, specifies which processor core
>   the thread was last executing on.  The @samp{name} attribute, if
> -present, specifies the human-readable name of the thread.  The content
> -of the of @samp{thread} element is interpreted as human-readable
> +present, specifies the human-readable name of the thread.  The
> +@samp{tdesc} attribute, if present, specifies the target description
> +ID of the thread (@pxref{Target Description ID}).  The content of
> +the @samp{thread} element is interpreted as human-readable
>   auxiliary information.  The @samp{handle} attribute, if present,
>   is a hex encoded representation of the thread handle.

... what about the id being passed via the qXfer:threads:read? Should it be hex or decimal?

Might be nice to clarify so implementors have a clear understanding.

>   
> @@ -46767,6 +46773,8 @@ descriptions are accurate, and that @value{GDBN} understands them.
>   target descriptions.  @xref{Expat}.
>   
>   @menu
> +* Target Description ID::           Referencing different descriptions in the
> +                                    remote protocol.
>   * Retrieving Descriptions::         How descriptions are fetched from a target.
>   * Target Description Format::       The contents of a target description.
>   * Predefined Target Types::         Standard types available for target
> @@ -46775,6 +46783,14 @@ target descriptions.  @xref{Expat}.
>   * Standard Target Features::        Features @value{GDBN} knows about.
>   @end menu
>   
> +@node Target Description ID
> +@section Target Description ID
> +
> +In cases where a remote target supports threads having different
> +target descriptions than their parent process, the remote protocol
> +assigns a non-negative integer to each target description to reference
> +it in the communication between the host and the target.
> +
>   @node Retrieving Descriptions
>   @section Retrieving Descriptions
>   
> @@ -46787,6 +46803,11 @@ qXfer}).  The @var{annex} in the @samp{qXfer} packet will be
>   XML document, of the form described in @ref{Target Description
>   Format}.
>   
> +If target description IDs are being used (@pxref{Target Description ID}),
> +@value{GDBN} can retrieve a target description with a given ID by using
> +@samp{target-id-ID.xml} as the @var{annex}, where @var{ID} is the
> +non-negative integer identifier of the desired target description.
> +

Same case in here. Should id be a regular integer or its hex representation?

 From reading the code, it seems the two cases above should really be integers, so the text
is fine. But we want to make sure whoever is implementing the changes gets it right.

>   Alternatively, you can specify a file to read for the target description.
>   If a file is set, the target will not be queried.  The commands to
>   specify a file are:
> diff --git a/gdbserver/remote-utils.cc b/gdbserver/remote-utils.cc
> index 80310bc2c709..baff899307cc 100644
> --- a/gdbserver/remote-utils.cc
> +++ b/gdbserver/remote-utils.cc
> @@ -1049,6 +1049,53 @@ outreg (struct regcache *regcache, int regno, char *buf)
>     return buf;
>   }
>   
> +/* See remote-utils.h.  */
> +
> +unsigned int
> +get_tdesc_rsp_id (const target_desc *tdesc)
> +{
> +  client_state &cs = get_client_state ();
> +  unsigned int i;
> +
> +  for (i = 0; i < cs.tdescs.size (); i++)
> +    if (cs.tdescs[i] == tdesc)
> +      return i;
> +
> +  cs.tdescs.push_back (tdesc);
> +
> +  return i;
> +}
> +
> +/* See remote-utils.h.  */
> +
> +const target_desc *
> +get_tdesc_from_rsp_id (unsigned int id)
> +{
> +  client_state &cs = get_client_state ();
> +
> +  if (id >= cs.tdescs.size ())
> +    return nullptr;
> +
> +  return cs.tdescs[id];
> +}
> +
> +/* Return the ID as used in the remote protocol for the target descriptor of the
> +   given PTID.  */
> +
> +static unsigned int
> +get_tdesc_rsp_id (ptid_t ptid)
> +{
> +  const thread_info *thread = find_thread_ptid (ptid);
> +  const target_desc *tdesc;
> +
> +  if (thread == nullptr)
> +    tdesc = find_process_pid (ptid.pid ())->tdesc;
> +  else
> +    tdesc = get_thread_target_desc (thread);
> +
> +  return get_tdesc_rsp_id (tdesc);
> +}
> +
>   void
>   prepare_resume_reply (char *buf, ptid_t ptid, const target_waitstatus &status)
>   {
> @@ -1241,6 +1288,16 @@ prepare_resume_reply (char *buf, ptid_t ptid, const target_waitstatus &status)
>   	    buf += strlen (buf);
>   	    current_process ()->dlls_changed = false;
>   	  }
> +
> +	if (current_thread->tdesc != nullptr
> +	    && current_thread->tdesc != current_process ()->tdesc)
> +	  {
> +	    sprintf (buf, "tdesc:");
> +	    buf += strlen (buf);
> +	    sprintf (buf, "%x", get_tdesc_rsp_id (ptid));
> +	    strcat (buf, ";");
> +	    buf += strlen (buf);
> +	  }
>         }
>         break;
>       case TARGET_WAITKIND_EXITED:
> diff --git a/gdbserver/remote-utils.h b/gdbserver/remote-utils.h
> index cb2d6c346c99..61ef80b4dad7 100644
> --- a/gdbserver/remote-utils.h
> +++ b/gdbserver/remote-utils.h
> @@ -75,4 +75,13 @@ int relocate_instruction (CORE_ADDR *to, CORE_ADDR oldloc);
>   
>   void monitor_output (const char *msg);
>   
> +/* Return the ID as used in the remote protocol for the given target
> +   descriptor.  */
> +
> +unsigned int get_tdesc_rsp_id (const target_desc *tdesc);
> +
> +/* Return the target description corresponding to the remote protocol ID.  */
> +
> +const target_desc *get_tdesc_from_rsp_id (unsigned int id);
> +
>   #endif /* GDBSERVER_REMOTE_UTILS_H */
> diff --git a/gdbserver/server.cc b/gdbserver/server.cc
> index 21fb51a45d16..2d1062f98468 100644
> --- a/gdbserver/server.cc
> +++ b/gdbserver/server.cc
> @@ -976,7 +976,15 @@ handle_general_set (char *own_buf)
>   static const char *
>   get_features_xml (const char *annex)
>   {
> -  const struct target_desc *desc = current_target_desc ();
> +  const struct target_desc *desc;
> +  unsigned int id;
> +
> +  if (strcmp (annex, "target.xml") == 0)
> +    desc = current_target_desc ();
> +  else if (sscanf (annex, "target-id-%u.xml", &id) == 1)
> +    desc = get_tdesc_from_rsp_id (id);
> +  else
> +    desc = nullptr;
>   
>     /* `desc->xmltarget' defines what to return when looking for the
>        "target.xml" file.  Its contents can either be verbatim XML code
> @@ -986,7 +994,7 @@ get_features_xml (const char *annex)
>        This variable is set up from the auto-generated
>        init_registers_... routine for the current target.  */
>   
> -  if (strcmp (annex, "target.xml") == 0)
> +  if (desc != nullptr)
>       {
>         const char *ret = tdesc_get_features_xml (desc);
>   
> @@ -1664,6 +1672,11 @@ handle_qxfer_threads_worker (thread_info *thread, struct buffer *buffer)
>     if (name != NULL)
>       buffer_xml_printf (buffer, " name=\"%s\"", name);
>   
> +  if (thread->tdesc != nullptr
> +      && thread->tdesc != get_thread_process (thread)->tdesc)
> +    buffer_xml_printf (buffer, " tdesc=\"%u\"",
> +		       get_tdesc_rsp_id (thread->tdesc));
> +
>     if (handle_status)
>       {
>         char *handle_s = (char *) alloca (handle_len * 2 + 1);
> diff --git a/gdbserver/server.h b/gdbserver/server.h
> index 7997d1a32e6e..58be5027795b 100644
> --- a/gdbserver/server.h
> +++ b/gdbserver/server.h
> @@ -193,6 +193,10 @@ struct client_state
>     /* If true, memory tagging features are supported.  */
>     bool memory_tagging_feature = false;
>   
> +  /* The target descriptions that have been communicated to the client.  The
> +     index of a target description in this vector is the ID used to reference it
> +     in the remote protocol.  */
> +  std::vector<const target_desc *> tdescs;
>   };
>   
>   client_state &get_client_state ();

Otherwise LGTM.

Reviewed-by: Luis Machado <luis.machado@arm.com>
  
Andrew Burgess Feb. 1, 2023, 12:07 p.m. UTC | #4
Thiago Jung Bauermann via Gdb-patches <gdb-patches@sourceware.org>
writes:

> Now that an inferior thread can have a different target description than
> its process, there needs to be a way to communicate this target
> description to GDB.  So add the concept of a target description ID to the
> remote protocol, which is used to reference them and allows them to be
> transferred only once over the wire.
>
> The ID is an unsigned integer, and is sent in the 'T AA n1:r1;n2:r2;...'
> stop reply packet as a new 'n:r' pair, where n is "tdesc" and "r" is an
> unsigned integer containing the ID.
>
> It is also sent in the threads list XML in the response of a
> qXfer:threads:read request.  The ID is sent as a new "tdesc" attribute of
> the <thread> node.
>
> To request the target description XML of a given ID, GDB sends the
> qXfer:features:read request with "target-id-%u.xml" as the annex, where %u
> is the target description ID.

Luis already commented that in some locations the ID is hex, while in
others it is not explicitly stated if the value is hex or decimal.  I'd
like to second that feedback, and suggest that we pick one, and use that
consistently throughout.

My thinking is that it will be easier to understand remote packet traces
if target descriptions are requested using an ID in the safe format as
was sent to GDB, and if IDs in different packets match up.

Thus, I would suggest we switch to using 'target-id-%x.xml' here, and
send the ID as hex in the threads reply packet.

Additionally, I think it would be worth adding a new feature to the
qSupported packet, maybe 'per-thread-tdesc'.  With this added, GDB would
be able to tell gdbserver that it supports this feature, and gdbserver
will be able to confirm that the feature is supported.

I'm not 100% sure what we'd want to do if it turns out GDB doesn't
support the feature?  Is it better to push on with GDB using the wrong
target description?  Or would it be better if gdbserver exits with an
error suggesting the GDB needs updating?  In some ways, _what_ we do
doesn't really matter to me, but I think having the feature will allow
us to pick a suitable error handling solution later if needed.

I'd be happy if adding the feature was done as a separate patch in this
series, but I do think it should be part of this series.

>
> Suggested-By: Simon Marchi <simon.marchi@efficios.com>
> ---
>  gdb/doc/gdb.texinfo       | 27 ++++++++++++++++---
>  gdbserver/remote-utils.cc | 57 +++++++++++++++++++++++++++++++++++++++
>  gdbserver/remote-utils.h  |  9 +++++++
>  gdbserver/server.cc       | 17 ++++++++++--
>  gdbserver/server.h        |  4 +++
>  5 files changed, 109 insertions(+), 5 deletions(-)
>
> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
> index 9c0018ea5c14..fbf7e59853b5 100644
> --- a/gdb/doc/gdb.texinfo
> +++ b/gdb/doc/gdb.texinfo
> @@ -42030,6 +42030,10 @@ the stopped thread, as specified in @ref{thread-id syntax}.
>  If @var{n} is @samp{core}, then @var{r} is the hexadecimal number of
>  the core on which the stop event was detected.
>  
> +@item
> +If @var{n} is @samp{tdesc}, then @var{r} is the hexadecimal number of
> +the target description ID (@pxref{Target Description ID}).
> +
>  @item
>  If @var{n} is a recognized @dfn{stop reason}, it describes a more
>  specific event that stopped the target.  The currently defined stop
> @@ -46546,7 +46550,7 @@ the following structure:
>  @smallexample
>  <?xml version="1.0"?>
>  <threads>
> -    <thread id="id" core="0" name="name">
> +    <thread id="id" core="0" name="name" tdesc="0">
>      ... description ...
>      </thread>
>  </threads>
> @@ -46556,8 +46560,10 @@ Each @samp{thread} element must have the @samp{id} attribute that
>  identifies the thread (@pxref{thread-id syntax}).  The
>  @samp{core} attribute, if present, specifies which processor core
>  the thread was last executing on.  The @samp{name} attribute, if
> -present, specifies the human-readable name of the thread.  The content
> -of the of @samp{thread} element is interpreted as human-readable
> +present, specifies the human-readable name of the thread.  The
> +@samp{tdesc} attribute, if present, specifies the target description
> +ID of the thread (@pxref{Target Description ID}).  The content of
> +the @samp{thread} element is interpreted as human-readable
>  auxiliary information.  The @samp{handle} attribute, if present,
>  is a hex encoded representation of the thread handle.
>  
> @@ -46767,6 +46773,8 @@ descriptions are accurate, and that @value{GDBN} understands them.
>  target descriptions.  @xref{Expat}.
>  
>  @menu
> +* Target Description ID::           Referencing different descriptions in the
> +                                    remote protocol.
>  * Retrieving Descriptions::         How descriptions are fetched from a target.
>  * Target Description Format::       The contents of a target description.
>  * Predefined Target Types::         Standard types available for target
> @@ -46775,6 +46783,14 @@ target descriptions.  @xref{Expat}.
>  * Standard Target Features::        Features @value{GDBN} knows about.
>  @end menu
>  
> +@node Target Description ID
> +@section Target Description ID
> +
> +In cases where a remote target supports threads having different
> +target descriptions than their parent process, the remote protocol
> +assigns a non-negative integer to each target description to reference
> +it in the communication between the host and the target.

I think this should be reworded slightly.  I would prefer that it be
made clear that it is the remote target that is responsible for
assigning the ID numbers for target descriptions.

It would also be nice if there were some cross references from this
section to the other places in the manual where we discuss
sending/receiving the ID number.

Finally, I wonder if it might make sense to add something like:

  @cindex Target Description IDs

to every place where we discuss these ID's, then there will be an index
entry that links all the places together?

> +
>  @node Retrieving Descriptions
>  @section Retrieving Descriptions
>  
> @@ -46787,6 +46803,11 @@ qXfer}).  The @var{annex} in the @samp{qXfer} packet will be
>  XML document, of the form described in @ref{Target Description
>  Format}.
>  
> +If target description IDs are being used (@pxref{Target Description ID}),
> +@value{GDBN} can retrieve a target description with a given ID by using
> +@samp{target-id-ID.xml} as the @var{annex}, where @var{ID} is the
> +non-negative integer identifier of the desired target description.
> +
>  Alternatively, you can specify a file to read for the target description.
>  If a file is set, the target will not be queried.  The commands to
>  specify a file are:
> diff --git a/gdbserver/remote-utils.cc b/gdbserver/remote-utils.cc
> index 80310bc2c709..baff899307cc 100644
> --- a/gdbserver/remote-utils.cc
> +++ b/gdbserver/remote-utils.cc
> @@ -1049,6 +1049,53 @@ outreg (struct regcache *regcache, int regno, char *buf)
>    return buf;
>  }
>  
> +/* See remote-utils.h.  */
> +
> +unsigned int
> +get_tdesc_rsp_id (const target_desc *tdesc)
> +{
> +  client_state &cs = get_client_state ();
> +  unsigned int i;
> +
> +  for (i = 0; i < cs.tdescs.size (); i++)
> +    if (cs.tdescs[i] == tdesc)
> +      return i;
> +
> +  cs.tdescs.push_back (tdesc);
> +
> +  return i;
> +}
> +
> +/* See remote-utils.h.  */
> +
> +const target_desc *
> +get_tdesc_from_rsp_id (unsigned int id)
> +{
> +  client_state &cs = get_client_state ();
> +
> +  if (id >= cs.tdescs.size ())
> +    return nullptr;
> +
> +  return cs.tdescs[id];
> +}
> +
> +/* Return the ID as used in the remote protocol for the target descriptor of the
> +   given PTID.  */
> +
> +static unsigned int
> +get_tdesc_rsp_id (ptid_t ptid)
> +{
> +  const thread_info *thread = find_thread_ptid (ptid);
> +  const target_desc *tdesc;
> +
> +  if (thread == nullptr)
> +    tdesc = find_process_pid (ptid.pid ())->tdesc;
> +  else
> +    tdesc = get_thread_target_desc (thread);
> +
> +  return get_tdesc_rsp_id (tdesc);
> +}
> +
>  void
>  prepare_resume_reply (char *buf, ptid_t ptid, const target_waitstatus &status)
>  {
> @@ -1241,6 +1288,16 @@ prepare_resume_reply (char *buf, ptid_t ptid, const target_waitstatus &status)
>  	    buf += strlen (buf);
>  	    current_process ()->dlls_changed = false;
>  	  }
> +
> +	if (current_thread->tdesc != nullptr
> +	    && current_thread->tdesc != current_process ()->tdesc)
> +	  {
> +	    sprintf (buf, "tdesc:");
> +	    buf += strlen (buf);
> +	    sprintf (buf, "%x", get_tdesc_rsp_id (ptid));
> +	    strcat (buf, ";");
> +	    buf += strlen (buf);
> +	  }
>        }
>        break;
>      case TARGET_WAITKIND_EXITED:
> diff --git a/gdbserver/remote-utils.h b/gdbserver/remote-utils.h
> index cb2d6c346c99..61ef80b4dad7 100644
> --- a/gdbserver/remote-utils.h
> +++ b/gdbserver/remote-utils.h
> @@ -75,4 +75,13 @@ int relocate_instruction (CORE_ADDR *to, CORE_ADDR oldloc);
>  
>  void monitor_output (const char *msg);
>  
> +/* Return the ID as used in the remote protocol for the given target
> +   descriptor.  */
> +
> +unsigned int get_tdesc_rsp_id (const target_desc *tdesc);
> +
> +/* Return the target description corresponding to the remote protocol ID.  */

This comment should explain what happens if the ID is invalid
(i.e. returns nullptr).

> +
> +const target_desc *get_tdesc_from_rsp_id (unsigned int id);
> +
>  #endif /* GDBSERVER_REMOTE_UTILS_H */
> diff --git a/gdbserver/server.cc b/gdbserver/server.cc
> index 21fb51a45d16..2d1062f98468 100644
> --- a/gdbserver/server.cc
> +++ b/gdbserver/server.cc
> @@ -976,7 +976,15 @@ handle_general_set (char *own_buf)
>  static const char *
>  get_features_xml (const char *annex)
>  {
> -  const struct target_desc *desc = current_target_desc ();
> +  const struct target_desc *desc;
> +  unsigned int id;
> +
> +  if (strcmp (annex, "target.xml") == 0)
> +    desc = current_target_desc ();
> +  else if (sscanf (annex, "target-id-%u.xml", &id) == 1)
> +    desc = get_tdesc_from_rsp_id (id);
> +  else
> +    desc = nullptr;
>  
>    /* `desc->xmltarget' defines what to return when looking for the
>       "target.xml" file.  Its contents can either be verbatim XML code

This comment could be updated to replace '"target.xml"' with 'target
description' now that we handle multiple annex names.

There's a second reference toe "target.xml" at the end of the comment,
which can also be removed with a little rewording.

Thanks,
Andrew

> @@ -986,7 +994,7 @@ get_features_xml (const char *annex)
>       This variable is set up from the auto-generated
>       init_registers_... routine for the current target.  */
>  
> -  if (strcmp (annex, "target.xml") == 0)
> +  if (desc != nullptr)
>      {
>        const char *ret = tdesc_get_features_xml (desc);
>  
> @@ -1664,6 +1672,11 @@ handle_qxfer_threads_worker (thread_info *thread, struct buffer *buffer)
>    if (name != NULL)
>      buffer_xml_printf (buffer, " name=\"%s\"", name);
>  
> +  if (thread->tdesc != nullptr
> +      && thread->tdesc != get_thread_process (thread)->tdesc)
> +    buffer_xml_printf (buffer, " tdesc=\"%u\"",
> +		       get_tdesc_rsp_id (thread->tdesc));
> +
>    if (handle_status)
>      {
>        char *handle_s = (char *) alloca (handle_len * 2 + 1);
> diff --git a/gdbserver/server.h b/gdbserver/server.h
> index 7997d1a32e6e..58be5027795b 100644
> --- a/gdbserver/server.h
> +++ b/gdbserver/server.h
> @@ -193,6 +193,10 @@ struct client_state
>    /* If true, memory tagging features are supported.  */
>    bool memory_tagging_feature = false;
>  
> +  /* The target descriptions that have been communicated to the client.  The
> +     index of a target description in this vector is the ID used to reference it
> +     in the remote protocol.  */
> +  std::vector<const target_desc *> tdescs;
>  };
>  
>  client_state &get_client_state ();
  
Eli Zaretskii Feb. 1, 2023, 1:03 p.m. UTC | #5
> Cc: Thiago Jung Bauermann <thiago.bauermann@linaro.org>, Simon Marchi
>  <simon.marchi@efficios.com>
> Date: Wed, 01 Feb 2023 12:07:20 +0000
> From: Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org>
> 
> Finally, I wonder if it might make sense to add something like:
> 
>   @cindex Target Description IDs
> 
> to every place where we discuss these ID's, then there will be an index
> entry that links all the places together?

Adding index entries is always welcome, as those help finding subjects
quickly and efficiently.  However:

  . we prefer index entries to use only lower-case letters, to avoid
    having their sorting order (which eventually affects whet you see
    in the Info reader when you use completion) depend on the locale
    where the manual is produced from Texinfo;
  . more importantly, it is not a good idea to have multiple
    identical index entries, since makeinfo then disambiguates them as
    foo<1>, foo<2>, etc., and when you see those in the list of
    completions, you have no idea which one is the one you want.  So
    it is preferable to qualify each such index entry in a way that
    will convey enough information to the reader to know which one
    he/she wants.  For example:

     @cindex target description id, and remote packets
     @cindex target description id, in python programs

    etc., you get the idea.
  
Andrew Burgess Feb. 1, 2023, 2:51 p.m. UTC | #6
Thiago Jung Bauermann via Gdb-patches <gdb-patches@sourceware.org>
writes:

> Now that an inferior thread can have a different target description than
> its process, there needs to be a way to communicate this target
> description to GDB.  So add the concept of a target description ID to the
> remote protocol, which is used to reference them and allows them to be
> transferred only once over the wire.
>
> The ID is an unsigned integer, and is sent in the 'T AA n1:r1;n2:r2;...'
> stop reply packet as a new 'n:r' pair, where n is "tdesc" and "r" is an
> unsigned integer containing the ID.
>
> It is also sent in the threads list XML in the response of a
> qXfer:threads:read request.  The ID is sent as a new "tdesc" attribute of
> the <thread> node.
>
> To request the target description XML of a given ID, GDB sends the
> qXfer:features:read request with "target-id-%u.xml" as the annex, where %u
> is the target description ID.
>
> Suggested-By: Simon Marchi <simon.marchi@efficios.com>
> ---
>  gdb/doc/gdb.texinfo       | 27 ++++++++++++++++---
>  gdbserver/remote-utils.cc | 57 +++++++++++++++++++++++++++++++++++++++
>  gdbserver/remote-utils.h  |  9 +++++++
>  gdbserver/server.cc       | 17 ++++++++++--
>  gdbserver/server.h        |  4 +++
>  5 files changed, 109 insertions(+), 5 deletions(-)
>
> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
> index 9c0018ea5c14..fbf7e59853b5 100644
> --- a/gdb/doc/gdb.texinfo
> +++ b/gdb/doc/gdb.texinfo
> @@ -42030,6 +42030,10 @@ the stopped thread, as specified in @ref{thread-id syntax}.
>  If @var{n} is @samp{core}, then @var{r} is the hexadecimal number of
>  the core on which the stop event was detected.
>  
> +@item
> +If @var{n} is @samp{tdesc}, then @var{r} is the hexadecimal number of
> +the target description ID (@pxref{Target Description ID}).
> +
>  @item
>  If @var{n} is a recognized @dfn{stop reason}, it describes a more
>  specific event that stopped the target.  The currently defined stop
> @@ -46546,7 +46550,7 @@ the following structure:
>  @smallexample
>  <?xml version="1.0"?>
>  <threads>
> -    <thread id="id" core="0" name="name">
> +    <thread id="id" core="0" name="name" tdesc="0">
>      ... description ...
>      </thread>
>  </threads>
> @@ -46556,8 +46560,10 @@ Each @samp{thread} element must have the @samp{id} attribute that
>  identifies the thread (@pxref{thread-id syntax}).  The
>  @samp{core} attribute, if present, specifies which processor core
>  the thread was last executing on.  The @samp{name} attribute, if
> -present, specifies the human-readable name of the thread.  The content
> -of the of @samp{thread} element is interpreted as human-readable
> +present, specifies the human-readable name of the thread.  The
> +@samp{tdesc} attribute, if present, specifies the target description
> +ID of the thread (@pxref{Target Description ID}).  The content of
> +the @samp{thread} element is interpreted as human-readable
>  auxiliary information.  The @samp{handle} attribute, if present,
>  is a hex encoded representation of the thread handle.
>  
> @@ -46767,6 +46773,8 @@ descriptions are accurate, and that @value{GDBN} understands them.
>  target descriptions.  @xref{Expat}.
>  
>  @menu
> +* Target Description ID::           Referencing different descriptions in the
> +                                    remote protocol.
>  * Retrieving Descriptions::         How descriptions are fetched from a target.
>  * Target Description Format::       The contents of a target description.
>  * Predefined Target Types::         Standard types available for target
> @@ -46775,6 +46783,14 @@ target descriptions.  @xref{Expat}.
>  * Standard Target Features::        Features @value{GDBN} knows about.
>  @end menu
>  
> +@node Target Description ID
> +@section Target Description ID
> +
> +In cases where a remote target supports threads having different
> +target descriptions than their parent process, the remote protocol
> +assigns a non-negative integer to each target description to reference
> +it in the communication between the host and the target.

Having read some of the later patches, I have some additional thoughts
here:

I think we should make it explicit here that IDs are connection wide,
not per-process.  We should also make it clear that GDB might[1] cache
target descriptions per remote connection, and so a remote target should
not reuse a target description ID except where the target description is
identical.

[1] I say "GDB might" here because if we say "GDB will" then this would
imply each target description will only be asked for once.  And I
figure, why be overly restrictive.


Thanks,
Andrew


> +
>  @node Retrieving Descriptions
>  @section Retrieving Descriptions
>  
> @@ -46787,6 +46803,11 @@ qXfer}).  The @var{annex} in the @samp{qXfer} packet will be
>  XML document, of the form described in @ref{Target Description
>  Format}.
>  
> +If target description IDs are being used (@pxref{Target Description ID}),
> +@value{GDBN} can retrieve a target description with a given ID by using
> +@samp{target-id-ID.xml} as the @var{annex}, where @var{ID} is the
> +non-negative integer identifier of the desired target description.
> +
>  Alternatively, you can specify a file to read for the target description.
>  If a file is set, the target will not be queried.  The commands to
>  specify a file are:
> diff --git a/gdbserver/remote-utils.cc b/gdbserver/remote-utils.cc
> index 80310bc2c709..baff899307cc 100644
> --- a/gdbserver/remote-utils.cc
> +++ b/gdbserver/remote-utils.cc
> @@ -1049,6 +1049,53 @@ outreg (struct regcache *regcache, int regno, char *buf)
>    return buf;
>  }
>  
> +/* See remote-utils.h.  */
> +
> +unsigned int
> +get_tdesc_rsp_id (const target_desc *tdesc)
> +{
> +  client_state &cs = get_client_state ();
> +  unsigned int i;
> +
> +  for (i = 0; i < cs.tdescs.size (); i++)
> +    if (cs.tdescs[i] == tdesc)
> +      return i;
> +
> +  cs.tdescs.push_back (tdesc);
> +
> +  return i;
> +}
> +
> +/* See remote-utils.h.  */
> +
> +const target_desc *
> +get_tdesc_from_rsp_id (unsigned int id)
> +{
> +  client_state &cs = get_client_state ();
> +
> +  if (id >= cs.tdescs.size ())
> +    return nullptr;
> +
> +  return cs.tdescs[id];
> +}
> +
> +/* Return the ID as used in the remote protocol for the target descriptor of the
> +   given PTID.  */
> +
> +static unsigned int
> +get_tdesc_rsp_id (ptid_t ptid)
> +{
> +  const thread_info *thread = find_thread_ptid (ptid);
> +  const target_desc *tdesc;
> +
> +  if (thread == nullptr)
> +    tdesc = find_process_pid (ptid.pid ())->tdesc;
> +  else
> +    tdesc = get_thread_target_desc (thread);
> +
> +  return get_tdesc_rsp_id (tdesc);
> +}
> +
>  void
>  prepare_resume_reply (char *buf, ptid_t ptid, const target_waitstatus &status)
>  {
> @@ -1241,6 +1288,16 @@ prepare_resume_reply (char *buf, ptid_t ptid, const target_waitstatus &status)
>  	    buf += strlen (buf);
>  	    current_process ()->dlls_changed = false;
>  	  }
> +
> +	if (current_thread->tdesc != nullptr
> +	    && current_thread->tdesc != current_process ()->tdesc)
> +	  {
> +	    sprintf (buf, "tdesc:");
> +	    buf += strlen (buf);
> +	    sprintf (buf, "%x", get_tdesc_rsp_id (ptid));
> +	    strcat (buf, ";");
> +	    buf += strlen (buf);
> +	  }
>        }
>        break;
>      case TARGET_WAITKIND_EXITED:
> diff --git a/gdbserver/remote-utils.h b/gdbserver/remote-utils.h
> index cb2d6c346c99..61ef80b4dad7 100644
> --- a/gdbserver/remote-utils.h
> +++ b/gdbserver/remote-utils.h
> @@ -75,4 +75,13 @@ int relocate_instruction (CORE_ADDR *to, CORE_ADDR oldloc);
>  
>  void monitor_output (const char *msg);
>  
> +/* Return the ID as used in the remote protocol for the given target
> +   descriptor.  */
> +
> +unsigned int get_tdesc_rsp_id (const target_desc *tdesc);
> +
> +/* Return the target description corresponding to the remote protocol ID.  */
> +
> +const target_desc *get_tdesc_from_rsp_id (unsigned int id);
> +
>  #endif /* GDBSERVER_REMOTE_UTILS_H */
> diff --git a/gdbserver/server.cc b/gdbserver/server.cc
> index 21fb51a45d16..2d1062f98468 100644
> --- a/gdbserver/server.cc
> +++ b/gdbserver/server.cc
> @@ -976,7 +976,15 @@ handle_general_set (char *own_buf)
>  static const char *
>  get_features_xml (const char *annex)
>  {
> -  const struct target_desc *desc = current_target_desc ();
> +  const struct target_desc *desc;
> +  unsigned int id;
> +
> +  if (strcmp (annex, "target.xml") == 0)
> +    desc = current_target_desc ();
> +  else if (sscanf (annex, "target-id-%u.xml", &id) == 1)
> +    desc = get_tdesc_from_rsp_id (id);
> +  else
> +    desc = nullptr;
>  
>    /* `desc->xmltarget' defines what to return when looking for the
>       "target.xml" file.  Its contents can either be verbatim XML code
> @@ -986,7 +994,7 @@ get_features_xml (const char *annex)
>       This variable is set up from the auto-generated
>       init_registers_... routine for the current target.  */
>  
> -  if (strcmp (annex, "target.xml") == 0)
> +  if (desc != nullptr)
>      {
>        const char *ret = tdesc_get_features_xml (desc);
>  
> @@ -1664,6 +1672,11 @@ handle_qxfer_threads_worker (thread_info *thread, struct buffer *buffer)
>    if (name != NULL)
>      buffer_xml_printf (buffer, " name=\"%s\"", name);
>  
> +  if (thread->tdesc != nullptr
> +      && thread->tdesc != get_thread_process (thread)->tdesc)
> +    buffer_xml_printf (buffer, " tdesc=\"%u\"",
> +		       get_tdesc_rsp_id (thread->tdesc));
> +
>    if (handle_status)
>      {
>        char *handle_s = (char *) alloca (handle_len * 2 + 1);
> diff --git a/gdbserver/server.h b/gdbserver/server.h
> index 7997d1a32e6e..58be5027795b 100644
> --- a/gdbserver/server.h
> +++ b/gdbserver/server.h
> @@ -193,6 +193,10 @@ struct client_state
>    /* If true, memory tagging features are supported.  */
>    bool memory_tagging_feature = false;
>  
> +  /* The target descriptions that have been communicated to the client.  The
> +     index of a target description in this vector is the ID used to reference it
> +     in the remote protocol.  */
> +  std::vector<const target_desc *> tdescs;
>  };
>  
>  client_state &get_client_state ();
  
Simon Marchi Feb. 1, 2023, 5:03 p.m. UTC | #7
> Having read some of the later patches, I have some additional thoughts
> here:
> 
> I think we should make it explicit here that IDs are connection wide,
> not per-process.  We should also make it clear that GDB might[1] cache
> target descriptions per remote connection, and so a remote target should
> not reuse a target description ID except where the target description is
> identical.
> 
> [1] I say "GDB might" here because if we say "GDB will" then this would
> imply each target description will only be asked for once.  And I
> figure, why be overly restrictive.

Thanks for pointing this out, I had the same thought while reading the
patch.

In my original idea, I imagined that target description IDs could be
some hashes computed from the XML content (a bit like git hashes or ELF
build IDs), such that a given target description would always have the
same ID.  This would give clients the possibility to cache target
descriptions locally, a bit like the index cache.  It did sound nice,
but perhaps it's not really important.

Simon
  
Simon Marchi Feb. 1, 2023, 5:37 p.m. UTC | #8
On 2/1/23 07:07, Andrew Burgess via Gdb-patches wrote:
> Thiago Jung Bauermann via Gdb-patches <gdb-patches@sourceware.org>
> writes:
> 
>> Now that an inferior thread can have a different target description than
>> its process, there needs to be a way to communicate this target
>> description to GDB.  So add the concept of a target description ID to the
>> remote protocol, which is used to reference them and allows them to be
>> transferred only once over the wire.
>>
>> The ID is an unsigned integer, and is sent in the 'T AA n1:r1;n2:r2;...'
>> stop reply packet as a new 'n:r' pair, where n is "tdesc" and "r" is an
>> unsigned integer containing the ID.
>>
>> It is also sent in the threads list XML in the response of a
>> qXfer:threads:read request.  The ID is sent as a new "tdesc" attribute of
>> the <thread> node.
>>
>> To request the target description XML of a given ID, GDB sends the
>> qXfer:features:read request with "target-id-%u.xml" as the annex, where %u
>> is the target description ID.
> 
> Luis already commented that in some locations the ID is hex, while in
> others it is not explicitly stated if the value is hex or decimal.  I'd
> like to second that feedback, and suggest that we pick one, and use that
> consistently throughout.
> 
> My thinking is that it will be easier to understand remote packet traces
> if target descriptions are requested using an ID in the safe format as
> was sent to GDB, and if IDs in different packets match up.
> 
> Thus, I would suggest we switch to using 'target-id-%x.xml' here, and
> send the ID as hex in the threads reply packet.

Not that I disagree with you, but I just wanted to note that this
decimal / hexadecimal mismatch already exists for the core field.  It's
hex in stop replies, decimal in XML.  The thread id, however, is always
hex (the special thread-id syntax).

Regardless how this new field is formatted, it would be nice to be
explicit about the core field too.

Simon
  
Simon Marchi Feb. 1, 2023, 8:46 p.m. UTC | #9
On 2/1/23 07:07, Andrew Burgess via Gdb-patches wrote:
> Thiago Jung Bauermann via Gdb-patches <gdb-patches@sourceware.org>
> writes:
> 
>> Now that an inferior thread can have a different target description than
>> its process, there needs to be a way to communicate this target
>> description to GDB.  So add the concept of a target description ID to the
>> remote protocol, which is used to reference them and allows them to be
>> transferred only once over the wire.
>>
>> The ID is an unsigned integer, and is sent in the 'T AA n1:r1;n2:r2;...'
>> stop reply packet as a new 'n:r' pair, where n is "tdesc" and "r" is an
>> unsigned integer containing the ID.
>>
>> It is also sent in the threads list XML in the response of a
>> qXfer:threads:read request.  The ID is sent as a new "tdesc" attribute of
>> the <thread> node.
>>
>> To request the target description XML of a given ID, GDB sends the
>> qXfer:features:read request with "target-id-%u.xml" as the annex, where %u
>> is the target description ID.
> 
> Luis already commented that in some locations the ID is hex, while in
> others it is not explicitly stated if the value is hex or decimal.  I'd
> like to second that feedback, and suggest that we pick one, and use that
> consistently throughout.
> 
> My thinking is that it will be easier to understand remote packet traces
> if target descriptions are requested using an ID in the safe format as
> was sent to GDB, and if IDs in different packets match up.
> 
> Thus, I would suggest we switch to using 'target-id-%x.xml' here, and
> send the ID as hex in the threads reply packet.
> 
> Additionally, I think it would be worth adding a new feature to the
> qSupported packet, maybe 'per-thread-tdesc'.  With this added, GDB would
> be able to tell gdbserver that it supports this feature, and gdbserver
> will be able to confirm that the feature is supported.
> 
> I'm not 100% sure what we'd want to do if it turns out GDB doesn't
> support the feature?  Is it better to push on with GDB using the wrong
> target description?  Or would it be better if gdbserver exits with an
> error suggesting the GDB needs updating?  In some ways, _what_ we do
> doesn't really matter to me, but I think having the feature will allow
> us to pick a suitable error handling solution later if needed.
> 
> I'd be happy if adding the feature was done as a separate patch in this
> series, but I do think it should be part of this series.

I forgot to comment on that part.  Just to think out loud:

- If GDB supports the new feature but GDBserver doesn't, I guess that
  debugging works exactly like today?  What happens today if a remote
  thread changes its SVE vector size, the registers that GDB shows will
  simply not reflect the reality, they might have bogus values, but
  debugging will otherwise work?

- If GDB does not support the new feature but GDBserver does, I guess it
  is problematic, because GDBserver will assume that GDB will have
  adjusted its expectations about the length (and layout) of the g
  packet response, since it told GDB about the thread-specific tdesc.
  But since GDB doesn't know about thread-specific tdescs, it will try
  to interpret the g packet response with the register layout of the
  process-wide tdesc, possibly leading to the infamous "Remote 'g'
  packet reply is too long", if the received response is longer than
  expected.

So, I agree that we need some feature flag for this.

Simon
  
Thiago Jung Bauermann Feb. 2, 2023, 7:52 p.m. UTC | #10
Simon Marchi <simark@simark.ca> writes:

>> Having read some of the later patches, I have some additional thoughts
>> here:
>> 
>> I think we should make it explicit here that IDs are connection wide,
>> not per-process.  We should also make it clear that GDB might[1] cache
>> target descriptions per remote connection, and so a remote target should
>> not reuse a target description ID except where the target description is
>> identical.

Ah, good point. I will make that clarification.

>> [1] I say "GDB might" here because if we say "GDB will" then this would
>> imply each target description will only be asked for once.  And I
>> figure, why be overly restrictive.
>
> Thanks for pointing this out, I had the same thought while reading the
> patch.
>
> In my original idea, I imagined that target description IDs could be
> some hashes computed from the XML content (a bit like git hashes or ELF
> build IDs), such that a given target description would always have the
> same ID.  This would give clients the possibility to cache target
> descriptions locally, a bit like the index cache.  It did sound nice,
> but perhaps it's not really important.

Ah, sorry I misunderstood this part of your suggestion. I thought that
the caching was supposed to be limited to the duration of the connection,
and thus the m_tdescs map in struct remote state would be enough to
provide that functionality. Do you mean that the cache should be on
disk, so that it survives GDB quitting? I can look into that if you want
and implement it, if not complicated.

In this case, I think the tdesc ID should be of the format
<prefix>:<hex number>, so that <prefix> can be “sha1”, “sha256” etc to
indicate that <hex number> is a hash with the given algo, or even
“number” to indicate that it's a simple integer like it is today.

Perhaps we can do the prefix thing even if not implementing the cache,
to leave the possibility of adding it in the future?
  
Thiago Jung Bauermann Feb. 2, 2023, 8:36 p.m. UTC | #11
Simon Marchi <simark@simark.ca> writes:

> On 2/1/23 07:07, Andrew Burgess via Gdb-patches wrote:
>> Thiago Jung Bauermann via Gdb-patches <gdb-patches@sourceware.org>
>> writes:
>> 
>>> Now that an inferior thread can have a different target description than
>>> its process, there needs to be a way to communicate this target
>>> description to GDB.  So add the concept of a target description ID to the
>>> remote protocol, which is used to reference them and allows them to be
>>> transferred only once over the wire.
>>>
>>> The ID is an unsigned integer, and is sent in the 'T AA n1:r1;n2:r2;...'
>>> stop reply packet as a new 'n:r' pair, where n is "tdesc" and "r" is an
>>> unsigned integer containing the ID.
>>>
>>> It is also sent in the threads list XML in the response of a
>>> qXfer:threads:read request.  The ID is sent as a new "tdesc" attribute of
>>> the <thread> node.
>>>
>>> To request the target description XML of a given ID, GDB sends the
>>> qXfer:features:read request with "target-id-%u.xml" as the annex, where %u
>>> is the target description ID.
>> 
>> Luis already commented that in some locations the ID is hex, while in
>> others it is not explicitly stated if the value is hex or decimal.  I'd
>> like to second that feedback, and suggest that we pick one, and use that
>> consistently throughout.
>> 
>> My thinking is that it will be easier to understand remote packet traces
>> if target descriptions are requested using an ID in the safe format as
>> was sent to GDB, and if IDs in different packets match up.
>> 
>> Thus, I would suggest we switch to using 'target-id-%x.xml' here, and
>> send the ID as hex in the threads reply packet.
>
> Not that I disagree with you, but I just wanted to note that this
> decimal / hexadecimal mismatch already exists for the core field.  It's
> hex in stop replies, decimal in XML.  The thread id, however, is always
> hex (the special thread-id syntax).
>
> Regardless how this new field is formatted, it would be nice to be
> explicit about the core field too.

I agree it makes more sense to use hex everywhere.

In the case of XML attribute fields, enforcing hex would mean adding a
base parameter to xml_parse_unsigned_integer (which currently accepts
decimal, octal and hex). I'll do that.

In the case of the core field, to avoid breaking existing stubs it's
probably better to document that it can accept any of the three bases.
  
Simon Marchi Feb. 2, 2023, 8:51 p.m. UTC | #12
On 2/2/23 14:52, Thiago Jung Bauermann wrote:
> 
> Simon Marchi <simark@simark.ca> writes:
> 
>>> Having read some of the later patches, I have some additional thoughts
>>> here:
>>>
>>> I think we should make it explicit here that IDs are connection wide,
>>> not per-process.  We should also make it clear that GDB might[1] cache
>>> target descriptions per remote connection, and so a remote target should
>>> not reuse a target description ID except where the target description is
>>> identical.
> 
> Ah, good point. I will make that clarification.
> 
>>> [1] I say "GDB might" here because if we say "GDB will" then this would
>>> imply each target description will only be asked for once.  And I
>>> figure, why be overly restrictive.
>>
>> Thanks for pointing this out, I had the same thought while reading the
>> patch.
>>
>> In my original idea, I imagined that target description IDs could be
>> some hashes computed from the XML content (a bit like git hashes or ELF
>> build IDs), such that a given target description would always have the
>> same ID.  This would give clients the possibility to cache target
>> descriptions locally, a bit like the index cache.  It did sound nice,
>> but perhaps it's not really important.
> 
> Ah, sorry I misunderstood this part of your suggestion. I thought that
> the caching was supposed to be limited to the duration of the connection,
> and thus the m_tdescs map in struct remote state would be enough to
> provide that functionality. Do you mean that the cache should be on
> disk, so that it survives GDB quitting? I can look into that if you want
> and implement it, if not complicated.

To be clear, I'm not asking you to implement an on-disk cache, I'm just
trying to think about what the limitations of the proposed solution are.
Because mistakes or shortcomings introduced in the remote protocol (like
any API / ABI meant to be stable) are difficult to fix afterwards.  If
there is a chance we want to do an on-disk cache (or share a cache
between multiple concurrent connections), we should think about that
now.

On one hand, perhaps we consider that target descriptions are small
enough that there's no point in having a persistent cache like that.
The time to fetch each target description once per connection is
probably insignificant.

On the other hand, perhaps doing the hash approach is easy enough that
we might as well do it, and that just leaves more doors open.  In my
mind, as a start, we could just pass the XML of the tdesc through a
sha256 function, possibly doing something more sophisticated later (this
means the hash of a given tdesc could change over time, but still two
identical hashes mean identical tdescs).  However, I don't think there
are sha256 functions built in on the OSes we want to support, so we'd
have to depend on some external library.  And that complicates things
quite a bit...  We already have a CRC32 function in gdbserver, but I
don't know if that's good enough to be confident there won't be
collisions.

> In this case, I think the tdesc ID should be of the format
> <prefix>:<hex number>, so that <prefix> can be “sha1”, “sha256” etc to
> indicate that <hex number> is a hash with the given algo, or even
> “number” to indicate that it's a simple integer like it is today.
> 
> Perhaps we can do the prefix thing even if not implementing the cache,
> to leave the possibility of adding it in the future?

I think I would choose between hash and number, but not do both, I don't
see the need to have both.

Simon
  
Simon Marchi Feb. 2, 2023, 8:56 p.m. UTC | #13
> I agree it makes more sense to use hex everywhere.
> 
> In the case of XML attribute fields, enforcing hex would mean adding a
> base parameter to xml_parse_unsigned_integer (which currently accepts
> decimal, octal and hex). I'll do that.
> 
> In the case of the core field, to avoid breaking existing stubs it's
> probably better to document that it can accept any of the three bases.

The "core" field is parsed using gdb_xml_parse_attr_ulongest, which uses
strtoulst with a base of 0.  So I suppose that an hex value with the 0x
prefix would work.  So bases other than 10 can work, but they need to
have the appropriate prefix.

Simon
  
Thiago Jung Bauermann Feb. 2, 2023, 9:43 p.m. UTC | #14
Simon Marchi <simark@simark.ca> writes:

> On 2/1/23 07:07, Andrew Burgess via Gdb-patches wrote:
>> Additionally, I think it would be worth adding a new feature to the
>> qSupported packet, maybe 'per-thread-tdesc'.  With this added, GDB would
>> be able to tell gdbserver that it supports this feature, and gdbserver
>> will be able to confirm that the feature is supported.
>> 
>> I'm not 100% sure what we'd want to do if it turns out GDB doesn't
>> support the feature?  Is it better to push on with GDB using the wrong
>> target description?  Or would it be better if gdbserver exits with an
>> error suggesting the GDB needs updating?  In some ways, _what_ we do
>> doesn't really matter to me, but I think having the feature will allow
>> us to pick a suitable error handling solution later if needed.
>> 
>> I'd be happy if adding the feature was done as a separate patch in this
>> series, but I do think it should be part of this series.
>
> I forgot to comment on that part.  Just to think out loud:

I tested the two cases you mentioned below.

> - If GDB supports the new feature but GDBserver doesn't, I guess that
>   debugging works exactly like today?  What happens today if a remote
>   thread changes its SVE vector size, the registers that GDB shows will
>   simply not reflect the reality, they might have bogus values, but
>   debugging will otherwise work?

Yes, that's exactly what happens.

> - If GDB does not support the new feature but GDBserver does, I guess it
>   is problematic, because GDBserver will assume that GDB will have
>   adjusted its expectations about the length (and layout) of the g
>   packet response, since it told GDB about the thread-specific tdesc.
>   But since GDB doesn't know about thread-specific tdescs, it will try
>   to interpret the g packet response with the register layout of the
>   process-wide tdesc, possibly leading to the infamous "Remote 'g'
>   packet reply is too long", if the received response is longer than
>   expected.

This too.

> So, I agree that we need some feature flag for this.

Ok, I will add it for the next version of the patches.
  
Thiago Jung Bauermann Feb. 3, 2023, 2:44 a.m. UTC | #15
Simon Marchi <simark@simark.ca> writes:

> On 2/2/23 14:52, Thiago Jung Bauermann wrote:
>> 
>> Simon Marchi <simark@simark.ca> writes:
>> 
>>>> Having read some of the later patches, I have some additional thoughts
>>>> here:
>>>>
>>>> I think we should make it explicit here that IDs are connection wide,
>>>> not per-process.  We should also make it clear that GDB might[1] cache
>>>> target descriptions per remote connection, and so a remote target should
>>>> not reuse a target description ID except where the target description is
>>>> identical.
>> 
>> Ah, good point. I will make that clarification.
>> 
>>>> [1] I say "GDB might" here because if we say "GDB will" then this would
>>>> imply each target description will only be asked for once.  And I
>>>> figure, why be overly restrictive.
>>>
>>> Thanks for pointing this out, I had the same thought while reading the
>>> patch.
>>>
>>> In my original idea, I imagined that target description IDs could be
>>> some hashes computed from the XML content (a bit like git hashes or ELF
>>> build IDs), such that a given target description would always have the
>>> same ID.  This would give clients the possibility to cache target
>>> descriptions locally, a bit like the index cache.  It did sound nice,
>>> but perhaps it's not really important.
>> 
>> Ah, sorry I misunderstood this part of your suggestion. I thought that
>> the caching was supposed to be limited to the duration of the connection,
>> and thus the m_tdescs map in struct remote state would be enough to
>> provide that functionality. Do you mean that the cache should be on
>> disk, so that it survives GDB quitting? I can look into that if you want
>> and implement it, if not complicated.
>
> To be clear, I'm not asking you to implement an on-disk cache, I'm just
> trying to think about what the limitations of the proposed solution are.
> Because mistakes or shortcomings introduced in the remote protocol (like
> any API / ABI meant to be stable) are difficult to fix afterwards.  If
> there is a chance we want to do an on-disk cache (or share a cache
> between multiple concurrent connections), we should think about that
> now.

Indeed, that's a good idea.

> On one hand, perhaps we consider that target descriptions are small
> enough that there's no point in having a persistent cache like that.
> The time to fetch each target description once per connection is
> probably insignificant.

I don't have a good intuition about that. My uneducated guess is that
it's not worth it, but as you say I think it's a good idea to leave the
door open if we can.

> On the other hand, perhaps doing the hash approach is easy enough that
> we might as well do it, and that just leaves more doors open.  In my
> mind, as a start, we could just pass the XML of the tdesc through a
> sha256 function, possibly doing something more sophisticated later (this
> means the hash of a given tdesc could change over time, but still two
> identical hashes mean identical tdescs).  However, I don't think there
> are sha256 functions built in on the OSes we want to support, so we'd
> have to depend on some external library.  And that complicates things
> quite a bit...  We already have a CRC32 function in gdbserver, but I
> don't know if that's good enough to be confident there won't be
> collisions.

I tried to research this a bit, but I'm not confident enough in my
knowledge of the subject to reach a conclusion. I only know enough about
hashes to be dangerous. I did find a hash testsuite called SMhasher
which seems to be well regarded, and its website¹ has
reports for CRC32², which shows that it fails many collision and
distribution bias tests. So it doesn't look like it is very reliable for
using as a hash.

We do support libxxhash as an optional dependency, so we could use that
if gdbserver is linked with it. Its SMhasher report³ looks good, IUUC
(but not the 32 bits version though⁴).

We could also vendor coreutils's lib/sha256.[ch] files. They're 660
lines in total.

>> In this case, I think the tdesc ID should be of the format
>> <prefix>:<hex number>, so that <prefix> can be “sha1”, “sha256” etc to
>> indicate that <hex number> is a hash with the given algo, or even
>> “number” to indicate that it's a simple integer like it is today.
>> 
>> Perhaps we can do the prefix thing even if not implementing the cache,
>> to leave the possibility of adding it in the future?
>
> I think I would choose between hash and number, but not do both, I don't
> see the need to have both.

I suggested a prefix to designate an integer ID as a way to implement
only this possibility for now, but which would allow implementing hash
IDs later, as additional prefixes. Or another possibility is that if we
later decide to support hashes, we could add a separate protocol feature
to signal that.
  
Luis Machado Feb. 3, 2023, 11:22 a.m. UTC | #16
On 2/1/23 17:03, Simon Marchi via Gdb-patches wrote:
> 
>> Having read some of the later patches, I have some additional thoughts
>> here:
>>
>> I think we should make it explicit here that IDs are connection wide,
>> not per-process.  We should also make it clear that GDB might[1] cache
>> target descriptions per remote connection, and so a remote target should
>> not reuse a target description ID except where the target description is
>> identical.
>>
>> [1] I say "GDB might" here because if we say "GDB will" then this would
>> imply each target description will only be asked for once.  And I
>> figure, why be overly restrictive.
> 
> Thanks for pointing this out, I had the same thought while reading the
> patch.
> 
> In my original idea, I imagined that target description IDs could be
> some hashes computed from the XML content (a bit like git hashes or ELF
> build IDs), such that a given target description would always have the
> same ID.  This would give clients the possibility to cache target
> descriptions locally, a bit like the index cache.  It did sound nice,
> but perhaps it's not really important.

Are you considering an encoded block of data that gdbserver sends to gdb, which in turn unpacks it on its end and use it to compute a target description?

Something along the lines of gdb/arch/aarch64.h:"struct aarch64_features" and how we encode features through bits?

> 
> Simon
  
Simon Marchi Feb. 3, 2023, 12:50 p.m. UTC | #17
> Are you considering an encoded block of data that gdbserver sends to gdb, which in turn unpacks it on its end and use it to compute a target description?
> 
> Something along the lines of gdb/arch/aarch64.h:"struct aarch64_features" and how we encode features through bits?

No, I imagine a hash being transmitted as a hex string, much like git
commit sha1s or build-ids.  Imagine passing the XML file through
sha256sum and sending that as a fingerprint for the tdesc.  Except that
an actual implementation could be smarter and hash the contents of the
XML in some way so that the particular (meaningless) formatting of the
XML doesn't change the hash (but that's an implementation detail).

Simon
  
Andrew Burgess Feb. 3, 2023, 4:29 p.m. UTC | #18
Thiago Jung Bauermann <thiago.bauermann@linaro.org> writes:

> Simon Marchi <simark@simark.ca> writes:
>
>> On 2/2/23 14:52, Thiago Jung Bauermann wrote:
>>> 
>>> Simon Marchi <simark@simark.ca> writes:
>>> 
>>>>> Having read some of the later patches, I have some additional thoughts
>>>>> here:
>>>>>
>>>>> I think we should make it explicit here that IDs are connection wide,
>>>>> not per-process.  We should also make it clear that GDB might[1] cache
>>>>> target descriptions per remote connection, and so a remote target should
>>>>> not reuse a target description ID except where the target description is
>>>>> identical.
>>> 
>>> Ah, good point. I will make that clarification.
>>> 
>>>>> [1] I say "GDB might" here because if we say "GDB will" then this would
>>>>> imply each target description will only be asked for once.  And I
>>>>> figure, why be overly restrictive.
>>>>
>>>> Thanks for pointing this out, I had the same thought while reading the
>>>> patch.
>>>>
>>>> In my original idea, I imagined that target description IDs could be
>>>> some hashes computed from the XML content (a bit like git hashes or ELF
>>>> build IDs), such that a given target description would always have the
>>>> same ID.  This would give clients the possibility to cache target
>>>> descriptions locally, a bit like the index cache.  It did sound nice,
>>>> but perhaps it's not really important.
>>> 
>>> Ah, sorry I misunderstood this part of your suggestion. I thought that
>>> the caching was supposed to be limited to the duration of the connection,
>>> and thus the m_tdescs map in struct remote state would be enough to
>>> provide that functionality. Do you mean that the cache should be on
>>> disk, so that it survives GDB quitting? I can look into that if you want
>>> and implement it, if not complicated.
>>
>> To be clear, I'm not asking you to implement an on-disk cache, I'm just
>> trying to think about what the limitations of the proposed solution are.
>> Because mistakes or shortcomings introduced in the remote protocol (like
>> any API / ABI meant to be stable) are difficult to fix afterwards.  If
>> there is a chance we want to do an on-disk cache (or share a cache
>> between multiple concurrent connections), we should think about that
>> now.
>
> Indeed, that's a good idea.
>
>> On one hand, perhaps we consider that target descriptions are small
>> enough that there's no point in having a persistent cache like that.
>> The time to fetch each target description once per connection is
>> probably insignificant.
>
> I don't have a good intuition about that. My uneducated guess is that
> it's not worth it, but as you say I think it's a good idea to leave the
> door open if we can.
>
>> On the other hand, perhaps doing the hash approach is easy enough that
>> we might as well do it, and that just leaves more doors open.  In my
>> mind, as a start, we could just pass the XML of the tdesc through a
>> sha256 function, possibly doing something more sophisticated later (this
>> means the hash of a given tdesc could change over time, but still two
>> identical hashes mean identical tdescs).  However, I don't think there
>> are sha256 functions built in on the OSes we want to support, so we'd
>> have to depend on some external library.  And that complicates things
>> quite a bit...  We already have a CRC32 function in gdbserver, but I
>> don't know if that's good enough to be confident there won't be
>> collisions.
>
> I tried to research this a bit, but I'm not confident enough in my
> knowledge of the subject to reach a conclusion. I only know enough about
> hashes to be dangerous. I did find a hash testsuite called SMhasher
> which seems to be well regarded, and its website¹ has
> reports for CRC32², which shows that it fails many collision and
> distribution bias tests. So it doesn't look like it is very reliable for
> using as a hash.
>
> We do support libxxhash as an optional dependency, so we could use that
> if gdbserver is linked with it. Its SMhasher report³ looks good, IUUC
> (but not the 32 bits version though⁴).
>
> We could also vendor coreutils's lib/sha256.[ch] files. They're 660
> lines in total.
>
>>> In this case, I think the tdesc ID should be of the format
>>> <prefix>:<hex number>, so that <prefix> can be “sha1”, “sha256” etc to
>>> indicate that <hex number> is a hash with the given algo, or even
>>> “number” to indicate that it's a simple integer like it is today.
>>> 
>>> Perhaps we can do the prefix thing even if not implementing the cache,
>>> to leave the possibility of adding it in the future?
>>
>> I think I would choose between hash and number, but not do both, I don't
>> see the need to have both.
>
> I suggested a prefix to designate an integer ID as a way to implement
> only this possibility for now, but which would allow implementing hash
> IDs later, as additional prefixes. Or another possibility is that if we
> later decide to support hashes, we could add a separate protocol feature
> to signal that.

So, I'd also be in favour of supporting the simple "just a number"
scheme.  Not everything that talks to GDB is our GDBServer, and not
every target might have the right hashing library.  I'd hate to force
folk to have to implement some hashing code, just to use this feature of
the remote protocol.

I have a proposal for how to negotiate different ID types without
adding a prefix to every ID sent from gdbserver...

We don't need to use a prefix in the actual ID.  I believe you're
already adding a qSupported feature for per-thread-tdesc.  The
qSupported mechanism already supports value passing.

So, we could pass a value back and forth in the qSupported negotiation
where GDB and GDBServer can agree on a numbering scheme.

If GDB sends out:

  per-thread-tdesc=value

Where `value` is a comma separated list of letters, each letter
representing a scheme that GDB understands:

  N: Each ID is a number,

  H: Each ID is a hash of the target description contents.

Then GDBServer replies with:

  per-thread-tdesc=value

Where `value` is a single letter drawn from the set of letters that GDB
sent out, this is the scheme that GDBServer will be using.  Thus,
GDBServer might reply with _one_ of the following:

  per-thread-tdesc=N
  per-thread-tdesc=H

If GDB only offers a scheme that GDBServer doesn't understand, e.g. GDB
sends:

  per-thread-tdesc=X

Then GDBServer just doesn't send this feature back, effectively claiming
it doesn't support 'per-thread-tdesc'.

The great thing about this is that the hard work here is all in writing
the documentation.

To begin with you _only_ need to support the 'N' scheme.  GDB always
sends out just 'per-thread-tdesc=N', GDBServer just checks that the
value is 'N', and replies 'per-thread-tdesc=N'.  GDB checks that the
return value is also 'N', and then we carry on as before.  BUT, we now
have a planned route to add more complex schemes in the future.

Thanks,
Andrew
  
Thiago Jung Bauermann Feb. 4, 2023, 6:08 a.m. UTC | #19
Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:

> Thiago Jung Bauermann <thiago.bauermann@linaro.org> writes:
>
>> I suggested a prefix to designate an integer ID as a way to implement
>> only this possibility for now, but which would allow implementing hash
>> IDs later, as additional prefixes. Or another possibility is that if we
>> later decide to support hashes, we could add a separate protocol feature
>> to signal that.
>
> So, I'd also be in favour of supporting the simple "just a number"
> scheme.  Not everything that talks to GDB is our GDBServer, and not
> every target might have the right hashing library.  I'd hate to force
> folk to have to implement some hashing code, just to use this feature of
> the remote protocol.

Yes, that is a good point.

> I have a proposal for how to negotiate different ID types without
> adding a prefix to every ID sent from gdbserver...
>
> We don't need to use a prefix in the actual ID.  I believe you're
> already adding a qSupported feature for per-thread-tdesc.  The
> qSupported mechanism already supports value passing.
>
> So, we could pass a value back and forth in the qSupported negotiation
> where GDB and GDBServer can agree on a numbering scheme.

    ⋮
  <snip>
    ⋮

> The great thing about this is that the hard work here is all in writing
> the documentation.

:-)

> To begin with you _only_ need to support the 'N' scheme.  GDB always
> sends out just 'per-thread-tdesc=N', GDBServer just checks that the
> value is 'N', and replies 'per-thread-tdesc=N'.  GDB checks that the
> return value is also 'N', and then we carry on as before.  BUT, we now
> have a planned route to add more complex schemes in the future.

This is a great idea! Thank you for coming up with it. I will implement
it for the next version of the patch series.
  

Patch

diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 9c0018ea5c14..fbf7e59853b5 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -42030,6 +42030,10 @@  the stopped thread, as specified in @ref{thread-id syntax}.
 If @var{n} is @samp{core}, then @var{r} is the hexadecimal number of
 the core on which the stop event was detected.
 
+@item
+If @var{n} is @samp{tdesc}, then @var{r} is the hexadecimal number of
+the target description ID (@pxref{Target Description ID}).
+
 @item
 If @var{n} is a recognized @dfn{stop reason}, it describes a more
 specific event that stopped the target.  The currently defined stop
@@ -46546,7 +46550,7 @@  the following structure:
 @smallexample
 <?xml version="1.0"?>
 <threads>
-    <thread id="id" core="0" name="name">
+    <thread id="id" core="0" name="name" tdesc="0">
     ... description ...
     </thread>
 </threads>
@@ -46556,8 +46560,10 @@  Each @samp{thread} element must have the @samp{id} attribute that
 identifies the thread (@pxref{thread-id syntax}).  The
 @samp{core} attribute, if present, specifies which processor core
 the thread was last executing on.  The @samp{name} attribute, if
-present, specifies the human-readable name of the thread.  The content
-of the of @samp{thread} element is interpreted as human-readable
+present, specifies the human-readable name of the thread.  The
+@samp{tdesc} attribute, if present, specifies the target description
+ID of the thread (@pxref{Target Description ID}).  The content of
+the @samp{thread} element is interpreted as human-readable
 auxiliary information.  The @samp{handle} attribute, if present,
 is a hex encoded representation of the thread handle.
 
@@ -46767,6 +46773,8 @@  descriptions are accurate, and that @value{GDBN} understands them.
 target descriptions.  @xref{Expat}.
 
 @menu
+* Target Description ID::           Referencing different descriptions in the
+                                    remote protocol.
 * Retrieving Descriptions::         How descriptions are fetched from a target.
 * Target Description Format::       The contents of a target description.
 * Predefined Target Types::         Standard types available for target
@@ -46775,6 +46783,14 @@  target descriptions.  @xref{Expat}.
 * Standard Target Features::        Features @value{GDBN} knows about.
 @end menu
 
+@node Target Description ID
+@section Target Description ID
+
+In cases where a remote target supports threads having different
+target descriptions than their parent process, the remote protocol
+assigns a non-negative integer to each target description to reference
+it in the communication between the host and the target.
+
 @node Retrieving Descriptions
 @section Retrieving Descriptions
 
@@ -46787,6 +46803,11 @@  qXfer}).  The @var{annex} in the @samp{qXfer} packet will be
 XML document, of the form described in @ref{Target Description
 Format}.
 
+If target description IDs are being used (@pxref{Target Description ID}),
+@value{GDBN} can retrieve a target description with a given ID by using
+@samp{target-id-ID.xml} as the @var{annex}, where @var{ID} is the
+non-negative integer identifier of the desired target description.
+
 Alternatively, you can specify a file to read for the target description.
 If a file is set, the target will not be queried.  The commands to
 specify a file are:
diff --git a/gdbserver/remote-utils.cc b/gdbserver/remote-utils.cc
index 80310bc2c709..baff899307cc 100644
--- a/gdbserver/remote-utils.cc
+++ b/gdbserver/remote-utils.cc
@@ -1049,6 +1049,53 @@  outreg (struct regcache *regcache, int regno, char *buf)
   return buf;
 }
 
+/* See remote-utils.h.  */
+
+unsigned int
+get_tdesc_rsp_id (const target_desc *tdesc)
+{
+  client_state &cs = get_client_state ();
+  unsigned int i;
+
+  for (i = 0; i < cs.tdescs.size (); i++)
+    if (cs.tdescs[i] == tdesc)
+      return i;
+
+  cs.tdescs.push_back (tdesc);
+
+  return i;
+}
+
+/* See remote-utils.h.  */
+
+const target_desc *
+get_tdesc_from_rsp_id (unsigned int id)
+{
+  client_state &cs = get_client_state ();
+
+  if (id >= cs.tdescs.size ())
+    return nullptr;
+
+  return cs.tdescs[id];
+}
+
+/* Return the ID as used in the remote protocol for the target descriptor of the
+   given PTID.  */
+
+static unsigned int
+get_tdesc_rsp_id (ptid_t ptid)
+{
+  const thread_info *thread = find_thread_ptid (ptid);
+  const target_desc *tdesc;
+
+  if (thread == nullptr)
+    tdesc = find_process_pid (ptid.pid ())->tdesc;
+  else
+    tdesc = get_thread_target_desc (thread);
+
+  return get_tdesc_rsp_id (tdesc);
+}
+
 void
 prepare_resume_reply (char *buf, ptid_t ptid, const target_waitstatus &status)
 {
@@ -1241,6 +1288,16 @@  prepare_resume_reply (char *buf, ptid_t ptid, const target_waitstatus &status)
 	    buf += strlen (buf);
 	    current_process ()->dlls_changed = false;
 	  }
+
+	if (current_thread->tdesc != nullptr
+	    && current_thread->tdesc != current_process ()->tdesc)
+	  {
+	    sprintf (buf, "tdesc:");
+	    buf += strlen (buf);
+	    sprintf (buf, "%x", get_tdesc_rsp_id (ptid));
+	    strcat (buf, ";");
+	    buf += strlen (buf);
+	  }
       }
       break;
     case TARGET_WAITKIND_EXITED:
diff --git a/gdbserver/remote-utils.h b/gdbserver/remote-utils.h
index cb2d6c346c99..61ef80b4dad7 100644
--- a/gdbserver/remote-utils.h
+++ b/gdbserver/remote-utils.h
@@ -75,4 +75,13 @@  int relocate_instruction (CORE_ADDR *to, CORE_ADDR oldloc);
 
 void monitor_output (const char *msg);
 
+/* Return the ID as used in the remote protocol for the given target
+   descriptor.  */
+
+unsigned int get_tdesc_rsp_id (const target_desc *tdesc);
+
+/* Return the target description corresponding to the remote protocol ID.  */
+
+const target_desc *get_tdesc_from_rsp_id (unsigned int id);
+
 #endif /* GDBSERVER_REMOTE_UTILS_H */
diff --git a/gdbserver/server.cc b/gdbserver/server.cc
index 21fb51a45d16..2d1062f98468 100644
--- a/gdbserver/server.cc
+++ b/gdbserver/server.cc
@@ -976,7 +976,15 @@  handle_general_set (char *own_buf)
 static const char *
 get_features_xml (const char *annex)
 {
-  const struct target_desc *desc = current_target_desc ();
+  const struct target_desc *desc;
+  unsigned int id;
+
+  if (strcmp (annex, "target.xml") == 0)
+    desc = current_target_desc ();
+  else if (sscanf (annex, "target-id-%u.xml", &id) == 1)
+    desc = get_tdesc_from_rsp_id (id);
+  else
+    desc = nullptr;
 
   /* `desc->xmltarget' defines what to return when looking for the
      "target.xml" file.  Its contents can either be verbatim XML code
@@ -986,7 +994,7 @@  get_features_xml (const char *annex)
      This variable is set up from the auto-generated
      init_registers_... routine for the current target.  */
 
-  if (strcmp (annex, "target.xml") == 0)
+  if (desc != nullptr)
     {
       const char *ret = tdesc_get_features_xml (desc);
 
@@ -1664,6 +1672,11 @@  handle_qxfer_threads_worker (thread_info *thread, struct buffer *buffer)
   if (name != NULL)
     buffer_xml_printf (buffer, " name=\"%s\"", name);
 
+  if (thread->tdesc != nullptr
+      && thread->tdesc != get_thread_process (thread)->tdesc)
+    buffer_xml_printf (buffer, " tdesc=\"%u\"",
+		       get_tdesc_rsp_id (thread->tdesc));
+
   if (handle_status)
     {
       char *handle_s = (char *) alloca (handle_len * 2 + 1);
diff --git a/gdbserver/server.h b/gdbserver/server.h
index 7997d1a32e6e..58be5027795b 100644
--- a/gdbserver/server.h
+++ b/gdbserver/server.h
@@ -193,6 +193,10 @@  struct client_state
   /* If true, memory tagging features are supported.  */
   bool memory_tagging_feature = false;
 
+  /* The target descriptions that have been communicated to the client.  The
+     index of a target description in this vector is the ID used to reference it
+     in the remote protocol.  */
+  std::vector<const target_desc *> tdescs;
 };
 
 client_state &get_client_state ();