[PATCHv5,09/11] gdbserver: update target description creation for x86/linux

Message ID 7f0e5aac3e3f52b6119658af5c4e5237811aec56.1714143669.git.aburgess@redhat.com
State New
Headers
Series x86/Linux Target Description Changes |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Testing passed

Commit Message

Andrew Burgess April 26, 2024, 3:01 p.m. UTC
  This commit is part of a series which aims to share more of the target
description creation between GDB and gdbserver for x86/Linux.

After some refactoring earlier in this series the shared
x86_linux_tdesc_for_tid function was added into nat/x86-linux-tdesc.c.
However, this function still relies on amd64_linux_read_description
and i386_linux_read_description which are implemented separately for
both gdbserver and GDB.  Given that at their core, all these functions
do is:

  1. take an xcr0 value as input,
  2. mask out some feature bits,
  3. look for a cached pre-generated target description and return it
     if found,
  4. if no cached target description is found then call either
     amd64_create_target_description or
     i386_create_target_description to create a new target
     description, which is then added to the cache.  Return the newly
     created target description.

The inner functions amd64_create_target_description and
i386_create_target_description are already shared between GDB and
gdbserver (in the gdb/arch/ directory), so the only thing that
the *_read_description functions really do is add the caching layer,
and it feels like this really could be shared.

However, we have a small problem.

On the GDB side we create target descriptions using a different set of
cpu features than on the gdbserver side!  This means that for the
exact same target, we might get a different target description when
using native GDB vs using gdbserver.  This surely feels like a
mistake, I would expect to get the same target description on each.

The table below shows the number of possible different target
descriptions that we can create on the GDB side vs on the gdbserver
side for each target type:

        | GDB | gdbserver
  ------|-----|----------
  i386  | 64  | 7
  amd64 | 32  | 7
  x32   | 16  | 7

So in theory, all I want to do is move the GDB version
of *_read_description into the arch/ directory and have gdbserver use
that, then both GDB and gdbserver would be able to create any of the
possible target descriptions.

Unfortunately it's a little more complex than that due to the in
process agent (IPA).

When the IPA is in use, gdbserver sends a target description index to
the IPA, and the IPA uses this to find the correct target description
to use.

** START OF AN ASIDE **

Back in the day I suspect this approach made perfect sense.  However
since this commit:

  commit a8806230241d201f808d856eaae4d44088117b0c
  Date:   Thu Dec 7 17:07:01 2017 +0000

      Initialize target description early in IPA

I think passing the index is now more trouble than its worth.

We used to pass the index, and then use that index to lookup which
target description to instantiate and use.  However, the above commit
fixed an issue where we can't call malloc() within (certain parts of)
the IPA (apparently), so instead we now pre-compute _every_ possible
target description within the IPA.  The index is now only used to
lookup which of the (many) pre-computed target descriptions to use.

It would (I think) have been easier all around if the IPA just
self-inspected, figured out its own xcr0 value, and used that to
create the one target description that is required.  So long as the
xcr0 to target description code is shared (at compile time) with
gdbserver, then we can be sure that the IPA will derive the same
target description as gdbserver, and we would avoid all this index
passing business, which has made this commit so very, very painful.

** END OF AN ASIDE **

Currently then for x86/linux, gdbserver sends a number between 0 and 7
to the IPA, and the IPA uses this to create a target description.

However, I am proposing that gdbserver should now create one of (up
to) 64 different target descriptions for i386, so this 0 to 7 index
isn't going to be good enough any more (amd64 and x32 have slightly
fewer possible target descriptions, but still more than 8, so the
problem is the same).

For a while I wondered if I was going to have to try and find some
backward compatible solution for this mess.  But after seeing how
lightly the IPA is actually documented, I wonder if it is not the case
that there is a tight coupling between a version of gdbserver and a
version of the IPA?  At least I'm hoping so.

In this commit I have thrown out the old IPA target description index
numbering scheme, and switched to a completely new numbering scheme.
Instead of the index that is passed being arbitrary, the index is
instead calculated from the set of cpu features that are present on
the target.  Within the IPA we can then reverse this logic to recreate
the xcr0 value based on the index, and from the xcr0 value we can
choose the correct target description.

With the gdbserver to IPA numbering scheme issue resolved I have then
update the gdbserver versions of amd64_linux_read_description and
i386_linux_read_description so that they create target descriptions
using the same set of cpu features as GDB itself.

After this gdbserver should now always come up with the same target
description as GDB does on any x86/Linux target.

This commit does not introduce any new code sharing between GDB and
gdbserver as previous commits in this series have done.  Instead this
commit is all about bringing GDB and gdbserver into alignment
functionally so that the next commit(s) can merge the GDB and
gdbserver versions of these functions.

Approved-By: John Baldwin <jhb@FreeBSD.org>
---
 gdbserver/linux-amd64-ipa.cc |  43 +----
 gdbserver/linux-i386-ipa.cc  |  23 +--
 gdbserver/linux-x86-low.cc   |  15 +-
 gdbserver/linux-x86-tdesc.cc | 316 +++++++++++++++++++++++++----------
 gdbserver/linux-x86-tdesc.h  |  49 +++---
 5 files changed, 278 insertions(+), 168 deletions(-)
  

Comments

Willgerodt, Felix April 29, 2024, 2:35 p.m. UTC | #1
> -----Original Message-----
> From: Andrew Burgess <aburgess@redhat.com>
> Sent: Freitag, 26. April 2024 17:02
> To: gdb-patches@sourceware.org
> Cc: Andrew Burgess <aburgess@redhat.com>; Willgerodt, Felix
> <felix.willgerodt@intel.com>; John Baldwin <jhb@FreeBSD.org>
> Subject: [PATCHv5 09/11] gdbserver: update target description creation for
> x86/linux
> 
> This commit is part of a series which aims to share more of the target
> description creation between GDB and gdbserver for x86/Linux.
> 
> After some refactoring earlier in this series the shared
> x86_linux_tdesc_for_tid function was added into nat/x86-linux-tdesc.c.
> However, this function still relies on amd64_linux_read_description
> and i386_linux_read_description which are implemented separately for
> both gdbserver and GDB.  Given that at their core, all these functions
> do is:
> 
>   1. take an xcr0 value as input,
>   2. mask out some feature bits,
>   3. look for a cached pre-generated target description and return it
>      if found,
>   4. if no cached target description is found then call either
>      amd64_create_target_description or
>      i386_create_target_description to create a new target
>      description, which is then added to the cache.  Return the newly
>      created target description.
> 
> The inner functions amd64_create_target_description and
> i386_create_target_description are already shared between GDB and
> gdbserver (in the gdb/arch/ directory), so the only thing that
> the *_read_description functions really do is add the caching layer,
> and it feels like this really could be shared.
> 
> However, we have a small problem.
> 
> On the GDB side we create target descriptions using a different set of
> cpu features than on the gdbserver side!  This means that for the
> exact same target, we might get a different target description when
> using native GDB vs using gdbserver.  This surely feels like a
> mistake, I would expect to get the same target description on each.
> 
> The table below shows the number of possible different target
> descriptions that we can create on the GDB side vs on the gdbserver
> side for each target type:
> 
>         | GDB | gdbserver
>   ------|-----|----------
>   i386  | 64  | 7
>   amd64 | 32  | 7
>   x32   | 16  | 7

I would have somehow expected to have the highest number for 64-bit
CPUs. As it should include 32-bit and x32.
Do you know why that isn't the case and 32-bit has double?
Or is my problem that this is reflecting GDB code, which shares stuff
between amd64 and i386?
Maybe this is also solved with the comments I made in code below.


> So in theory, all I want to do is move the GDB version
> of *_read_description into the arch/ directory and have gdbserver use
> that, then both GDB and gdbserver would be able to create any of the
> possible target descriptions.
> 
> Unfortunately it's a little more complex than that due to the in
> process agent (IPA).
> 
> When the IPA is in use, gdbserver sends a target description index to
> the IPA, and the IPA uses this to find the correct target description
> to use.
> 
> ** START OF AN ASIDE **
> 
> Back in the day I suspect this approach made perfect sense.  However
> since this commit:
> 
>   commit a8806230241d201f808d856eaae4d44088117b0c
>   Date:   Thu Dec 7 17:07:01 2017 +0000
> 
>       Initialize target description early in IPA
> 
> I think passing the index is now more trouble than its worth.
> 
> We used to pass the index, and then use that index to lookup which
> target description to instantiate and use.  However, the above commit
> fixed an issue where we can't call malloc() within (certain parts of)
> the IPA (apparently), so instead we now pre-compute _every_ possible
> target description within the IPA.  The index is now only used to
> lookup which of the (many) pre-computed target descriptions to use.
> 
> It would (I think) have been easier all around if the IPA just
> self-inspected, figured out its own xcr0 value, and used that to
> create the one target description that is required.  So long as the
> xcr0 to target description code is shared (at compile time) with
> gdbserver, then we can be sure that the IPA will derive the same
> target description as gdbserver, and we would avoid all this index
> passing business, which has made this commit so very, very painful.
> 
> ** END OF AN ASIDE **
> 
> Currently then for x86/linux, gdbserver sends a number between 0 and 7
> to the IPA, and the IPA uses this to create a target description.
> 
> However, I am proposing that gdbserver should now create one of (up
> to) 64 different target descriptions for i386, so this 0 to 7 index
> isn't going to be good enough any more (amd64 and x32 have slightly
> fewer possible target descriptions, but still more than 8, so the
> problem is the same).
> 
> For a while I wondered if I was going to have to try and find some
> backward compatible solution for this mess.  But after seeing how
> lightly the IPA is actually documented, I wonder if it is not the case
> that there is a tight coupling between a version of gdbserver and a
> version of the IPA?  At least I'm hoping so.
> 
> In this commit I have thrown out the old IPA target description index
> numbering scheme, and switched to a completely new numbering scheme.
> Instead of the index that is passed being arbitrary, the index is
> instead calculated from the set of cpu features that are present on
> the target.  Within the IPA we can then reverse this logic to recreate
> the xcr0 value based on the index, and from the xcr0 value we can
> choose the correct target description.
> 
> With the gdbserver to IPA numbering scheme issue resolved I have then
> update the gdbserver versions of amd64_linux_read_description and
> i386_linux_read_description so that they create target descriptions
> using the same set of cpu features as GDB itself.
> 
> After this gdbserver should now always come up with the same target
> description as GDB does on any x86/Linux target.
> 
> This commit does not introduce any new code sharing between GDB and
> gdbserver as previous commits in this series have done.  Instead this
> commit is all about bringing GDB and gdbserver into alignment
> functionally so that the next commit(s) can merge the GDB and
> gdbserver versions of these functions.
> 
> Approved-By: John Baldwin <jhb@FreeBSD.org>
> ---
>  gdbserver/linux-amd64-ipa.cc |  43 +----
>  gdbserver/linux-i386-ipa.cc  |  23 +--
>  gdbserver/linux-x86-low.cc   |  15 +-
>  gdbserver/linux-x86-tdesc.cc | 316 +++++++++++++++++++++++++----------
>  gdbserver/linux-x86-tdesc.h  |  49 +++---
>  5 files changed, 278 insertions(+), 168 deletions(-)
> 
> diff --git a/gdbserver/linux-amd64-ipa.cc b/gdbserver/linux-amd64-ipa.cc
> index df5e6aca081..0c80812cc6f 100644
> --- a/gdbserver/linux-amd64-ipa.cc
> +++ b/gdbserver/linux-amd64-ipa.cc
> @@ -164,47 +164,21 @@ supply_static_tracepoint_registers (struct regcache
> *regcache,
> 
>  #endif /* HAVE_UST */
> 
> -#if !defined __ILP32__
> -/* Map the tdesc index to xcr0 mask.  */
> -static uint64_t idx2mask[X86_TDESC_LAST] = {
> -  X86_XSTATE_X87_MASK,
> -  X86_XSTATE_SSE_MASK,
> -  X86_XSTATE_AVX_MASK,
> -  X86_XSTATE_MPX_MASK,
> -  X86_XSTATE_AVX_MPX_MASK,
> -  X86_XSTATE_AVX_AVX512_MASK,
> -  X86_XSTATE_AVX_MPX_AVX512_PKU_MASK,
> -};
> -#endif
> -
>  /* Return target_desc to use for IPA, given the tdesc index passed by
>     gdbserver.  */
> 
>  const struct target_desc *
>  get_ipa_tdesc (int idx)
>  {
> -  if (idx >= X86_TDESC_LAST)
> -    {
> -      internal_error ("unknown ipa tdesc index: %d", idx);
> -    }
> +  uint64_t xcr0 = x86_linux_tdesc_idx_to_xcr0 (idx);
> 
>  #if defined __ILP32__
> -  switch (idx)
> -    {
> -    case X86_TDESC_SSE:
> -      return amd64_linux_read_description (X86_XSTATE_SSE_MASK, true);
> -    case X86_TDESC_AVX:
> -      return amd64_linux_read_description (X86_XSTATE_AVX_MASK, true);
> -    case X86_TDESC_AVX_AVX512:
> -      return amd64_linux_read_description (X86_XSTATE_AVX_AVX512_MASK,
> true);
> -    default:
> -      break;
> -    }
> +  bool is_x32 = true;
>  #else
> -  return amd64_linux_read_description (idx2mask[idx], false);
> +  bool is_x32 = false;
>  #endif
> 
> -  internal_error ("unknown ipa tdesc index: %d", idx);
> +  return amd64_linux_read_description (xcr0, is_x32);
>  }
> 
>  /* Allocate buffer for the jump pads.  The branch instruction has a
> @@ -272,11 +246,10 @@ void
>  initialize_low_tracepoint (void)
>  {
>  #if defined __ILP32__
> -  amd64_linux_read_description (X86_XSTATE_SSE_MASK, true);
> -  amd64_linux_read_description (X86_XSTATE_AVX_MASK, true);
> -  amd64_linux_read_description (X86_XSTATE_AVX_AVX512_MASK, true);
> +  for (auto i = 0; i < x86_linux_x32_tdesc_count (); i++)
> +    amd64_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i), true);
>  #else
> -  for (auto i = 0; i < X86_TDESC_LAST; i++)
> -    amd64_linux_read_description (idx2mask[i], false);
> +  for (auto i = 0; i < x86_linux_amd64_tdesc_count (); i++)
> +    amd64_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i), false);

I know this was already here and I know there are different opinions on this.
But to me using auto here (and in similar locations in this patch) is just wrong.
So I would vote for making this a proper type.
But this is a bit of my personal "pet peeve" ;)


>  #endif
>  }
> diff --git a/gdbserver/linux-i386-ipa.cc b/gdbserver/linux-i386-ipa.cc
> index aa346fc9bc3..c1c3152fb04 100644
> --- a/gdbserver/linux-i386-ipa.cc
> +++ b/gdbserver/linux-i386-ipa.cc
> @@ -245,28 +245,15 @@ initialize_fast_tracepoint_trampoline_buffer (void)
>      }
>  }
> 
> -/* Map the tdesc index to xcr0 mask.  */
> -static uint64_t idx2mask[X86_TDESC_LAST] = {
> -  X86_XSTATE_X87_MASK,
> -  X86_XSTATE_SSE_MASK,
> -  X86_XSTATE_AVX_MASK,
> -  X86_XSTATE_MPX_MASK,
> -  X86_XSTATE_AVX_MPX_MASK,
> -  X86_XSTATE_AVX_AVX512_MASK,
> -  X86_XSTATE_AVX_MPX_AVX512_PKU_MASK,
> -};
> -
>  /* Return target_desc to use for IPA, given the tdesc index passed by
>     gdbserver.  */
> 
>  const struct target_desc *
>  get_ipa_tdesc (int idx)
>  {
> -  if (idx >= X86_TDESC_LAST)
> -    {
> -      internal_error ("unknown ipa tdesc index: %d", idx);
> -    }
> -  return i386_linux_read_description (idx2mask[idx]);
> +  uint64_t xcr0 = x86_linux_tdesc_idx_to_xcr0 (idx);
> +
> +  return i386_linux_read_description (xcr0);
>  }
> 
>  /* Allocate buffer for the jump pads.  On i386, we can reach an arbitrary
> @@ -288,6 +275,6 @@ void
>  initialize_low_tracepoint (void)
>  {
>    initialize_fast_tracepoint_trampoline_buffer ();
> -  for (auto i = 0; i < X86_TDESC_LAST; i++)
> -    i386_linux_read_description (idx2mask[i]);
> +  for (auto i = 0; i < x86_linux_i386_tdesc_count (); i++)
> +    i386_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i));
>  }
> diff --git a/gdbserver/linux-x86-low.cc b/gdbserver/linux-x86-low.cc
> index 68d2f13537c..6e23a53118b 100644
> --- a/gdbserver/linux-x86-low.cc
> +++ b/gdbserver/linux-x86-low.cc
> @@ -2882,14 +2882,17 @@ x86_target::get_ipa_tdesc_idx ()
>    struct regcache *regcache = get_thread_regcache (current_thread, 0);
>    const struct target_desc *tdesc = regcache->tdesc;
> 
> +  if (!use_xml)
> +    {
> +      if (tdesc == tdesc_i386_linux_no_xml.get ()
>  #ifdef __x86_64__
> -  return amd64_get_ipa_tdesc_idx (tdesc);
> -#endif
> -
> -  if (tdesc == tdesc_i386_linux_no_xml.get ())
> -    return X86_TDESC_SSE;
> +	  || tdesc == tdesc_amd64_linux_no_xml.get ()
> +#endif /* __x86_64__ */
> +	  )
> +	return x86_linux_xcr0_to_tdesc_idx (X86_XSTATE_SSE_MASK);

What happens if neither of the two is true? Do we need to assert this?
Or do we need to just return the X86_XSTATE_SSE_MASK index
without checking, as this can't ever happen?


> +    }
> 
> -  return i386_get_ipa_tdesc_idx (tdesc);
> +  return x86_linux_xcr0_to_tdesc_idx (xcr0_storage);
>  }
> 
>  /* The linux target ops object.  */
> diff --git a/gdbserver/linux-x86-tdesc.cc b/gdbserver/linux-x86-tdesc.cc
> index af3735aa895..5e12526bf17 100644
> --- a/gdbserver/linux-x86-tdesc.cc
> +++ b/gdbserver/linux-x86-tdesc.cc
> @@ -28,96 +28,278 @@
>  #include "x86-tdesc.h"
>  #include "arch/i386-linux-tdesc.h"
> 
> -/* Return the right x86_linux_tdesc index for a given XCR0.  Return
> -   X86_TDESC_LAST if can't find a match.  */
> +/* A structure used to describe a single cpu feature that might, or might
> +   not, be checked for when creating a target description for one of i386,
> +   amd64, or x32.  */
> 
> -static enum x86_linux_tdesc
> -xcr0_to_tdesc_idx (uint64_t xcr0, bool is_x32)
> +struct x86_tdesc_feature {
> +  /* The cpu feature mask.  This is a mask against an xcr0 value.  */
> +  uint64_t feature;

Can we elaborate the comment? Why do we need to mask anything?
Or maybe we can refer to the table below.


> +  /* Is this feature checked when creating an i386 target description.  */
> +  bool is_i386;
> +
> +  /* Is this feature checked when creating an amd64 target description.  */
> +  bool is_amd64;
> +
> +  /* Is this feature checked when creating an x32 target description.  */
> +  bool is_x32;
> +};
> +
> +/* A constant table that describes all of the cpu features that are
> +   checked when building a target description for i386, amd64, or x32.  */
> +

I am missing a bit of elaboration on why we can't only rely on XCR0. And
the table doesn't describe all CPU features that are checked. It just describes
all XSTATE features. There is also segments and orig_rax and in the near future
a new register for CET, which are independent of XSTATE.


> +static constexpr x86_tdesc_feature x86_linux_all_tdesc_features[] = {
> +  /* Feature,           i386,	amd64,	x32.  */
> +  { X86_XSTATE_PKRU,	true,	true, 	true },
> +  { X86_XSTATE_AVX512,	true,	true, 	true },
> +  { X86_XSTATE_AVX,	true,	true, 	true },
> +  { X86_XSTATE_MPX,	true,	true, 	false },
> +  { X86_XSTATE_SSE,	true,	false, 	false },
> +  { X86_XSTATE_X87,	true,	false, 	false }
> +};

I have never looked at IPA code much. But this table looked a bit weird to me.
I get that MPX is not supported for x32. Though MPX is already deprecated and will
be removed once GDB 15 is branched (at least that is the plan).
But why does amd64 not have x87 and SSE? I do see e.g. st0 and xmm0 on amd64.

After digging a bit:
In arch/amd64.c, we call create_feature_i386_64bit_sse() without any check for
SSE in XCR0. And I see that we have st0 in the "core" registers with EAX etc.
But in arch/i386.c it is different, and we add SSE based on X86_XSTATE_SSE.
And while st0 is also in the "core" registers with EAX etc., we only add them based
on X86_XSTATE_X87. To me this looks wrong. Why would CPUs without x87
not add EAX? And even if it isn't for some reason, do we really want to support
such old CPUs? In my view we could just align the two. 

If we would do that, and deprecate MPX, the table looks unnecessary. Or did I miss
something else?

This is a complicated series. Like others I am wondering how many users IPA
really has and if it is worth maintaining. But I have no data. Is it in any distros?


Thanks,
Felix
Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
  
Andrew Burgess May 7, 2024, 2:24 p.m. UTC | #2
"Willgerodt, Felix" <felix.willgerodt@intel.com> writes:

>> -----Original Message-----
>> From: Andrew Burgess <aburgess@redhat.com>
>> Sent: Freitag, 26. April 2024 17:02
>> To: gdb-patches@sourceware.org
>> Cc: Andrew Burgess <aburgess@redhat.com>; Willgerodt, Felix
>> <felix.willgerodt@intel.com>; John Baldwin <jhb@FreeBSD.org>
>> Subject: [PATCHv5 09/11] gdbserver: update target description creation for
>> x86/linux
>> 
>> This commit is part of a series which aims to share more of the target
>> description creation between GDB and gdbserver for x86/Linux.
>> 
>> After some refactoring earlier in this series the shared
>> x86_linux_tdesc_for_tid function was added into nat/x86-linux-tdesc.c.
>> However, this function still relies on amd64_linux_read_description
>> and i386_linux_read_description which are implemented separately for
>> both gdbserver and GDB.  Given that at their core, all these functions
>> do is:
>> 
>>   1. take an xcr0 value as input,
>>   2. mask out some feature bits,
>>   3. look for a cached pre-generated target description and return it
>>      if found,
>>   4. if no cached target description is found then call either
>>      amd64_create_target_description or
>>      i386_create_target_description to create a new target
>>      description, which is then added to the cache.  Return the newly
>>      created target description.
>> 
>> The inner functions amd64_create_target_description and
>> i386_create_target_description are already shared between GDB and
>> gdbserver (in the gdb/arch/ directory), so the only thing that
>> the *_read_description functions really do is add the caching layer,
>> and it feels like this really could be shared.
>> 
>> However, we have a small problem.
>> 
>> On the GDB side we create target descriptions using a different set of
>> cpu features than on the gdbserver side!  This means that for the
>> exact same target, we might get a different target description when
>> using native GDB vs using gdbserver.  This surely feels like a
>> mistake, I would expect to get the same target description on each.
>> 
>> The table below shows the number of possible different target
>> descriptions that we can create on the GDB side vs on the gdbserver
>> side for each target type:
>> 
>>         | GDB | gdbserver
>>   ------|-----|----------
>>   i386  | 64  | 7
>>   amd64 | 32  | 7
>>   x32   | 16  | 7
>
> I would have somehow expected to have the highest number for 64-bit
> CPUs. As it should include 32-bit and x32.
> Do you know why that isn't the case and 32-bit has double?
> Or is my problem that this is reflecting GDB code, which shares stuff
> between amd64 and i386?

The table reflects the different tdesc split by category, not the total
number that a user might get on a particular system.

So on an _actual_ i386 CPU, then sure, the user will (on the GDB side)
only have a choice of 64 different tdesc.

On an _actual_ amd64 CPU the user could run an i386 compiled binary, in
which case GDB will return one of those same 64 tdesc.  Or the user
might run an amd64 compiled binary, in which case GDB will return one of
the 32 amd64 tdesc.

So yes, from a user's point of view on amd64 there are 96 different
tdesc, but GDB splits these into 3 different categories (i386, amd64,
and x32), which is what this table represents.

> Maybe this is also solved with the comments I made in code below.
>
>
>> So in theory, all I want to do is move the GDB version
>> of *_read_description into the arch/ directory and have gdbserver use
>> that, then both GDB and gdbserver would be able to create any of the
>> possible target descriptions.
>> 
>> Unfortunately it's a little more complex than that due to the in
>> process agent (IPA).
>> 
>> When the IPA is in use, gdbserver sends a target description index to
>> the IPA, and the IPA uses this to find the correct target description
>> to use.
>> 
>> ** START OF AN ASIDE **
>> 
>> Back in the day I suspect this approach made perfect sense.  However
>> since this commit:
>> 
>>   commit a8806230241d201f808d856eaae4d44088117b0c
>>   Date:   Thu Dec 7 17:07:01 2017 +0000
>> 
>>       Initialize target description early in IPA
>> 
>> I think passing the index is now more trouble than its worth.
>> 
>> We used to pass the index, and then use that index to lookup which
>> target description to instantiate and use.  However, the above commit
>> fixed an issue where we can't call malloc() within (certain parts of)
>> the IPA (apparently), so instead we now pre-compute _every_ possible
>> target description within the IPA.  The index is now only used to
>> lookup which of the (many) pre-computed target descriptions to use.
>> 
>> It would (I think) have been easier all around if the IPA just
>> self-inspected, figured out its own xcr0 value, and used that to
>> create the one target description that is required.  So long as the
>> xcr0 to target description code is shared (at compile time) with
>> gdbserver, then we can be sure that the IPA will derive the same
>> target description as gdbserver, and we would avoid all this index
>> passing business, which has made this commit so very, very painful.
>> 
>> ** END OF AN ASIDE **
>> 
>> Currently then for x86/linux, gdbserver sends a number between 0 and 7
>> to the IPA, and the IPA uses this to create a target description.
>> 
>> However, I am proposing that gdbserver should now create one of (up
>> to) 64 different target descriptions for i386, so this 0 to 7 index
>> isn't going to be good enough any more (amd64 and x32 have slightly
>> fewer possible target descriptions, but still more than 8, so the
>> problem is the same).
>> 
>> For a while I wondered if I was going to have to try and find some
>> backward compatible solution for this mess.  But after seeing how
>> lightly the IPA is actually documented, I wonder if it is not the case
>> that there is a tight coupling between a version of gdbserver and a
>> version of the IPA?  At least I'm hoping so.
>> 
>> In this commit I have thrown out the old IPA target description index
>> numbering scheme, and switched to a completely new numbering scheme.
>> Instead of the index that is passed being arbitrary, the index is
>> instead calculated from the set of cpu features that are present on
>> the target.  Within the IPA we can then reverse this logic to recreate
>> the xcr0 value based on the index, and from the xcr0 value we can
>> choose the correct target description.
>> 
>> With the gdbserver to IPA numbering scheme issue resolved I have then
>> update the gdbserver versions of amd64_linux_read_description and
>> i386_linux_read_description so that they create target descriptions
>> using the same set of cpu features as GDB itself.
>> 
>> After this gdbserver should now always come up with the same target
>> description as GDB does on any x86/Linux target.
>> 
>> This commit does not introduce any new code sharing between GDB and
>> gdbserver as previous commits in this series have done.  Instead this
>> commit is all about bringing GDB and gdbserver into alignment
>> functionally so that the next commit(s) can merge the GDB and
>> gdbserver versions of these functions.
>> 
>> Approved-By: John Baldwin <jhb@FreeBSD.org>
>> ---
>>  gdbserver/linux-amd64-ipa.cc |  43 +----
>>  gdbserver/linux-i386-ipa.cc  |  23 +--
>>  gdbserver/linux-x86-low.cc   |  15 +-
>>  gdbserver/linux-x86-tdesc.cc | 316 +++++++++++++++++++++++++----------
>>  gdbserver/linux-x86-tdesc.h  |  49 +++---
>>  5 files changed, 278 insertions(+), 168 deletions(-)
>> 
>> diff --git a/gdbserver/linux-amd64-ipa.cc b/gdbserver/linux-amd64-ipa.cc
>> index df5e6aca081..0c80812cc6f 100644
>> --- a/gdbserver/linux-amd64-ipa.cc
>> +++ b/gdbserver/linux-amd64-ipa.cc
>> @@ -164,47 +164,21 @@ supply_static_tracepoint_registers (struct regcache
>> *regcache,
>> 
>>  #endif /* HAVE_UST */
>> 
>> -#if !defined __ILP32__
>> -/* Map the tdesc index to xcr0 mask.  */
>> -static uint64_t idx2mask[X86_TDESC_LAST] = {
>> -  X86_XSTATE_X87_MASK,
>> -  X86_XSTATE_SSE_MASK,
>> -  X86_XSTATE_AVX_MASK,
>> -  X86_XSTATE_MPX_MASK,
>> -  X86_XSTATE_AVX_MPX_MASK,
>> -  X86_XSTATE_AVX_AVX512_MASK,
>> -  X86_XSTATE_AVX_MPX_AVX512_PKU_MASK,
>> -};
>> -#endif
>> -
>>  /* Return target_desc to use for IPA, given the tdesc index passed by
>>     gdbserver.  */
>> 
>>  const struct target_desc *
>>  get_ipa_tdesc (int idx)
>>  {
>> -  if (idx >= X86_TDESC_LAST)
>> -    {
>> -      internal_error ("unknown ipa tdesc index: %d", idx);
>> -    }
>> +  uint64_t xcr0 = x86_linux_tdesc_idx_to_xcr0 (idx);
>> 
>>  #if defined __ILP32__
>> -  switch (idx)
>> -    {
>> -    case X86_TDESC_SSE:
>> -      return amd64_linux_read_description (X86_XSTATE_SSE_MASK, true);
>> -    case X86_TDESC_AVX:
>> -      return amd64_linux_read_description (X86_XSTATE_AVX_MASK, true);
>> -    case X86_TDESC_AVX_AVX512:
>> -      return amd64_linux_read_description (X86_XSTATE_AVX_AVX512_MASK,
>> true);
>> -    default:
>> -      break;
>> -    }
>> +  bool is_x32 = true;
>>  #else
>> -  return amd64_linux_read_description (idx2mask[idx], false);
>> +  bool is_x32 = false;
>>  #endif
>> 
>> -  internal_error ("unknown ipa tdesc index: %d", idx);
>> +  return amd64_linux_read_description (xcr0, is_x32);
>>  }
>> 
>>  /* Allocate buffer for the jump pads.  The branch instruction has a
>> @@ -272,11 +246,10 @@ void
>>  initialize_low_tracepoint (void)
>>  {
>>  #if defined __ILP32__
>> -  amd64_linux_read_description (X86_XSTATE_SSE_MASK, true);
>> -  amd64_linux_read_description (X86_XSTATE_AVX_MASK, true);
>> -  amd64_linux_read_description (X86_XSTATE_AVX_AVX512_MASK, true);
>> +  for (auto i = 0; i < x86_linux_x32_tdesc_count (); i++)
>> +    amd64_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i), true);
>>  #else
>> -  for (auto i = 0; i < X86_TDESC_LAST; i++)
>> -    amd64_linux_read_description (idx2mask[i], false);
>> +  for (auto i = 0; i < x86_linux_amd64_tdesc_count (); i++)
>> +    amd64_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i), false);
>
> I know this was already here and I know there are different opinions on this.
> But to me using auto here (and in similar locations in this patch) is just wrong.
> So I would vote for making this a proper type.
> But this is a bit of my personal "pet peeve" ;)
>
>
>>  #endif
>>  }
>> diff --git a/gdbserver/linux-i386-ipa.cc b/gdbserver/linux-i386-ipa.cc
>> index aa346fc9bc3..c1c3152fb04 100644
>> --- a/gdbserver/linux-i386-ipa.cc
>> +++ b/gdbserver/linux-i386-ipa.cc
>> @@ -245,28 +245,15 @@ initialize_fast_tracepoint_trampoline_buffer (void)
>>      }
>>  }
>> 
>> -/* Map the tdesc index to xcr0 mask.  */
>> -static uint64_t idx2mask[X86_TDESC_LAST] = {
>> -  X86_XSTATE_X87_MASK,
>> -  X86_XSTATE_SSE_MASK,
>> -  X86_XSTATE_AVX_MASK,
>> -  X86_XSTATE_MPX_MASK,
>> -  X86_XSTATE_AVX_MPX_MASK,
>> -  X86_XSTATE_AVX_AVX512_MASK,
>> -  X86_XSTATE_AVX_MPX_AVX512_PKU_MASK,
>> -};
>> -
>>  /* Return target_desc to use for IPA, given the tdesc index passed by
>>     gdbserver.  */
>> 
>>  const struct target_desc *
>>  get_ipa_tdesc (int idx)
>>  {
>> -  if (idx >= X86_TDESC_LAST)
>> -    {
>> -      internal_error ("unknown ipa tdesc index: %d", idx);
>> -    }
>> -  return i386_linux_read_description (idx2mask[idx]);
>> +  uint64_t xcr0 = x86_linux_tdesc_idx_to_xcr0 (idx);
>> +
>> +  return i386_linux_read_description (xcr0);
>>  }
>> 
>>  /* Allocate buffer for the jump pads.  On i386, we can reach an arbitrary
>> @@ -288,6 +275,6 @@ void
>>  initialize_low_tracepoint (void)
>>  {
>>    initialize_fast_tracepoint_trampoline_buffer ();
>> -  for (auto i = 0; i < X86_TDESC_LAST; i++)
>> -    i386_linux_read_description (idx2mask[i]);
>> +  for (auto i = 0; i < x86_linux_i386_tdesc_count (); i++)
>> +    i386_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i));
>>  }
>> diff --git a/gdbserver/linux-x86-low.cc b/gdbserver/linux-x86-low.cc
>> index 68d2f13537c..6e23a53118b 100644
>> --- a/gdbserver/linux-x86-low.cc
>> +++ b/gdbserver/linux-x86-low.cc
>> @@ -2882,14 +2882,17 @@ x86_target::get_ipa_tdesc_idx ()
>>    struct regcache *regcache = get_thread_regcache (current_thread, 0);
>>    const struct target_desc *tdesc = regcache->tdesc;
>> 
>> +  if (!use_xml)
>> +    {
>> +      if (tdesc == tdesc_i386_linux_no_xml.get ()
>>  #ifdef __x86_64__
>> -  return amd64_get_ipa_tdesc_idx (tdesc);
>> -#endif
>> -
>> -  if (tdesc == tdesc_i386_linux_no_xml.get ())
>> -    return X86_TDESC_SSE;
>> +	  || tdesc == tdesc_amd64_linux_no_xml.get ()
>> +#endif /* __x86_64__ */
>> +	  )
>> +	return x86_linux_xcr0_to_tdesc_idx (X86_XSTATE_SSE_MASK);
>
> What happens if neither of the two is true? Do we need to assert this?
> Or do we need to just return the X86_XSTATE_SSE_MASK index
> without checking, as this can't ever happen?

You're right.  I've changed the 'if' into 'gdb_assert', and made the
return of an index based on X86_XSTATE_SSE_MASK unconditional (within
the '!use_xml' block.

>
>
>> +    }
>> 
>> -  return i386_get_ipa_tdesc_idx (tdesc);
>> +  return x86_linux_xcr0_to_tdesc_idx (xcr0_storage);
>>  }
>> 
>>  /* The linux target ops object.  */
>> diff --git a/gdbserver/linux-x86-tdesc.cc b/gdbserver/linux-x86-tdesc.cc
>> index af3735aa895..5e12526bf17 100644
>> --- a/gdbserver/linux-x86-tdesc.cc
>> +++ b/gdbserver/linux-x86-tdesc.cc
>> @@ -28,96 +28,278 @@
>>  #include "x86-tdesc.h"
>>  #include "arch/i386-linux-tdesc.h"
>> 
>> -/* Return the right x86_linux_tdesc index for a given XCR0.  Return
>> -   X86_TDESC_LAST if can't find a match.  */
>> +/* A structure used to describe a single cpu feature that might, or might
>> +   not, be checked for when creating a target description for one of i386,
>> +   amd64, or x32.  */
>> 
>> -static enum x86_linux_tdesc
>> -xcr0_to_tdesc_idx (uint64_t xcr0, bool is_x32)
>> +struct x86_tdesc_feature {
>> +  /* The cpu feature mask.  This is a mask against an xcr0 value.  */
>> +  uint64_t feature;
>
> Can we elaborate the comment? Why do we need to mask anything?
> Or maybe we can refer to the table below.
>
>
>> +  /* Is this feature checked when creating an i386 target description.  */
>> +  bool is_i386;
>> +
>> +  /* Is this feature checked when creating an amd64 target description.  */
>> +  bool is_amd64;
>> +
>> +  /* Is this feature checked when creating an x32 target description.  */
>> +  bool is_x32;
>> +};
>> +
>> +/* A constant table that describes all of the cpu features that are
>> +   checked when building a target description for i386, amd64, or x32.  */
>> +
>
> I am missing a bit of elaboration on why we can't only rely on XCR0. And
> the table doesn't describe all CPU features that are checked. It just describes
> all XSTATE features. There is also segments and orig_rax and in the near future
> a new register for CET, which are independent of XSTATE.

I can rename the data structures to make it clearer that this all
relates to xstate/xcr0.  I was aware about the segments flag that is
passed into the tdesc creation, but even with prompting I can't see
anything related to orig_rax -- how is that fed into the tdesc creation
process?


>
>
>> +static constexpr x86_tdesc_feature x86_linux_all_tdesc_features[] = {
>> +  /* Feature,           i386,	amd64,	x32.  */
>> +  { X86_XSTATE_PKRU,	true,	true, 	true },
>> +  { X86_XSTATE_AVX512,	true,	true, 	true },
>> +  { X86_XSTATE_AVX,	true,	true, 	true },
>> +  { X86_XSTATE_MPX,	true,	true, 	false },
>> +  { X86_XSTATE_SSE,	true,	false, 	false },
>> +  { X86_XSTATE_X87,	true,	false, 	false }
>> +};
>
> I have never looked at IPA code much. But this table looked a bit weird to me.
> I get that MPX is not supported for x32. Though MPX is already deprecated and will
> be removed once GDB 15 is branched (at least that is the plan).

You must excuse my lack of up to date knowledge of different x86
features, but is there not hardware in the wild with the MPX feature?
Wont removing support for this mean folk will be unable to debug on that
hardware?

> But why does amd64 not have x87 and SSE? I do see e.g. st0 and xmm0 on amd64.
>
> After digging a bit:
> In arch/amd64.c, we call create_feature_i386_64bit_sse() without any check for
> SSE in XCR0. And I see that we have st0 in the "core" registers with EAX etc.
> But in arch/i386.c it is different, and we add SSE based on X86_XSTATE_SSE.
> And while st0 is also in the "core" registers with EAX etc., we only add them based
> on X86_XSTATE_X87. To me this looks wrong. Why would CPUs without x87
> not add EAX? And even if it isn't for some reason, do we really want to support
> such old CPUs? In my view we could just align the two. 
>
> If we would do that, and deprecate MPX, the table looks unnecessary. Or did I miss
> something else?

That looks like a reasonable conclusion.

Honestly, pretty much everything in this patch is based on the idea of
merging gdbserver and GDB code, while keeping functionality as similar
as possible.

>
> This is a complicated series. Like others I am wondering how many users IPA
> really has and if it is worth maintaining. But I have no data. Is it in any distros?

I have no idea.  But I really don't want to tie this series (which fixes
actual bugs caused by tdesc mismatch between GDB and gdbserver) to the
removal of IPA.

As far as I can tell the IPA is tested as part of the gdb.trace/
sub-directory, so it's not like we're removing some untested broken
corner of GDB.  We'd be removing a somewhat tested, at least minimally
working, feature.

I'll take another pass over this patch, maybe I can convince myself that
we can just check for all features for all machine types, in which case
the table driven approach becomes unnecessary.

Also, maybe I can split the series in two.  Like so many of my series
it just seems to keep growing.  While addressing your earlier comments
I've already spotted another bug, which is going to push the patch count
to at least 12 *sigh*.

Lets see what I can come up with for v6!

Thanks,
Andrew


>
>
> Thanks,
> Felix
> Intel Deutschland GmbH
> Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
> Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
> Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
> Chairperson of the Supervisory Board: Nicole Lau
> Registered Office: Munich
> Commercial Register: Amtsgericht Muenchen HRB 186928
  
Willgerodt, Felix May 8, 2024, 7:47 a.m. UTC | #3
> -----Original Message-----
> From: Andrew Burgess <aburgess@redhat.com>
> Sent: Dienstag, 7. Mai 2024 16:25
> To: Willgerodt, Felix <felix.willgerodt@intel.com>; gdb-patches@sourceware.org
> Cc: John Baldwin <jhb@FreeBSD.org>
> Subject: RE: [PATCHv5 09/11] gdbserver: update target description creation for
> x86/linux
> 
> "Willgerodt, Felix" <felix.willgerodt@intel.com> writes:
> 
> >> -----Original Message-----
> >> From: Andrew Burgess <aburgess@redhat.com>
> >> Sent: Freitag, 26. April 2024 17:02
> >> To: gdb-patches@sourceware.org
> >> Cc: Andrew Burgess <aburgess@redhat.com>; Willgerodt, Felix
> >> <felix.willgerodt@intel.com>; John Baldwin <jhb@FreeBSD.org>
> >> Subject: [PATCHv5 09/11] gdbserver: update target description creation for
> >> x86/linux
> >>
> >> This commit is part of a series which aims to share more of the target
> >> description creation between GDB and gdbserver for x86/Linux.
> >>
> >> After some refactoring earlier in this series the shared
> >> x86_linux_tdesc_for_tid function was added into nat/x86-linux-tdesc.c.
> >> However, this function still relies on amd64_linux_read_description
> >> and i386_linux_read_description which are implemented separately for
> >> both gdbserver and GDB.  Given that at their core, all these functions
> >> do is:
> >>
> >>   1. take an xcr0 value as input,
> >>   2. mask out some feature bits,
> >>   3. look for a cached pre-generated target description and return it
> >>      if found,
> >>   4. if no cached target description is found then call either
> >>      amd64_create_target_description or
> >>      i386_create_target_description to create a new target
> >>      description, which is then added to the cache.  Return the newly
> >>      created target description.
> >>
> >> The inner functions amd64_create_target_description and
> >> i386_create_target_description are already shared between GDB and
> >> gdbserver (in the gdb/arch/ directory), so the only thing that
> >> the *_read_description functions really do is add the caching layer,
> >> and it feels like this really could be shared.
> >>
> >> However, we have a small problem.
> >>
> >> On the GDB side we create target descriptions using a different set of
> >> cpu features than on the gdbserver side!  This means that for the
> >> exact same target, we might get a different target description when
> >> using native GDB vs using gdbserver.  This surely feels like a
> >> mistake, I would expect to get the same target description on each.
> >>
> >> The table below shows the number of possible different target
> >> descriptions that we can create on the GDB side vs on the gdbserver
> >> side for each target type:
> >>
> >>         | GDB | gdbserver
> >>   ------|-----|----------
> >>   i386  | 64  | 7
> >>   amd64 | 32  | 7
> >>   x32   | 16  | 7
> >
> > I would have somehow expected to have the highest number for 64-bit
> > CPUs. As it should include 32-bit and x32.
> > Do you know why that isn't the case and 32-bit has double?
> > Or is my problem that this is reflecting GDB code, which shares stuff
> > between amd64 and i386?
> 
> The table reflects the different tdesc split by category, not the total
> number that a user might get on a particular system.
> 
> So on an _actual_ i386 CPU, then sure, the user will (on the GDB side)
> only have a choice of 64 different tdesc.
> 
> On an _actual_ amd64 CPU the user could run an i386 compiled binary, in
> which case GDB will return one of those same 64 tdesc.  Or the user
> might run an amd64 compiled binary, in which case GDB will return one of
> the 32 amd64 tdesc.
> 
> So yes, from a user's point of view on amd64 there are 96 different
> tdesc, but GDB splits these into 3 different categories (i386, amd64,
> and x32), which is what this table represents.

Thanks for explaining. Understood.


> > Maybe this is also solved with the comments I made in code below.
> >
> >
> >> So in theory, all I want to do is move the GDB version
> >> of *_read_description into the arch/ directory and have gdbserver use
> >> that, then both GDB and gdbserver would be able to create any of the
> >> possible target descriptions.
> >>
> >> Unfortunately it's a little more complex than that due to the in
> >> process agent (IPA).
> >>
> >> When the IPA is in use, gdbserver sends a target description index to
> >> the IPA, and the IPA uses this to find the correct target description
> >> to use.
> >>
> >> ** START OF AN ASIDE **
> >>
> >> Back in the day I suspect this approach made perfect sense.  However
> >> since this commit:
> >>
> >>   commit a8806230241d201f808d856eaae4d44088117b0c
> >>   Date:   Thu Dec 7 17:07:01 2017 +0000
> >>
> >>       Initialize target description early in IPA
> >>
> >> I think passing the index is now more trouble than its worth.
> >>
> >> We used to pass the index, and then use that index to lookup which
> >> target description to instantiate and use.  However, the above commit
> >> fixed an issue where we can't call malloc() within (certain parts of)
> >> the IPA (apparently), so instead we now pre-compute _every_ possible
> >> target description within the IPA.  The index is now only used to
> >> lookup which of the (many) pre-computed target descriptions to use.
> >>
> >> It would (I think) have been easier all around if the IPA just
> >> self-inspected, figured out its own xcr0 value, and used that to
> >> create the one target description that is required.  So long as the
> >> xcr0 to target description code is shared (at compile time) with
> >> gdbserver, then we can be sure that the IPA will derive the same
> >> target description as gdbserver, and we would avoid all this index
> >> passing business, which has made this commit so very, very painful.
> >>
> >> ** END OF AN ASIDE **
> >>
> >> Currently then for x86/linux, gdbserver sends a number between 0 and 7
> >> to the IPA, and the IPA uses this to create a target description.
> >>
> >> However, I am proposing that gdbserver should now create one of (up
> >> to) 64 different target descriptions for i386, so this 0 to 7 index
> >> isn't going to be good enough any more (amd64 and x32 have slightly
> >> fewer possible target descriptions, but still more than 8, so the
> >> problem is the same).
> >>
> >> For a while I wondered if I was going to have to try and find some
> >> backward compatible solution for this mess.  But after seeing how
> >> lightly the IPA is actually documented, I wonder if it is not the case
> >> that there is a tight coupling between a version of gdbserver and a
> >> version of the IPA?  At least I'm hoping so.
> >>
> >> In this commit I have thrown out the old IPA target description index
> >> numbering scheme, and switched to a completely new numbering scheme.
> >> Instead of the index that is passed being arbitrary, the index is
> >> instead calculated from the set of cpu features that are present on
> >> the target.  Within the IPA we can then reverse this logic to recreate
> >> the xcr0 value based on the index, and from the xcr0 value we can
> >> choose the correct target description.
> >>
> >> With the gdbserver to IPA numbering scheme issue resolved I have then
> >> update the gdbserver versions of amd64_linux_read_description and
> >> i386_linux_read_description so that they create target descriptions
> >> using the same set of cpu features as GDB itself.
> >>
> >> After this gdbserver should now always come up with the same target
> >> description as GDB does on any x86/Linux target.
> >>
> >> This commit does not introduce any new code sharing between GDB and
> >> gdbserver as previous commits in this series have done.  Instead this
> >> commit is all about bringing GDB and gdbserver into alignment
> >> functionally so that the next commit(s) can merge the GDB and
> >> gdbserver versions of these functions.
> >>
> >> Approved-By: John Baldwin <jhb@FreeBSD.org>
> >> ---
> >>  gdbserver/linux-amd64-ipa.cc |  43 +----
> >>  gdbserver/linux-i386-ipa.cc  |  23 +--
> >>  gdbserver/linux-x86-low.cc   |  15 +-
> >>  gdbserver/linux-x86-tdesc.cc | 316 +++++++++++++++++++++++++----------
> >>  gdbserver/linux-x86-tdesc.h  |  49 +++---
> >>  5 files changed, 278 insertions(+), 168 deletions(-)
> >>
> >> diff --git a/gdbserver/linux-amd64-ipa.cc b/gdbserver/linux-amd64-ipa.cc
> >> index df5e6aca081..0c80812cc6f 100644
> >> --- a/gdbserver/linux-amd64-ipa.cc
> >> +++ b/gdbserver/linux-amd64-ipa.cc
> >> @@ -164,47 +164,21 @@ supply_static_tracepoint_registers (struct regcache
> >> *regcache,
> >>
> >>  #endif /* HAVE_UST */
> >>
> >> -#if !defined __ILP32__
> >> -/* Map the tdesc index to xcr0 mask.  */
> >> -static uint64_t idx2mask[X86_TDESC_LAST] = {
> >> -  X86_XSTATE_X87_MASK,
> >> -  X86_XSTATE_SSE_MASK,
> >> -  X86_XSTATE_AVX_MASK,
> >> -  X86_XSTATE_MPX_MASK,
> >> -  X86_XSTATE_AVX_MPX_MASK,
> >> -  X86_XSTATE_AVX_AVX512_MASK,
> >> -  X86_XSTATE_AVX_MPX_AVX512_PKU_MASK,
> >> -};
> >> -#endif
> >> -
> >>  /* Return target_desc to use for IPA, given the tdesc index passed by
> >>     gdbserver.  */
> >>
> >>  const struct target_desc *
> >>  get_ipa_tdesc (int idx)
> >>  {
> >> -  if (idx >= X86_TDESC_LAST)
> >> -    {
> >> -      internal_error ("unknown ipa tdesc index: %d", idx);
> >> -    }
> >> +  uint64_t xcr0 = x86_linux_tdesc_idx_to_xcr0 (idx);
> >>
> >>  #if defined __ILP32__
> >> -  switch (idx)
> >> -    {
> >> -    case X86_TDESC_SSE:
> >> -      return amd64_linux_read_description (X86_XSTATE_SSE_MASK, true);
> >> -    case X86_TDESC_AVX:
> >> -      return amd64_linux_read_description (X86_XSTATE_AVX_MASK, true);
> >> -    case X86_TDESC_AVX_AVX512:
> >> -      return amd64_linux_read_description (X86_XSTATE_AVX_AVX512_MASK,
> >> true);
> >> -    default:
> >> -      break;
> >> -    }
> >> +  bool is_x32 = true;
> >>  #else
> >> -  return amd64_linux_read_description (idx2mask[idx], false);
> >> +  bool is_x32 = false;
> >>  #endif
> >>
> >> -  internal_error ("unknown ipa tdesc index: %d", idx);
> >> +  return amd64_linux_read_description (xcr0, is_x32);
> >>  }
> >>
> >>  /* Allocate buffer for the jump pads.  The branch instruction has a
> >> @@ -272,11 +246,10 @@ void
> >>  initialize_low_tracepoint (void)
> >>  {
> >>  #if defined __ILP32__
> >> -  amd64_linux_read_description (X86_XSTATE_SSE_MASK, true);
> >> -  amd64_linux_read_description (X86_XSTATE_AVX_MASK, true);
> >> -  amd64_linux_read_description (X86_XSTATE_AVX_AVX512_MASK, true);
> >> +  for (auto i = 0; i < x86_linux_x32_tdesc_count (); i++)
> >> +    amd64_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i), true);
> >>  #else
> >> -  for (auto i = 0; i < X86_TDESC_LAST; i++)
> >> -    amd64_linux_read_description (idx2mask[i], false);
> >> +  for (auto i = 0; i < x86_linux_amd64_tdesc_count (); i++)
> >> +    amd64_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i), false);
> >
> > I know this was already here and I know there are different opinions on this.
> > But to me using auto here (and in similar locations in this patch) is just wrong.
> > So I would vote for making this a proper type.
> > But this is a bit of my personal "pet peeve" ;)
> >
> >
> >>  #endif
> >>  }
> >> diff --git a/gdbserver/linux-i386-ipa.cc b/gdbserver/linux-i386-ipa.cc
> >> index aa346fc9bc3..c1c3152fb04 100644
> >> --- a/gdbserver/linux-i386-ipa.cc
> >> +++ b/gdbserver/linux-i386-ipa.cc
> >> @@ -245,28 +245,15 @@ initialize_fast_tracepoint_trampoline_buffer (void)
> >>      }
> >>  }
> >>
> >> -/* Map the tdesc index to xcr0 mask.  */
> >> -static uint64_t idx2mask[X86_TDESC_LAST] = {
> >> -  X86_XSTATE_X87_MASK,
> >> -  X86_XSTATE_SSE_MASK,
> >> -  X86_XSTATE_AVX_MASK,
> >> -  X86_XSTATE_MPX_MASK,
> >> -  X86_XSTATE_AVX_MPX_MASK,
> >> -  X86_XSTATE_AVX_AVX512_MASK,
> >> -  X86_XSTATE_AVX_MPX_AVX512_PKU_MASK,
> >> -};
> >> -
> >>  /* Return target_desc to use for IPA, given the tdesc index passed by
> >>     gdbserver.  */
> >>
> >>  const struct target_desc *
> >>  get_ipa_tdesc (int idx)
> >>  {
> >> -  if (idx >= X86_TDESC_LAST)
> >> -    {
> >> -      internal_error ("unknown ipa tdesc index: %d", idx);
> >> -    }
> >> -  return i386_linux_read_description (idx2mask[idx]);
> >> +  uint64_t xcr0 = x86_linux_tdesc_idx_to_xcr0 (idx);
> >> +
> >> +  return i386_linux_read_description (xcr0);
> >>  }
> >>
> >>  /* Allocate buffer for the jump pads.  On i386, we can reach an arbitrary
> >> @@ -288,6 +275,6 @@ void
> >>  initialize_low_tracepoint (void)
> >>  {
> >>    initialize_fast_tracepoint_trampoline_buffer ();
> >> -  for (auto i = 0; i < X86_TDESC_LAST; i++)
> >> -    i386_linux_read_description (idx2mask[i]);
> >> +  for (auto i = 0; i < x86_linux_i386_tdesc_count (); i++)
> >> +    i386_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i));
> >>  }
> >> diff --git a/gdbserver/linux-x86-low.cc b/gdbserver/linux-x86-low.cc
> >> index 68d2f13537c..6e23a53118b 100644
> >> --- a/gdbserver/linux-x86-low.cc
> >> +++ b/gdbserver/linux-x86-low.cc
> >> @@ -2882,14 +2882,17 @@ x86_target::get_ipa_tdesc_idx ()
> >>    struct regcache *regcache = get_thread_regcache (current_thread, 0);
> >>    const struct target_desc *tdesc = regcache->tdesc;
> >>
> >> +  if (!use_xml)
> >> +    {
> >> +      if (tdesc == tdesc_i386_linux_no_xml.get ()
> >>  #ifdef __x86_64__
> >> -  return amd64_get_ipa_tdesc_idx (tdesc);
> >> -#endif
> >> -
> >> -  if (tdesc == tdesc_i386_linux_no_xml.get ())
> >> -    return X86_TDESC_SSE;
> >> +	  || tdesc == tdesc_amd64_linux_no_xml.get ()
> >> +#endif /* __x86_64__ */
> >> +	  )
> >> +	return x86_linux_xcr0_to_tdesc_idx (X86_XSTATE_SSE_MASK);
> >
> > What happens if neither of the two is true? Do we need to assert this?
> > Or do we need to just return the X86_XSTATE_SSE_MASK index
> > without checking, as this can't ever happen?
> 
> You're right.  I've changed the 'if' into 'gdb_assert', and made the
> return of an index based on X86_XSTATE_SSE_MASK unconditional (within
> the '!use_xml' block.
> 
> >
> >
> >> +    }
> >>
> >> -  return i386_get_ipa_tdesc_idx (tdesc);
> >> +  return x86_linux_xcr0_to_tdesc_idx (xcr0_storage);
> >>  }
> >>
> >>  /* The linux target ops object.  */
> >> diff --git a/gdbserver/linux-x86-tdesc.cc b/gdbserver/linux-x86-tdesc.cc
> >> index af3735aa895..5e12526bf17 100644
> >> --- a/gdbserver/linux-x86-tdesc.cc
> >> +++ b/gdbserver/linux-x86-tdesc.cc
> >> @@ -28,96 +28,278 @@
> >>  #include "x86-tdesc.h"
> >>  #include "arch/i386-linux-tdesc.h"
> >>
> >> -/* Return the right x86_linux_tdesc index for a given XCR0.  Return
> >> -   X86_TDESC_LAST if can't find a match.  */
> >> +/* A structure used to describe a single cpu feature that might, or might
> >> +   not, be checked for when creating a target description for one of i386,
> >> +   amd64, or x32.  */
> >>
> >> -static enum x86_linux_tdesc
> >> -xcr0_to_tdesc_idx (uint64_t xcr0, bool is_x32)
> >> +struct x86_tdesc_feature {
> >> +  /* The cpu feature mask.  This is a mask against an xcr0 value.  */
> >> +  uint64_t feature;
> >
> > Can we elaborate the comment? Why do we need to mask anything?
> > Or maybe we can refer to the table below.
> >
> >
> >> +  /* Is this feature checked when creating an i386 target description.  */
> >> +  bool is_i386;
> >> +
> >> +  /* Is this feature checked when creating an amd64 target description.  */
> >> +  bool is_amd64;
> >> +
> >> +  /* Is this feature checked when creating an x32 target description.  */
> >> +  bool is_x32;
> >> +};
> >> +
> >> +/* A constant table that describes all of the cpu features that are
> >> +   checked when building a target description for i386, amd64, or x32.  */
> >> +
> >
> > I am missing a bit of elaboration on why we can't only rely on XCR0. And
> > the table doesn't describe all CPU features that are checked. It just describes
> > all XSTATE features. There is also segments and orig_rax and in the near future
> > a new register for CET, which are independent of XSTATE.
> 
> I can rename the data structures to make it clearer that this all
> relates to xstate/xcr0.

That would be good.

A side node: Over the last couple of days I remember that we will need something
aside from XCR0 also for AVX10 (basically avx512 but 256-bit registers only) and
probably AMX as well. And APX. Mostly some CPUID bits or a register in the
case of AMX. But I see that you already responded in patch 11 to my fear about
the map.


> I was aware about the segments flag that is
> passed into the tdesc creation, but even with prompting I can't see
> anything related to orig_rax -- how is that fed into the tdesc creation
> process?

Both segments and orig_rax are added unconditionally in amd64-linux-tdep.c
(or the i386 equivalent). They are controlled via the "bool segments" and
"bool is_linux" args of "*_create_target_description()". Unfortunately my
knowledge is pretty spotty on these two as well, and I didn't try to dig
out the patches that added them.


> >
> >
> >> +static constexpr x86_tdesc_feature x86_linux_all_tdesc_features[] = {
> >> +  /* Feature,           i386,	amd64,	x32.  */
> >> +  { X86_XSTATE_PKRU,	true,	true, 	true },
> >> +  { X86_XSTATE_AVX512,	true,	true, 	true },
> >> +  { X86_XSTATE_AVX,	true,	true, 	true },
> >> +  { X86_XSTATE_MPX,	true,	true, 	false },
> >> +  { X86_XSTATE_SSE,	true,	false, 	false },
> >> +  { X86_XSTATE_X87,	true,	false, 	false }
> >> +};
> >
> > I have never looked at IPA code much. But this table looked a bit weird to me.
> > I get that MPX is not supported for x32. Though MPX is already deprecated and
> will
> > be removed once GDB 15 is branched (at least that is the plan).
> 
> You must excuse my lack of up to date knowledge of different x86
> features, but is there not hardware in the wild with the MPX feature?
> Wont removing support for this mean folk will be unable to debug on that
> hardware?

Intel deprecated MPX in 2019 for various reasons:
https://en.wikipedia.org/wiki/Intel_MPX
I think that means no-one really should use MPX anymore, even if their CPU
supports it.

Since GCC, glibc and Linux don't support it anymore for a while, we also
deprecated it:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=7650ea38908e98a5823a334286813eca2ff5719e

Another argument for that deprecation is Intel APX, which has been announced
for future CPUs. It will re-use the MPX XSTATE area. While there is a new/different
bit for APX in the XSTATE header, which would allow us to keep support
for both types of registers, it is still a much needed deprecation in my view.
The code is already complex enough.


> > But why does amd64 not have x87 and SSE? I do see e.g. st0 and xmm0 on
> amd64.
> >
> > After digging a bit:
> > In arch/amd64.c, we call create_feature_i386_64bit_sse() without any check
> for
> > SSE in XCR0. And I see that we have st0 in the "core" registers with EAX etc.
> > But in arch/i386.c it is different, and we add SSE based on X86_XSTATE_SSE.
> > And while st0 is also in the "core" registers with EAX etc., we only add them
> based
> > on X86_XSTATE_X87. To me this looks wrong. Why would CPUs without x87
> > not add EAX? And even if it isn't for some reason, do we really want to support
> > such old CPUs? In my view we could just align the two.
> >
> > If we would do that, and deprecate MPX, the table looks unnecessary. Or did I
> miss
> > something else?
> 
> That looks like a reasonable conclusion.
> 
> Honestly, pretty much everything in this patch is based on the idea of
> merging gdbserver and GDB code, while keeping functionality as similar
> as possible.

Which is a really good thing. Thanks for tackling this!

> >
> > This is a complicated series. Like others I am wondering how many users IPA
> > really has and if it is worth maintaining. But I have no data. Is it in any distros?
> 
> I have no idea.  But I really don't want to tie this series (which fixes
> actual bugs caused by tdesc mismatch between GDB and gdbserver) to the
> removal of IPA.
>

Totally fine with me.

> As far as I can tell the IPA is tested as part of the gdb.trace/
> sub-directory, so it's not like we're removing some untested broken
> corner of GDB.  We'd be removing a somewhat tested, at least minimally
> working, feature.
> 

Ok by me, I was only wondering. It would for sure need a separate discussion
and series.

Thanks,
Felix
Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Sean Fennelly, Jeffrey Schneiderman, Tiffany Doon Silva
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
  
Andrew Burgess May 8, 2024, 1:28 p.m. UTC | #4
"Willgerodt, Felix" <felix.willgerodt@intel.com> writes:

>> -----Original Message-----
>> From: Andrew Burgess <aburgess@redhat.com>
>> Sent: Dienstag, 7. Mai 2024 16:25
>> To: Willgerodt, Felix <felix.willgerodt@intel.com>; gdb-patches@sourceware.org
>> Cc: John Baldwin <jhb@FreeBSD.org>
>> Subject: RE: [PATCHv5 09/11] gdbserver: update target description creation for
>> x86/linux
>> 
>> "Willgerodt, Felix" <felix.willgerodt@intel.com> writes:
>> 
>> >> -----Original Message-----
>> >> From: Andrew Burgess <aburgess@redhat.com>
>> >> Sent: Freitag, 26. April 2024 17:02
>> >> To: gdb-patches@sourceware.org
>> >> Cc: Andrew Burgess <aburgess@redhat.com>; Willgerodt, Felix
>> >> <felix.willgerodt@intel.com>; John Baldwin <jhb@FreeBSD.org>
>> >> Subject: [PATCHv5 09/11] gdbserver: update target description creation for
>> >> x86/linux
>> >>
>> >> This commit is part of a series which aims to share more of the target
>> >> description creation between GDB and gdbserver for x86/Linux.
>> >>
>> >> After some refactoring earlier in this series the shared
>> >> x86_linux_tdesc_for_tid function was added into nat/x86-linux-tdesc.c.
>> >> However, this function still relies on amd64_linux_read_description
>> >> and i386_linux_read_description which are implemented separately for
>> >> both gdbserver and GDB.  Given that at their core, all these functions
>> >> do is:
>> >>
>> >>   1. take an xcr0 value as input,
>> >>   2. mask out some feature bits,
>> >>   3. look for a cached pre-generated target description and return it
>> >>      if found,
>> >>   4. if no cached target description is found then call either
>> >>      amd64_create_target_description or
>> >>      i386_create_target_description to create a new target
>> >>      description, which is then added to the cache.  Return the newly
>> >>      created target description.
>> >>
>> >> The inner functions amd64_create_target_description and
>> >> i386_create_target_description are already shared between GDB and
>> >> gdbserver (in the gdb/arch/ directory), so the only thing that
>> >> the *_read_description functions really do is add the caching layer,
>> >> and it feels like this really could be shared.
>> >>
>> >> However, we have a small problem.
>> >>
>> >> On the GDB side we create target descriptions using a different set of
>> >> cpu features than on the gdbserver side!  This means that for the
>> >> exact same target, we might get a different target description when
>> >> using native GDB vs using gdbserver.  This surely feels like a
>> >> mistake, I would expect to get the same target description on each.
>> >>
>> >> The table below shows the number of possible different target
>> >> descriptions that we can create on the GDB side vs on the gdbserver
>> >> side for each target type:
>> >>
>> >>         | GDB | gdbserver
>> >>   ------|-----|----------
>> >>   i386  | 64  | 7
>> >>   amd64 | 32  | 7
>> >>   x32   | 16  | 7
>> >
>> > I would have somehow expected to have the highest number for 64-bit
>> > CPUs. As it should include 32-bit and x32.
>> > Do you know why that isn't the case and 32-bit has double?
>> > Or is my problem that this is reflecting GDB code, which shares stuff
>> > between amd64 and i386?
>> 
>> The table reflects the different tdesc split by category, not the total
>> number that a user might get on a particular system.
>> 
>> So on an _actual_ i386 CPU, then sure, the user will (on the GDB side)
>> only have a choice of 64 different tdesc.
>> 
>> On an _actual_ amd64 CPU the user could run an i386 compiled binary, in
>> which case GDB will return one of those same 64 tdesc.  Or the user
>> might run an amd64 compiled binary, in which case GDB will return one of
>> the 32 amd64 tdesc.
>> 
>> So yes, from a user's point of view on amd64 there are 96 different
>> tdesc, but GDB splits these into 3 different categories (i386, amd64,
>> and x32), which is what this table represents.
>
> Thanks for explaining. Understood.
>
>
>> > Maybe this is also solved with the comments I made in code below.
>> >
>> >
>> >> So in theory, all I want to do is move the GDB version
>> >> of *_read_description into the arch/ directory and have gdbserver use
>> >> that, then both GDB and gdbserver would be able to create any of the
>> >> possible target descriptions.
>> >>
>> >> Unfortunately it's a little more complex than that due to the in
>> >> process agent (IPA).
>> >>
>> >> When the IPA is in use, gdbserver sends a target description index to
>> >> the IPA, and the IPA uses this to find the correct target description
>> >> to use.
>> >>
>> >> ** START OF AN ASIDE **
>> >>
>> >> Back in the day I suspect this approach made perfect sense.  However
>> >> since this commit:
>> >>
>> >>   commit a8806230241d201f808d856eaae4d44088117b0c
>> >>   Date:   Thu Dec 7 17:07:01 2017 +0000
>> >>
>> >>       Initialize target description early in IPA
>> >>
>> >> I think passing the index is now more trouble than its worth.
>> >>
>> >> We used to pass the index, and then use that index to lookup which
>> >> target description to instantiate and use.  However, the above commit
>> >> fixed an issue where we can't call malloc() within (certain parts of)
>> >> the IPA (apparently), so instead we now pre-compute _every_ possible
>> >> target description within the IPA.  The index is now only used to
>> >> lookup which of the (many) pre-computed target descriptions to use.
>> >>
>> >> It would (I think) have been easier all around if the IPA just
>> >> self-inspected, figured out its own xcr0 value, and used that to
>> >> create the one target description that is required.  So long as the
>> >> xcr0 to target description code is shared (at compile time) with
>> >> gdbserver, then we can be sure that the IPA will derive the same
>> >> target description as gdbserver, and we would avoid all this index
>> >> passing business, which has made this commit so very, very painful.
>> >>
>> >> ** END OF AN ASIDE **
>> >>
>> >> Currently then for x86/linux, gdbserver sends a number between 0 and 7
>> >> to the IPA, and the IPA uses this to create a target description.
>> >>
>> >> However, I am proposing that gdbserver should now create one of (up
>> >> to) 64 different target descriptions for i386, so this 0 to 7 index
>> >> isn't going to be good enough any more (amd64 and x32 have slightly
>> >> fewer possible target descriptions, but still more than 8, so the
>> >> problem is the same).
>> >>
>> >> For a while I wondered if I was going to have to try and find some
>> >> backward compatible solution for this mess.  But after seeing how
>> >> lightly the IPA is actually documented, I wonder if it is not the case
>> >> that there is a tight coupling between a version of gdbserver and a
>> >> version of the IPA?  At least I'm hoping so.
>> >>
>> >> In this commit I have thrown out the old IPA target description index
>> >> numbering scheme, and switched to a completely new numbering scheme.
>> >> Instead of the index that is passed being arbitrary, the index is
>> >> instead calculated from the set of cpu features that are present on
>> >> the target.  Within the IPA we can then reverse this logic to recreate
>> >> the xcr0 value based on the index, and from the xcr0 value we can
>> >> choose the correct target description.
>> >>
>> >> With the gdbserver to IPA numbering scheme issue resolved I have then
>> >> update the gdbserver versions of amd64_linux_read_description and
>> >> i386_linux_read_description so that they create target descriptions
>> >> using the same set of cpu features as GDB itself.
>> >>
>> >> After this gdbserver should now always come up with the same target
>> >> description as GDB does on any x86/Linux target.
>> >>
>> >> This commit does not introduce any new code sharing between GDB and
>> >> gdbserver as previous commits in this series have done.  Instead this
>> >> commit is all about bringing GDB and gdbserver into alignment
>> >> functionally so that the next commit(s) can merge the GDB and
>> >> gdbserver versions of these functions.
>> >>
>> >> Approved-By: John Baldwin <jhb@FreeBSD.org>
>> >> ---
>> >>  gdbserver/linux-amd64-ipa.cc |  43 +----
>> >>  gdbserver/linux-i386-ipa.cc  |  23 +--
>> >>  gdbserver/linux-x86-low.cc   |  15 +-
>> >>  gdbserver/linux-x86-tdesc.cc | 316 +++++++++++++++++++++++++----------
>> >>  gdbserver/linux-x86-tdesc.h  |  49 +++---
>> >>  5 files changed, 278 insertions(+), 168 deletions(-)
>> >>
>> >> diff --git a/gdbserver/linux-amd64-ipa.cc b/gdbserver/linux-amd64-ipa.cc
>> >> index df5e6aca081..0c80812cc6f 100644
>> >> --- a/gdbserver/linux-amd64-ipa.cc
>> >> +++ b/gdbserver/linux-amd64-ipa.cc
>> >> @@ -164,47 +164,21 @@ supply_static_tracepoint_registers (struct regcache
>> >> *regcache,
>> >>
>> >>  #endif /* HAVE_UST */
>> >>
>> >> -#if !defined __ILP32__
>> >> -/* Map the tdesc index to xcr0 mask.  */
>> >> -static uint64_t idx2mask[X86_TDESC_LAST] = {
>> >> -  X86_XSTATE_X87_MASK,
>> >> -  X86_XSTATE_SSE_MASK,
>> >> -  X86_XSTATE_AVX_MASK,
>> >> -  X86_XSTATE_MPX_MASK,
>> >> -  X86_XSTATE_AVX_MPX_MASK,
>> >> -  X86_XSTATE_AVX_AVX512_MASK,
>> >> -  X86_XSTATE_AVX_MPX_AVX512_PKU_MASK,
>> >> -};
>> >> -#endif
>> >> -
>> >>  /* Return target_desc to use for IPA, given the tdesc index passed by
>> >>     gdbserver.  */
>> >>
>> >>  const struct target_desc *
>> >>  get_ipa_tdesc (int idx)
>> >>  {
>> >> -  if (idx >= X86_TDESC_LAST)
>> >> -    {
>> >> -      internal_error ("unknown ipa tdesc index: %d", idx);
>> >> -    }
>> >> +  uint64_t xcr0 = x86_linux_tdesc_idx_to_xcr0 (idx);
>> >>
>> >>  #if defined __ILP32__
>> >> -  switch (idx)
>> >> -    {
>> >> -    case X86_TDESC_SSE:
>> >> -      return amd64_linux_read_description (X86_XSTATE_SSE_MASK, true);
>> >> -    case X86_TDESC_AVX:
>> >> -      return amd64_linux_read_description (X86_XSTATE_AVX_MASK, true);
>> >> -    case X86_TDESC_AVX_AVX512:
>> >> -      return amd64_linux_read_description (X86_XSTATE_AVX_AVX512_MASK,
>> >> true);
>> >> -    default:
>> >> -      break;
>> >> -    }
>> >> +  bool is_x32 = true;
>> >>  #else
>> >> -  return amd64_linux_read_description (idx2mask[idx], false);
>> >> +  bool is_x32 = false;
>> >>  #endif
>> >>
>> >> -  internal_error ("unknown ipa tdesc index: %d", idx);
>> >> +  return amd64_linux_read_description (xcr0, is_x32);
>> >>  }
>> >>
>> >>  /* Allocate buffer for the jump pads.  The branch instruction has a
>> >> @@ -272,11 +246,10 @@ void
>> >>  initialize_low_tracepoint (void)
>> >>  {
>> >>  #if defined __ILP32__
>> >> -  amd64_linux_read_description (X86_XSTATE_SSE_MASK, true);
>> >> -  amd64_linux_read_description (X86_XSTATE_AVX_MASK, true);
>> >> -  amd64_linux_read_description (X86_XSTATE_AVX_AVX512_MASK, true);
>> >> +  for (auto i = 0; i < x86_linux_x32_tdesc_count (); i++)
>> >> +    amd64_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i), true);
>> >>  #else
>> >> -  for (auto i = 0; i < X86_TDESC_LAST; i++)
>> >> -    amd64_linux_read_description (idx2mask[i], false);
>> >> +  for (auto i = 0; i < x86_linux_amd64_tdesc_count (); i++)
>> >> +    amd64_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i), false);
>> >
>> > I know this was already here and I know there are different opinions on this.
>> > But to me using auto here (and in similar locations in this patch) is just wrong.
>> > So I would vote for making this a proper type.
>> > But this is a bit of my personal "pet peeve" ;)
>> >
>> >
>> >>  #endif
>> >>  }
>> >> diff --git a/gdbserver/linux-i386-ipa.cc b/gdbserver/linux-i386-ipa.cc
>> >> index aa346fc9bc3..c1c3152fb04 100644
>> >> --- a/gdbserver/linux-i386-ipa.cc
>> >> +++ b/gdbserver/linux-i386-ipa.cc
>> >> @@ -245,28 +245,15 @@ initialize_fast_tracepoint_trampoline_buffer (void)
>> >>      }
>> >>  }
>> >>
>> >> -/* Map the tdesc index to xcr0 mask.  */
>> >> -static uint64_t idx2mask[X86_TDESC_LAST] = {
>> >> -  X86_XSTATE_X87_MASK,
>> >> -  X86_XSTATE_SSE_MASK,
>> >> -  X86_XSTATE_AVX_MASK,
>> >> -  X86_XSTATE_MPX_MASK,
>> >> -  X86_XSTATE_AVX_MPX_MASK,
>> >> -  X86_XSTATE_AVX_AVX512_MASK,
>> >> -  X86_XSTATE_AVX_MPX_AVX512_PKU_MASK,
>> >> -};
>> >> -
>> >>  /* Return target_desc to use for IPA, given the tdesc index passed by
>> >>     gdbserver.  */
>> >>
>> >>  const struct target_desc *
>> >>  get_ipa_tdesc (int idx)
>> >>  {
>> >> -  if (idx >= X86_TDESC_LAST)
>> >> -    {
>> >> -      internal_error ("unknown ipa tdesc index: %d", idx);
>> >> -    }
>> >> -  return i386_linux_read_description (idx2mask[idx]);
>> >> +  uint64_t xcr0 = x86_linux_tdesc_idx_to_xcr0 (idx);
>> >> +
>> >> +  return i386_linux_read_description (xcr0);
>> >>  }
>> >>
>> >>  /* Allocate buffer for the jump pads.  On i386, we can reach an arbitrary
>> >> @@ -288,6 +275,6 @@ void
>> >>  initialize_low_tracepoint (void)
>> >>  {
>> >>    initialize_fast_tracepoint_trampoline_buffer ();
>> >> -  for (auto i = 0; i < X86_TDESC_LAST; i++)
>> >> -    i386_linux_read_description (idx2mask[i]);
>> >> +  for (auto i = 0; i < x86_linux_i386_tdesc_count (); i++)
>> >> +    i386_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i));
>> >>  }
>> >> diff --git a/gdbserver/linux-x86-low.cc b/gdbserver/linux-x86-low.cc
>> >> index 68d2f13537c..6e23a53118b 100644
>> >> --- a/gdbserver/linux-x86-low.cc
>> >> +++ b/gdbserver/linux-x86-low.cc
>> >> @@ -2882,14 +2882,17 @@ x86_target::get_ipa_tdesc_idx ()
>> >>    struct regcache *regcache = get_thread_regcache (current_thread, 0);
>> >>    const struct target_desc *tdesc = regcache->tdesc;
>> >>
>> >> +  if (!use_xml)
>> >> +    {
>> >> +      if (tdesc == tdesc_i386_linux_no_xml.get ()
>> >>  #ifdef __x86_64__
>> >> -  return amd64_get_ipa_tdesc_idx (tdesc);
>> >> -#endif
>> >> -
>> >> -  if (tdesc == tdesc_i386_linux_no_xml.get ())
>> >> -    return X86_TDESC_SSE;
>> >> +	  || tdesc == tdesc_amd64_linux_no_xml.get ()
>> >> +#endif /* __x86_64__ */
>> >> +	  )
>> >> +	return x86_linux_xcr0_to_tdesc_idx (X86_XSTATE_SSE_MASK);
>> >
>> > What happens if neither of the two is true? Do we need to assert this?
>> > Or do we need to just return the X86_XSTATE_SSE_MASK index
>> > without checking, as this can't ever happen?
>> 
>> You're right.  I've changed the 'if' into 'gdb_assert', and made the
>> return of an index based on X86_XSTATE_SSE_MASK unconditional (within
>> the '!use_xml' block.
>> 
>> >
>> >
>> >> +    }
>> >>
>> >> -  return i386_get_ipa_tdesc_idx (tdesc);
>> >> +  return x86_linux_xcr0_to_tdesc_idx (xcr0_storage);
>> >>  }
>> >>
>> >>  /* The linux target ops object.  */
>> >> diff --git a/gdbserver/linux-x86-tdesc.cc b/gdbserver/linux-x86-tdesc.cc
>> >> index af3735aa895..5e12526bf17 100644
>> >> --- a/gdbserver/linux-x86-tdesc.cc
>> >> +++ b/gdbserver/linux-x86-tdesc.cc
>> >> @@ -28,96 +28,278 @@
>> >>  #include "x86-tdesc.h"
>> >>  #include "arch/i386-linux-tdesc.h"
>> >>
>> >> -/* Return the right x86_linux_tdesc index for a given XCR0.  Return
>> >> -   X86_TDESC_LAST if can't find a match.  */
>> >> +/* A structure used to describe a single cpu feature that might, or might
>> >> +   not, be checked for when creating a target description for one of i386,
>> >> +   amd64, or x32.  */
>> >>
>> >> -static enum x86_linux_tdesc
>> >> -xcr0_to_tdesc_idx (uint64_t xcr0, bool is_x32)
>> >> +struct x86_tdesc_feature {
>> >> +  /* The cpu feature mask.  This is a mask against an xcr0 value.  */
>> >> +  uint64_t feature;
>> >
>> > Can we elaborate the comment? Why do we need to mask anything?
>> > Or maybe we can refer to the table below.
>> >
>> >
>> >> +  /* Is this feature checked when creating an i386 target description.  */
>> >> +  bool is_i386;
>> >> +
>> >> +  /* Is this feature checked when creating an amd64 target description.  */
>> >> +  bool is_amd64;
>> >> +
>> >> +  /* Is this feature checked when creating an x32 target description.  */
>> >> +  bool is_x32;
>> >> +};
>> >> +
>> >> +/* A constant table that describes all of the cpu features that are
>> >> +   checked when building a target description for i386, amd64, or x32.  */
>> >> +
>> >
>> > I am missing a bit of elaboration on why we can't only rely on XCR0. And
>> > the table doesn't describe all CPU features that are checked. It just describes
>> > all XSTATE features. There is also segments and orig_rax and in the near future
>> > a new register for CET, which are independent of XSTATE.
>> 
>> I can rename the data structures to make it clearer that this all
>> relates to xstate/xcr0.
>
> That would be good.

I've done this in the new version that I'll post shortly.

>
> A side node: Over the last couple of days I remember that we will need something
> aside from XCR0 also for AVX10 (basically avx512 but 256-bit registers only) and
> probably AMX as well. And APX. Mostly some CPUID bits or a register in the
> case of AMX. But I see that you already responded in patch 11 to my fear about
> the map.
>
>
>> I was aware about the segments flag that is
>> passed into the tdesc creation, but even with prompting I can't see
>> anything related to orig_rax -- how is that fed into the tdesc creation
>> process?
>
> Both segments and orig_rax are added unconditionally in amd64-linux-tdep.c
> (or the i386 equivalent). They are controlled via the "bool segments" and
> "bool is_linux" args of "*_create_target_description()". Unfortunately my
> knowledge is pretty spotty on these two as well, and I didn't try to dig
> out the patches that added them.

I did eventually figure out that this is what we were talking about.
And the reason that this doesn't matter, as far as the caching in this
series is concerned, is that these fields always take fixed values for
Linux targets, which is the only target this series touches.

As I said in the patch #11 thread, beyond this series I definitely want
to look into merging _all_ x86 target description caching, for all
sub-architecture variants (i386, amd64, x32), on all OS (Linux, FreeBSD,
etc), and this will mean the caching needs to start considering things
outside of xcr0.

The plan I have in mind would 100% be extendable to cover the AVX10 case
you mention above.

But for me, step #1 is merging the gdb/gdbserver code, then we can start
expanding to cover more cases.

>
>
>> >
>> >
>> >> +static constexpr x86_tdesc_feature x86_linux_all_tdesc_features[] = {
>> >> +  /* Feature,           i386,	amd64,	x32.  */
>> >> +  { X86_XSTATE_PKRU,	true,	true, 	true },
>> >> +  { X86_XSTATE_AVX512,	true,	true, 	true },
>> >> +  { X86_XSTATE_AVX,	true,	true, 	true },
>> >> +  { X86_XSTATE_MPX,	true,	true, 	false },
>> >> +  { X86_XSTATE_SSE,	true,	false, 	false },
>> >> +  { X86_XSTATE_X87,	true,	false, 	false }
>> >> +};
>> >
>> > I have never looked at IPA code much. But this table looked a bit weird to me.
>> > I get that MPX is not supported for x32. Though MPX is already deprecated and
>> will
>> > be removed once GDB 15 is branched (at least that is the plan).
>> 
>> You must excuse my lack of up to date knowledge of different x86
>> features, but is there not hardware in the wild with the MPX feature?
>> Wont removing support for this mean folk will be unable to debug on that
>> hardware?
>
> Intel deprecated MPX in 2019 for various reasons:
> https://en.wikipedia.org/wiki/Intel_MPX
> I think that means no-one really should use MPX anymore, even if their CPU
> supports it.
>
> Since GCC, glibc and Linux don't support it anymore for a while, we also
> deprecated it:
> https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=7650ea38908e98a5823a334286813eca2ff5719e
>
> Another argument for that deprecation is Intel APX, which has been announced
> for future CPUs. It will re-use the MPX XSTATE area. While there is a new/different
> bit for APX in the XSTATE header, which would allow us to keep support
> for both types of registers, it is still a much needed deprecation in my view.
> The code is already complex enough.

I'm happy to defer to your @intel email address :)  So long as we're not
going to leave users of older hardware in a bind then I'm good.

Thanks,
Andrew

>
>
>> > But why does amd64 not have x87 and SSE? I do see e.g. st0 and xmm0 on
>> amd64.
>> >
>> > After digging a bit:
>> > In arch/amd64.c, we call create_feature_i386_64bit_sse() without any check
>> for
>> > SSE in XCR0. And I see that we have st0 in the "core" registers with EAX etc.
>> > But in arch/i386.c it is different, and we add SSE based on X86_XSTATE_SSE.
>> > And while st0 is also in the "core" registers with EAX etc., we only add them
>> based
>> > on X86_XSTATE_X87. To me this looks wrong. Why would CPUs without x87
>> > not add EAX? And even if it isn't for some reason, do we really want to support
>> > such old CPUs? In my view we could just align the two.
>> >
>> > If we would do that, and deprecate MPX, the table looks unnecessary. Or did I
>> miss
>> > something else?
>> 
>> That looks like a reasonable conclusion.
>> 
>> Honestly, pretty much everything in this patch is based on the idea of
>> merging gdbserver and GDB code, while keeping functionality as similar
>> as possible.
>
> Which is a really good thing. Thanks for tackling this!
>
>> >
>> > This is a complicated series. Like others I am wondering how many users IPA
>> > really has and if it is worth maintaining. But I have no data. Is it in any distros?
>> 
>> I have no idea.  But I really don't want to tie this series (which fixes
>> actual bugs caused by tdesc mismatch between GDB and gdbserver) to the
>> removal of IPA.
>>
>
> Totally fine with me.
>
>> As far as I can tell the IPA is tested as part of the gdb.trace/
>> sub-directory, so it's not like we're removing some untested broken
>> corner of GDB.  We'd be removing a somewhat tested, at least minimally
>> working, feature.
>> 
>
> Ok by me, I was only wondering. It would for sure need a separate discussion
> and series.
>
> Thanks,
> Felix
> Intel Deutschland GmbH
> Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
> Tel: +49 89 99 8853-0, www.intel.de
> Managing Directors: Sean Fennelly, Jeffrey Schneiderman, Tiffany Doon Silva
> Chairperson of the Supervisory Board: Nicole Lau
> Registered Office: Munich
> Commercial Register: Amtsgericht Muenchen HRB 186928
  

Patch

diff --git a/gdbserver/linux-amd64-ipa.cc b/gdbserver/linux-amd64-ipa.cc
index df5e6aca081..0c80812cc6f 100644
--- a/gdbserver/linux-amd64-ipa.cc
+++ b/gdbserver/linux-amd64-ipa.cc
@@ -164,47 +164,21 @@  supply_static_tracepoint_registers (struct regcache *regcache,
 
 #endif /* HAVE_UST */
 
-#if !defined __ILP32__
-/* Map the tdesc index to xcr0 mask.  */
-static uint64_t idx2mask[X86_TDESC_LAST] = {
-  X86_XSTATE_X87_MASK,
-  X86_XSTATE_SSE_MASK,
-  X86_XSTATE_AVX_MASK,
-  X86_XSTATE_MPX_MASK,
-  X86_XSTATE_AVX_MPX_MASK,
-  X86_XSTATE_AVX_AVX512_MASK,
-  X86_XSTATE_AVX_MPX_AVX512_PKU_MASK,
-};
-#endif
-
 /* Return target_desc to use for IPA, given the tdesc index passed by
    gdbserver.  */
 
 const struct target_desc *
 get_ipa_tdesc (int idx)
 {
-  if (idx >= X86_TDESC_LAST)
-    {
-      internal_error ("unknown ipa tdesc index: %d", idx);
-    }
+  uint64_t xcr0 = x86_linux_tdesc_idx_to_xcr0 (idx);
 
 #if defined __ILP32__
-  switch (idx)
-    {
-    case X86_TDESC_SSE:
-      return amd64_linux_read_description (X86_XSTATE_SSE_MASK, true);
-    case X86_TDESC_AVX:
-      return amd64_linux_read_description (X86_XSTATE_AVX_MASK, true);
-    case X86_TDESC_AVX_AVX512:
-      return amd64_linux_read_description (X86_XSTATE_AVX_AVX512_MASK, true);
-    default:
-      break;
-    }
+  bool is_x32 = true;
 #else
-  return amd64_linux_read_description (idx2mask[idx], false);
+  bool is_x32 = false;
 #endif
 
-  internal_error ("unknown ipa tdesc index: %d", idx);
+  return amd64_linux_read_description (xcr0, is_x32);
 }
 
 /* Allocate buffer for the jump pads.  The branch instruction has a
@@ -272,11 +246,10 @@  void
 initialize_low_tracepoint (void)
 {
 #if defined __ILP32__
-  amd64_linux_read_description (X86_XSTATE_SSE_MASK, true);
-  amd64_linux_read_description (X86_XSTATE_AVX_MASK, true);
-  amd64_linux_read_description (X86_XSTATE_AVX_AVX512_MASK, true);
+  for (auto i = 0; i < x86_linux_x32_tdesc_count (); i++)
+    amd64_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i), true);
 #else
-  for (auto i = 0; i < X86_TDESC_LAST; i++)
-    amd64_linux_read_description (idx2mask[i], false);
+  for (auto i = 0; i < x86_linux_amd64_tdesc_count (); i++)
+    amd64_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i), false);
 #endif
 }
diff --git a/gdbserver/linux-i386-ipa.cc b/gdbserver/linux-i386-ipa.cc
index aa346fc9bc3..c1c3152fb04 100644
--- a/gdbserver/linux-i386-ipa.cc
+++ b/gdbserver/linux-i386-ipa.cc
@@ -245,28 +245,15 @@  initialize_fast_tracepoint_trampoline_buffer (void)
     }
 }
 
-/* Map the tdesc index to xcr0 mask.  */
-static uint64_t idx2mask[X86_TDESC_LAST] = {
-  X86_XSTATE_X87_MASK,
-  X86_XSTATE_SSE_MASK,
-  X86_XSTATE_AVX_MASK,
-  X86_XSTATE_MPX_MASK,
-  X86_XSTATE_AVX_MPX_MASK,
-  X86_XSTATE_AVX_AVX512_MASK,
-  X86_XSTATE_AVX_MPX_AVX512_PKU_MASK,
-};
-
 /* Return target_desc to use for IPA, given the tdesc index passed by
    gdbserver.  */
 
 const struct target_desc *
 get_ipa_tdesc (int idx)
 {
-  if (idx >= X86_TDESC_LAST)
-    {
-      internal_error ("unknown ipa tdesc index: %d", idx);
-    }
-  return i386_linux_read_description (idx2mask[idx]);
+  uint64_t xcr0 = x86_linux_tdesc_idx_to_xcr0 (idx);
+
+  return i386_linux_read_description (xcr0);
 }
 
 /* Allocate buffer for the jump pads.  On i386, we can reach an arbitrary
@@ -288,6 +275,6 @@  void
 initialize_low_tracepoint (void)
 {
   initialize_fast_tracepoint_trampoline_buffer ();
-  for (auto i = 0; i < X86_TDESC_LAST; i++)
-    i386_linux_read_description (idx2mask[i]);
+  for (auto i = 0; i < x86_linux_i386_tdesc_count (); i++)
+    i386_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i));
 }
diff --git a/gdbserver/linux-x86-low.cc b/gdbserver/linux-x86-low.cc
index 68d2f13537c..6e23a53118b 100644
--- a/gdbserver/linux-x86-low.cc
+++ b/gdbserver/linux-x86-low.cc
@@ -2882,14 +2882,17 @@  x86_target::get_ipa_tdesc_idx ()
   struct regcache *regcache = get_thread_regcache (current_thread, 0);
   const struct target_desc *tdesc = regcache->tdesc;
 
+  if (!use_xml)
+    {
+      if (tdesc == tdesc_i386_linux_no_xml.get ()
 #ifdef __x86_64__
-  return amd64_get_ipa_tdesc_idx (tdesc);
-#endif
-
-  if (tdesc == tdesc_i386_linux_no_xml.get ())
-    return X86_TDESC_SSE;
+	  || tdesc == tdesc_amd64_linux_no_xml.get ()
+#endif /* __x86_64__ */
+	  )
+	return x86_linux_xcr0_to_tdesc_idx (X86_XSTATE_SSE_MASK);
+    }
 
-  return i386_get_ipa_tdesc_idx (tdesc);
+  return x86_linux_xcr0_to_tdesc_idx (xcr0_storage);
 }
 
 /* The linux target ops object.  */
diff --git a/gdbserver/linux-x86-tdesc.cc b/gdbserver/linux-x86-tdesc.cc
index af3735aa895..5e12526bf17 100644
--- a/gdbserver/linux-x86-tdesc.cc
+++ b/gdbserver/linux-x86-tdesc.cc
@@ -28,96 +28,278 @@ 
 #include "x86-tdesc.h"
 #include "arch/i386-linux-tdesc.h"
 
-/* Return the right x86_linux_tdesc index for a given XCR0.  Return
-   X86_TDESC_LAST if can't find a match.  */
+/* A structure used to describe a single cpu feature that might, or might
+   not, be checked for when creating a target description for one of i386,
+   amd64, or x32.  */
 
-static enum x86_linux_tdesc
-xcr0_to_tdesc_idx (uint64_t xcr0, bool is_x32)
+struct x86_tdesc_feature {
+  /* The cpu feature mask.  This is a mask against an xcr0 value.  */
+  uint64_t feature;
+
+  /* Is this feature checked when creating an i386 target description.  */
+  bool is_i386;
+
+  /* Is this feature checked when creating an amd64 target description.  */
+  bool is_amd64;
+
+  /* Is this feature checked when creating an x32 target description.  */
+  bool is_x32;
+};
+
+/* A constant table that describes all of the cpu features that are
+   checked when building a target description for i386, amd64, or x32.  */
+
+static constexpr x86_tdesc_feature x86_linux_all_tdesc_features[] = {
+  /* Feature,           i386,	amd64,	x32.  */
+  { X86_XSTATE_PKRU,	true,	true, 	true },
+  { X86_XSTATE_AVX512,	true,	true, 	true },
+  { X86_XSTATE_AVX,	true,	true, 	true },
+  { X86_XSTATE_MPX,	true,	true, 	false },
+  { X86_XSTATE_SSE,	true,	false, 	false },
+  { X86_XSTATE_X87,	true,	false, 	false }
+};
+
+/* Return a compile time constant which is a mask of all the cpu features
+   that are checked for when building an i386 target description.  */
+
+static constexpr uint64_t
+x86_linux_i386_tdesc_feature_mask ()
 {
-  if (xcr0 & X86_XSTATE_PKRU)
-    {
-      if (is_x32)
-	{
-	  /* No x32 MPX and PKU, fall back to avx_avx512.  */
-	  return X86_TDESC_AVX_AVX512;
-	}
-      else
-	return X86_TDESC_AVX_MPX_AVX512_PKU;
-    }
-  else if (xcr0 & X86_XSTATE_AVX512)
-    return X86_TDESC_AVX_AVX512;
-  else if ((xcr0 & X86_XSTATE_AVX_MPX_MASK) == X86_XSTATE_AVX_MPX_MASK)
+  uint64_t mask = 0;
+
+  for (const auto &entry : x86_linux_all_tdesc_features)
+    if (entry.is_i386)
+      mask |= entry.feature;
+
+  return mask;
+}
+
+/* Return a compile time constant which is a mask of all the cpu features
+   that are checked for when building an amd64 target description.  */
+
+static constexpr uint64_t
+x86_linux_amd64_tdesc_feature_mask ()
+{
+  uint64_t mask = 0;
+
+  for (const auto &entry : x86_linux_all_tdesc_features)
+    if (entry.is_amd64)
+      mask |= entry.feature;
+
+  return mask;
+}
+
+/* Return a compile time constant which is a mask of all the cpu features
+   that are checked for when building an x32 target description.  */
+
+static constexpr uint64_t
+x86_linux_x32_tdesc_feature_mask ()
+{
+  uint64_t mask = 0;
+
+  for (const auto &entry : x86_linux_all_tdesc_features)
+    if (entry.is_x32)
+      mask |= entry.feature;
+
+  return mask;
+}
+
+/* Return a compile time constant which is a count of the number of cpu
+   features that are checked for when building an i386 target description.  */
+
+static constexpr int
+x86_linux_i386_tdesc_count_1 ()
+{
+  uint64_t count = 0;
+
+  for (const auto &entry : x86_linux_all_tdesc_features)
+    if (entry.is_i386)
+      ++count;
+
+  gdb_assert (count > 0);
+
+  return (1 << count);
+}
+
+/* Return a compile time constant which is a count of the number of cpu
+   features that are checked for when building an amd64 target description.  */
+
+static constexpr int
+x86_linux_amd64_tdesc_count_1 ()
+{
+  uint64_t count = 0;
+
+  for (const auto &entry : x86_linux_all_tdesc_features)
+    if (entry.is_amd64)
+      ++count;
+
+  gdb_assert (count > 0);
+
+  return (1 << count);
+}
+
+/* Return a compile time constant which is a count of the number of cpu
+   features that are checked for when building an x32 target description.  */
+
+static constexpr int
+x86_linux_x32_tdesc_count_1 ()
+{
+  uint64_t count = 0;
+
+  for (const auto &entry : x86_linux_all_tdesc_features)
+    if (entry.is_x32)
+      ++count;
+
+  gdb_assert (count > 0);
+
+  return (1 << count);
+}
+
+#ifdef IN_PROCESS_AGENT
+
+/* See linux-x86-tdesc.h.  */
+
+int
+x86_linux_amd64_tdesc_count ()
+{
+  return x86_linux_amd64_tdesc_count_1 ();
+}
+
+/* See linux-x86-tdesc.h.  */
+
+int
+x86_linux_x32_tdesc_count ()
+{
+  return x86_linux_x32_tdesc_count_1 ();
+}
+
+/* See linux-x86-tdesc.h.  */
+
+int
+x86_linux_i386_tdesc_count ()
+{
+  return x86_linux_i386_tdesc_count_1 ();
+}
+
+#endif /* IN_PROCESS_AGENT */
+
+/* Convert an xcr0 value into an integer.  The integer will be passed to
+   the in-process-agent where it will then be passed to
+   x86_linux_tdesc_idx_to_xcr0 to get back the xcr0 value.  */
+
+int
+x86_linux_xcr0_to_tdesc_idx (uint64_t xcr0)
+{
+  /* The following table shows which features are checked for when creating
+     the target descriptions (see nat/x86-linux-tdesc.c), the feature order
+     represents the bit order within the generated index number.
+
+     i386  | x87 sse mpx avx avx512 pkru
+     amd64 |         mpx avx avx512 pkru
+     i32   |             avx avx512 pkru
+
+     The features are ordered so that for each mode (i386, amd64, i32) the
+     generated index will form a continuous range.  */
+
+  int idx = 0;
+
+  for (int i = 0; i < ARRAY_SIZE (x86_linux_all_tdesc_features); ++i)
     {
-      if (is_x32) /* No MPX on x32.  */
-	return X86_TDESC_AVX;
-      else
-	return X86_TDESC_AVX_MPX;
+     if ((xcr0 & x86_linux_all_tdesc_features[i].feature)
+	  == x86_linux_all_tdesc_features[i].feature)
+	idx |= (1 << i);
     }
-  else if (xcr0 & X86_XSTATE_MPX)
+
+  return idx;
+}
+
+
+#ifdef IN_PROCESS_AGENT
+
+/* Convert an index number (as returned from x86_linux_xcr0_to_tdesc_idx)
+   into an xcr0 value which can then be used to create a target
+   description.  */
+
+uint64_t
+x86_linux_tdesc_idx_to_xcr0 (int idx)
+{
+  uint64_t xcr0 = 0;
+
+  for (int i = 0; i < ARRAY_SIZE (x86_linux_all_tdesc_features); ++i)
     {
-      if (is_x32) /* No MPX on x32.  */
-	return X86_TDESC_AVX;
-      else
-	return X86_TDESC_MPX;
+      if ((idx & (1 << i)) != 0)
+	xcr0 |= x86_linux_all_tdesc_features[i].feature;
     }
-  else if (xcr0 & X86_XSTATE_AVX)
-    return X86_TDESC_AVX;
-  else if (xcr0 & X86_XSTATE_SSE)
-    return X86_TDESC_SSE;
-  else if (xcr0 & X86_XSTATE_X87)
-    return X86_TDESC_MMX;
-  else
-    return X86_TDESC_LAST;
+
+  return xcr0;
 }
 
+#endif /* IN_PROCESS_AGENT */
+
 #if defined __i386__ || !defined IN_PROCESS_AGENT
 
-static struct target_desc *i386_tdescs[X86_TDESC_LAST] = { };
+/* A cache of all possible i386 target descriptions.  */
 
-/* Return the target description according to XCR0.  */
+static struct target_desc *i386_tdescs[x86_linux_i386_tdesc_count_1 ()] = { };
+
+/* See nat/x86-linux-tdesc.h.  */
 
 const struct target_desc *
 i386_linux_read_description (uint64_t xcr0)
 {
-  enum x86_linux_tdesc idx = xcr0_to_tdesc_idx (xcr0, false);
+  xcr0 &= x86_linux_i386_tdesc_feature_mask ();
+  int idx = x86_linux_xcr0_to_tdesc_idx (xcr0);
 
-  if (idx == X86_TDESC_LAST)
-    return NULL;
+  gdb_assert (idx >= 0 && idx < x86_linux_i386_tdesc_count_1 ());
 
-  struct target_desc **tdesc = &i386_tdescs[idx];
+  target_desc **tdesc = &i386_tdescs[idx];
 
-  if (*tdesc == NULL)
+  if (*tdesc == nullptr)
     {
       *tdesc = i386_create_target_description (xcr0, true, false);
 
       init_target_desc (*tdesc, i386_expedite_regs);
     }
 
-  return *tdesc;;
+  return *tdesc;
 }
 #endif
 
 #ifdef __x86_64__
 
-static target_desc *amd64_tdescs[X86_TDESC_LAST] = { };
-static target_desc *x32_tdescs[X86_TDESC_LAST] = { };
+/* A cache of all possible amd64 target descriptions.  */
+
+static target_desc *amd64_tdescs[x86_linux_amd64_tdesc_count_1 ()] = { };
+
+/* A cache of all possible x32 target descriptions.  */
+
+static target_desc *x32_tdescs[x86_linux_x32_tdesc_count_1 ()] = { };
+
+/* See nat/x86-linux-tdesc.h.  */
 
 const struct target_desc *
 amd64_linux_read_description (uint64_t xcr0, bool is_x32)
 {
-  enum x86_linux_tdesc idx = xcr0_to_tdesc_idx (xcr0, is_x32);
+  if (is_x32)
+    xcr0 &= x86_linux_x32_tdesc_feature_mask ();
+  else
+    xcr0 &= x86_linux_amd64_tdesc_feature_mask ();
+
+  int idx = x86_linux_xcr0_to_tdesc_idx (xcr0);
 
-  if (idx == X86_TDESC_LAST)
-    return NULL;
+  if (is_x32)
+    gdb_assert (idx >= 0 && idx < x86_linux_x32_tdesc_count_1 ());
+  else
+    gdb_assert (idx >= 0 && idx < x86_linux_amd64_tdesc_count_1 ());
 
-  struct target_desc **tdesc = NULL;
+  target_desc **tdesc = nullptr;
 
   if (is_x32)
     tdesc = &x32_tdescs[idx];
   else
     tdesc = &amd64_tdescs[idx];
 
-  if (*tdesc == NULL)
+  if (*tdesc == nullptr)
     {
       *tdesc = amd64_create_target_description (xcr0, is_x32, true, true);
 
@@ -127,39 +309,3 @@  amd64_linux_read_description (uint64_t xcr0, bool is_x32)
 }
 
 #endif
-
-#ifndef IN_PROCESS_AGENT
-
-int
-i386_get_ipa_tdesc_idx (const struct target_desc *tdesc)
-{
-  for (int i = 0; i < X86_TDESC_LAST; i++)
-    {
-      if (tdesc == i386_tdescs[i])
-	return i;
-    }
-
-  /* If none tdesc is found, return the one with minimum features.  */
-  return X86_TDESC_MMX;
-}
-
-#if defined __x86_64__
-int
-amd64_get_ipa_tdesc_idx (const struct target_desc *tdesc)
-{
-  for (int i = 0; i < X86_TDESC_LAST; i++)
-    {
-      if (tdesc == amd64_tdescs[i])
-	return i;
-    }
-  for (int i = 0; i < X86_TDESC_LAST; i++)
-    {
-      if (tdesc == x32_tdescs[i])
-	return i;
-    }
-
-  return X86_TDESC_SSE;
-}
-
-#endif
-#endif
diff --git a/gdbserver/linux-x86-tdesc.h b/gdbserver/linux-x86-tdesc.h
index 576aaf5e165..3d6e0e51833 100644
--- a/gdbserver/linux-x86-tdesc.h
+++ b/gdbserver/linux-x86-tdesc.h
@@ -21,29 +21,30 @@ 
 #ifndef GDBSERVER_LINUX_X86_TDESC_H
 #define GDBSERVER_LINUX_X86_TDESC_H
 
-/* Note: since IPA obviously knows what ABI it's running on (i386 vs x86_64
-   vs x32), it's sufficient to pass only the register set here.  This,
-   together with the ABI known at IPA compile time, maps to a tdesc.  */
-
-enum x86_linux_tdesc {
-  X86_TDESC_MMX = 0,
-  X86_TDESC_SSE = 1,
-  X86_TDESC_AVX = 2,
-  X86_TDESC_MPX = 3,
-  X86_TDESC_AVX_MPX = 4,
-  X86_TDESC_AVX_AVX512 = 5,
-  X86_TDESC_AVX_MPX_AVX512_PKU = 6,
-  X86_TDESC_LAST = 7,
-};
-
-#if defined __i386__ || !defined IN_PROCESS_AGENT
-int i386_get_ipa_tdesc_idx (const struct target_desc *tdesc);
-#endif
-
-#if defined __x86_64__ && !defined IN_PROCESS_AGENT
-int amd64_get_ipa_tdesc_idx (const struct target_desc *tdesc);
-#endif
-
-const struct target_desc *i386_get_ipa_tdesc (int idx);
+/* Convert an xcr0 value into an integer.  The integer will be passed to
+   the in-process-agent where it will then be passed to
+   x86_linux_tdesc_idx_to_xcr0 to get back the xcr0 value.  */
+
+extern int x86_linux_xcr0_to_tdesc_idx (uint64_t xcr0);
+
+#ifdef IN_PROCESS_AGENT
+
+/* Convert an index number (as returned from x86_linux_xcr0_to_tdesc_idx)
+   into an xcr0 value which can then be used to create a target
+   description.  */
+
+extern uint64_t x86_linux_tdesc_idx_to_xcr0 (int idx);
+
+/* Within the in-process-agent we need to pre-initialise all of the target
+   descriptions, to do this we need to know how many target descriptions
+   there are for each different target type.  These functions return the
+   target description count for the relevant target.  */
+
+extern int x86_linux_amd64_tdesc_count ();
+extern int x86_linux_x32_tdesc_count ();
+extern int x86_linux_i386_tdesc_count ();
+
+
+#endif /* IN_PROCESS_AGENT */
 
 #endif /* GDBSERVER_LINUX_X86_TDESC_H */