[v2,03/17,gdb/gdbserver] refactor: Simplify SVE interface to read/write registers

Message ID 20230519102508.14020-4-luis.machado@arm.com
State New
Headers
Series SME support for AArch64 gdb/gdbserver on Linux |

Commit Message

Luis Machado May 19, 2023, 10:24 a.m. UTC
  This is a patch in preparation to upcoming patches enabling SME support.  It
attempts to simplify the gdb/gdbserver shared interface used to read/write
SVE registers.

Where the current code makes use of unique_ptr, allocating a new buffer by
hand and passing a buffer around, this patch makes that code use
gdb::byte_vector and passes a reference to this byte vector to the functions,
allowing the functions to have ready access to the size of the buffer.

It also shares a bit more code between gdb and gdbserver, in particular around
handling of ptrace get/set requests for SVE.

I think gdbserver could be refactored to handle register reads/writes more
like gdb's native layer as opposed to letting the generic linux-low layer do
the ptrace calls.  This is not very flexible and assumes one size for the
responses.  If you have something like NT_ARM_SVE, where you can have either
FPSIMD or SVE contents, it doesn't work that well.

I didn't want to change that interface right now as it is a bit too much work
and touches all the targets, some of which I can't easily test.

Hence the reason why the buffer the generic linux-now passes down to
linux-aarch64-low is unused or ignored.

No user-visible changes should happen as part of this refactor other than a
slightly reworded warning message.

While doing the refactor, I also noticed what seems to be a mistake in checking
if the register cache contains active (non-zero) SVE data.

For instance, the original code did something like this in
aarch64_sve_regs_copy_from_reg_buf:

has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_Z0_REGNUM + i
				       reg, sizeof (__int128_t));

"reg" is a zeroed-out buffer that we compare the Z register contents
past the first 128 bits.  The problem here is that raw_compare returns
1 if the contents compare the same, which means has_sve_state will be
true.  But if we compared the Z register contents to 0, it means we
*do not* have SVE state, and therefore has_sve_state should be false.

The consequence of this mistake is that we convert the initial
FPSIMD-formatted data we get from ptrace for the NT_ARM_SVE register
set to a SVE-formatted one.

In the end, this doesn't cause user-visible differences because the
values of both the Z and V registers will still be the same.  But the
logic is not correct.

I used the opportunity to fix this, and it gets tested later on by
the additional SME tests.

I do plan on submitting some SVE-specific tests to make sure we have
a bit more coverage in GDB's testsuite.

Regression-tested on aarch64-linux Ubuntu 22.04/20.04.
---
 gdb/aarch64-linux-nat.c                 |  30 +---
 gdb/nat/aarch64-scalable-linux-ptrace.c | 203 +++++++++++++++---------
 gdb/nat/aarch64-scalable-linux-ptrace.h |  28 ++--
 gdbserver/linux-aarch64-low.cc          |  24 ++-
 4 files changed, 175 insertions(+), 110 deletions(-)
  

Patch

diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
index 0dd9652d001..a775631e9ec 100644
--- a/gdb/aarch64-linux-nat.c
+++ b/gdb/aarch64-linux-nat.c
@@ -319,9 +319,8 @@  store_fpregs_to_thread (const struct regcache *regcache)
 static void
 fetch_sveregs_from_thread (struct regcache *regcache)
 {
-  std::unique_ptr<gdb_byte[]> base
-    = aarch64_sve_get_sveregs (regcache->ptid ().lwp ());
-  aarch64_sve_regs_copy_to_reg_buf (regcache, base.get ());
+  /* Fetch SVE state from the thread and copy it into the register cache.  */
+  aarch64_sve_regs_copy_to_reg_buf (regcache->ptid ().lwp (), regcache);
 }
 
 /* Store to the current thread the valid sve register
@@ -330,28 +329,9 @@  fetch_sveregs_from_thread (struct regcache *regcache)
 static void
 store_sveregs_to_thread (struct regcache *regcache)
 {
-  int ret;
-  struct iovec iovec;
-  int tid = regcache->ptid ().lwp ();
-
-  /* First store vector length to the thread.  This is done first to ensure the
-     ptrace buffers read from the kernel are the correct size.  */
-  if (!aarch64_sve_set_vq (tid, regcache))
-    perror_with_name (_("Unable to set VG register"));
-
-  /* Obtain a dump of SVE registers from ptrace.  */
-  std::unique_ptr<gdb_byte[]> base = aarch64_sve_get_sveregs (tid);
-
-  /* Overwrite with regcache state.  */
-  aarch64_sve_regs_copy_from_reg_buf (regcache, base.get ());
-
-  /* Write back to the kernel.  */
-  iovec.iov_base = base.get ();
-  iovec.iov_len = ((struct user_sve_header *) base.get ())->size;
-  ret = ptrace (PTRACE_SETREGSET, tid, NT_ARM_SVE, &iovec);
-
-  if (ret < 0)
-    perror_with_name (_("Unable to store sve registers"));
+  /* Fetch SVE state from the register cache and update the thread TID with
+     it.  */
+  aarch64_sve_regs_copy_from_reg_buf (regcache->ptid ().lwp (), regcache);
 }
 
 /* Fill GDB's register array with the pointer authentication mask values from
diff --git a/gdb/nat/aarch64-scalable-linux-ptrace.c b/gdb/nat/aarch64-scalable-linux-ptrace.c
index cc43f510892..192eebcda19 100644
--- a/gdb/nat/aarch64-scalable-linux-ptrace.c
+++ b/gdb/nat/aarch64-scalable-linux-ptrace.c
@@ -120,28 +120,44 @@  aarch64_sve_set_vq (int tid, struct reg_buffer_common *reg_buf)
 
 /* See nat/aarch64-scalable-linux-ptrace.h.  */
 
-std::unique_ptr<gdb_byte[]>
-aarch64_sve_get_sveregs (int tid)
+gdb::byte_vector
+aarch64_fetch_sve_regset (int tid)
 {
-  struct iovec iovec;
   uint64_t vq = aarch64_sve_get_vq (tid);
 
   if (vq == 0)
-    perror_with_name (_("Unable to fetch SVE register header"));
+    perror_with_name (_("Unable to fetch SVE vector length"));
 
   /* A ptrace call with NT_ARM_SVE will return a header followed by either a
      dump of all the SVE and FP registers, or an fpsimd structure (identical to
      the one returned by NT_FPREGSET) if the kernel has not yet executed any
      SVE code.  Make sure we allocate enough space for a full SVE dump.  */
 
-  iovec.iov_len = SVE_PT_SIZE (vq, SVE_PT_REGS_SVE);
-  std::unique_ptr<gdb_byte[]> buf (new gdb_byte[iovec.iov_len]);
-  iovec.iov_base = buf.get ();
+  gdb::byte_vector sve_state (SVE_PT_SIZE (vq, SVE_PT_REGS_SVE), 0);
+
+  struct iovec iovec;
+  iovec.iov_base = sve_state.data ();
+  iovec.iov_len = sve_state.size ();
 
   if (ptrace (PTRACE_GETREGSET, tid, NT_ARM_SVE, &iovec) < 0)
     perror_with_name (_("Unable to fetch SVE registers"));
 
-  return buf;
+  return sve_state;
+}
+
+/* See nat/aarch64-scalable-linux-ptrace.h.  */
+
+bool
+aarch64_store_sve_regset (int tid, const gdb::byte_vector &sve_state)
+{
+  struct iovec iovec;
+  iovec.iov_base = (void *) sve_state.data ();
+  iovec.iov_len = sve_state.size ();
+
+  if (ptrace (PTRACE_SETREGSET, tid, NT_ARM_SVE, &iovec) < 0)
+    perror_with_name (_("Unable to store SVE registers"));
+
+  return true;
 }
 
 /* If we are running in BE mode, byteswap the contents
@@ -165,11 +181,13 @@  aarch64_maybe_swab128 (gdb_byte *dst, const gdb_byte *src, size_t size)
 /* See nat/aarch64-scalable-linux-ptrace.h.  */
 
 void
-aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
-				  const void *buf)
+aarch64_sve_regs_copy_to_reg_buf (int tid, struct reg_buffer_common *reg_buf)
 {
-  char *base = (char *) buf;
-  struct user_sve_header *header = (struct user_sve_header *) buf;
+  gdb::byte_vector sve_state = aarch64_fetch_sve_regset (tid);
+
+  char *base = (char *) sve_state.data ();
+  struct user_sve_header *header
+    = (struct user_sve_header *) sve_state.data ();
 
   uint64_t vq = sve_vq_from_vl (header->vl);
   uint64_t vg = sve_vg_from_vl (header->vl);
@@ -249,18 +267,33 @@  aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
 
       reg_buf->raw_supply (AARCH64_SVE_FFR_REGNUM, reg);
     }
+
+  /* At this point we have updated the register cache with the contents of
+     the NT_ARM_SVE register set.  */
 }
 
 /* See nat/aarch64-scalable-linux-ptrace.h.  */
 
 void
-aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
-				    void *buf)
+aarch64_sve_regs_copy_from_reg_buf (int tid,
+				    struct reg_buffer_common *reg_buf)
 {
-  struct user_sve_header *header = (struct user_sve_header *) buf;
-  char *base = (char *) buf;
+  /* First store the vector length to the thread.  This is done first to
+     ensure the ptrace buffers read from the kernel are the correct size.  */
+  if (!aarch64_sve_set_vq (tid, reg_buf))
+    perror_with_name (_("Unable to set VG register"));
+
+  /* Obtain a dump of SVE registers from ptrace.  */
+  gdb::byte_vector sve_state = aarch64_fetch_sve_regset (tid);
+
+  struct user_sve_header *header = (struct user_sve_header *) sve_state.data ();
   uint64_t vq = sve_vq_from_vl (header->vl);
 
+  gdb::byte_vector new_state (SVE_PT_SIZE (32, SVE_PT_REGS_SVE), 0);
+  memcpy (new_state.data (), sve_state.data (), sve_state.size ());
+  header = (struct user_sve_header *) new_state.data ();
+  char *base = (char *) new_state.data ();
+
   /* Sanity check the data in the header.  */
   if (!sve_vl_valid (header->vl)
       || SVE_PT_SIZE (vq, header->flags) != header->size)
@@ -275,36 +308,40 @@  aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
 	 resulting in the initialization of SVE state written back to the
 	 kernel, which is why we try to avoid it.  */
 
-      bool has_sve_state = false;
-      gdb_byte *reg = (gdb_byte *) alloca (SVE_PT_SVE_ZREG_SIZE (vq));
-      struct user_fpsimd_state *fpsimd
-	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
-
-      memset (reg, 0, SVE_PT_SVE_ZREG_SIZE (vq));
+      /* Buffer (using the maximum size a Z register) used to look for zeroed
+	 out sve state.  */
+      gdb_byte reg[256];
+      memset (reg, 0, sizeof (reg));
 
       /* Check in the reg_buf if any of the Z registers are set after the
 	 first 128 bits, or if any of the other SVE registers are set.  */
-
+      bool has_sve_state = false;
       for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
 	{
-	  has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_Z0_REGNUM + i,
-						 reg, sizeof (__int128_t));
-	  if (has_sve_state)
-	    break;
+	  if (!reg_buf->raw_compare (AARCH64_SVE_Z0_REGNUM + i, reg,
+				     V_REGISTER_SIZE))
+	    {
+	      has_sve_state = true;
+	      break;
+	    }
 	}
 
       if (!has_sve_state)
 	for (int i = 0; i < AARCH64_SVE_P_REGS_NUM; i++)
 	  {
-	    has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_P0_REGNUM + i,
-						   reg, 0);
-	    if (has_sve_state)
-	      break;
+	    if (!reg_buf->raw_compare (AARCH64_SVE_P0_REGNUM + i, reg, 0))
+	      {
+		has_sve_state = true;
+		break;
+	      }
 	  }
 
       if (!has_sve_state)
-	  has_sve_state |= reg_buf->raw_compare (AARCH64_SVE_FFR_REGNUM,
-						 reg, 0);
+	  has_sve_state
+	    = !reg_buf->raw_compare (AARCH64_SVE_FFR_REGNUM, reg, 0);
+
+      struct user_fpsimd_state *fpsimd
+	= (struct user_fpsimd_state *)(base + SVE_PT_FPSIMD_OFFSET);
 
       /* If no SVE state exists, then use the existing fpsimd structure to
 	 write out state and return.  */
@@ -344,50 +381,74 @@  aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
 	  if (REG_VALID == reg_buf->get_register_status (AARCH64_FPCR_REGNUM))
 	    reg_buf->raw_collect (AARCH64_FPCR_REGNUM, &fpsimd->fpcr);
 
-	  return;
-	}
-
-      /* Otherwise, reformat the fpsimd structure into a full SVE set, by
-	 expanding the V registers (working backwards so we don't splat
-	 registers before they are copied) and using null for everything else.
-	 Note that enough space for a full SVE dump was originally allocated
-	 for base.  */
+	  /* At this point we have collected all the data from the register
+	     cache and we are ready to update the FPSIMD register content
+	     of the thread.  */
 
-      header->flags |= SVE_PT_REGS_SVE;
-      header->size = SVE_PT_SIZE (vq, SVE_PT_REGS_SVE);
+	  /* Fall through so we can update the thread's contents with the
+	     FPSIMD register cache values.  */
+	}
+      else
+	{
+	  /* Otherwise, reformat the fpsimd structure into a full SVE set, by
+	     expanding the V registers (working backwards so we don't splat
+	     registers before they are copied) and using zero for everything
+	     else.
+	     Note that enough space for a full SVE dump was originally allocated
+	     for base.  */
+
+	  header->flags |= SVE_PT_REGS_SVE;
+	  header->size = SVE_PT_SIZE (vq, SVE_PT_REGS_SVE);
+
+	  memcpy (base + SVE_PT_SVE_FPSR_OFFSET (vq), &fpsimd->fpsr,
+		  sizeof (uint32_t));
+	  memcpy (base + SVE_PT_SVE_FPCR_OFFSET (vq), &fpsimd->fpcr,
+		  sizeof (uint32_t));
+
+	  for (int i = AARCH64_SVE_Z_REGS_NUM; i >= 0 ; i--)
+	    {
+	      memcpy (base + SVE_PT_SVE_ZREG_OFFSET (vq, i), &fpsimd->vregs[i],
+		      sizeof (__int128_t));
+	    }
 
-      memcpy (base + SVE_PT_SVE_FPSR_OFFSET (vq), &fpsimd->fpsr,
-	      sizeof (uint32_t));
-      memcpy (base + SVE_PT_SVE_FPCR_OFFSET (vq), &fpsimd->fpcr,
-	      sizeof (uint32_t));
+	  /* At this point we have converted the FPSIMD layout to an SVE
+	     layout and copied the register data.
 
-      for (int i = AARCH64_SVE_Z_REGS_NUM; i >= 0 ; i--)
-	{
-	  memcpy (base + SVE_PT_SVE_ZREG_OFFSET (vq, i), &fpsimd->vregs[i],
-		  sizeof (__int128_t));
+	     Fall through so we can update the thread's contents with the SVE
+	     register cache values.  */
 	}
     }
+  else
+    {
+      /* We already have SVE state for this thread, so we just need to update
+	 the values of the registers.  */
+      for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
+	if (REG_VALID == reg_buf->get_register_status (AARCH64_SVE_Z0_REGNUM
+						       + i))
+	  reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i,
+				base + SVE_PT_SVE_ZREG_OFFSET (vq, i));
 
-  /* Replace the kernel values with those from reg_buf.  */
-
-  for (int i = 0; i < AARCH64_SVE_Z_REGS_NUM; i++)
-    if (REG_VALID == reg_buf->get_register_status (AARCH64_SVE_Z0_REGNUM + i))
-      reg_buf->raw_collect (AARCH64_SVE_Z0_REGNUM + i,
-			    base + SVE_PT_SVE_ZREG_OFFSET (vq, i));
-
-  for (int i = 0; i < AARCH64_SVE_P_REGS_NUM; i++)
-    if (REG_VALID == reg_buf->get_register_status (AARCH64_SVE_P0_REGNUM + i))
-      reg_buf->raw_collect (AARCH64_SVE_P0_REGNUM + i,
-			    base + SVE_PT_SVE_PREG_OFFSET (vq, i));
+      for (int i = 0; i < AARCH64_SVE_P_REGS_NUM; i++)
+	if (REG_VALID == reg_buf->get_register_status (AARCH64_SVE_P0_REGNUM
+						       + i))
+	  reg_buf->raw_collect (AARCH64_SVE_P0_REGNUM + i,
+				base + SVE_PT_SVE_PREG_OFFSET (vq, i));
+
+      if (REG_VALID == reg_buf->get_register_status (AARCH64_SVE_FFR_REGNUM))
+	reg_buf->raw_collect (AARCH64_SVE_FFR_REGNUM,
+			      base + SVE_PT_SVE_FFR_OFFSET (vq));
+      if (REG_VALID == reg_buf->get_register_status (AARCH64_FPSR_REGNUM))
+	reg_buf->raw_collect (AARCH64_FPSR_REGNUM,
+			      base + SVE_PT_SVE_FPSR_OFFSET (vq));
+      if (REG_VALID == reg_buf->get_register_status (AARCH64_FPCR_REGNUM))
+	reg_buf->raw_collect (AARCH64_FPCR_REGNUM,
+			      base + SVE_PT_SVE_FPCR_OFFSET (vq));
+    }
 
-  if (REG_VALID == reg_buf->get_register_status (AARCH64_SVE_FFR_REGNUM))
-    reg_buf->raw_collect (AARCH64_SVE_FFR_REGNUM,
-			  base + SVE_PT_SVE_FFR_OFFSET (vq));
-  if (REG_VALID == reg_buf->get_register_status (AARCH64_FPSR_REGNUM))
-    reg_buf->raw_collect (AARCH64_FPSR_REGNUM,
-			  base + SVE_PT_SVE_FPSR_OFFSET (vq));
-  if (REG_VALID == reg_buf->get_register_status (AARCH64_FPCR_REGNUM))
-    reg_buf->raw_collect (AARCH64_FPCR_REGNUM,
-			  base + SVE_PT_SVE_FPCR_OFFSET (vq));
+  /* At this point we have collected all the data from the register cache and
+     we are ready to update the SVE/FPSIMD register contents of the thread.
 
+     sve_state should contain all the data in the correct format, ready to be
+     passed on to ptrace.  */
+  aarch64_store_sve_regset (tid, new_state);
 }
diff --git a/gdb/nat/aarch64-scalable-linux-ptrace.h b/gdb/nat/aarch64-scalable-linux-ptrace.h
index 2847c4e0263..3215103c88a 100644
--- a/gdb/nat/aarch64-scalable-linux-ptrace.h
+++ b/gdb/nat/aarch64-scalable-linux-ptrace.h
@@ -52,22 +52,28 @@  uint64_t aarch64_sve_get_vq (int tid);
 bool aarch64_sve_set_vq (int tid, uint64_t vq);
 bool aarch64_sve_set_vq (int tid, struct reg_buffer_common *reg_buf);
 
-/* Read the current SVE register set using ptrace, allocating space as
-   required.  */
+/* Read the current SVE register set from thread TID and return its data
+   through a byte vector.  */
 
-extern std::unique_ptr<gdb_byte[]> aarch64_sve_get_sveregs (int tid);
+extern gdb::byte_vector aarch64_fetch_sve_regset (int tid);
 
-/* Put the registers from linux structure buf into register buffer.  Assumes the
-   vector lengths in the register buffer match the size in the kernel.  */
+/* Write the SVE contents from SVE_STATE to thread TID.  Return true if
+   successful and false otherwise.  */
 
-extern void aarch64_sve_regs_copy_to_reg_buf (struct reg_buffer_common *reg_buf,
-					      const void *buf);
+extern bool
+aarch64_store_sve_regset (int tid, const gdb::byte_vector &sve_state);
 
-/* Put the registers from register buffer into linux structure buf.  Assumes the
-   vector lengths in the register buffer match the size in the kernel.  */
+/* Given a thread id TID and a register buffer REG_BUF, update the register
+   buffer with the SVE state from thread TID.  */
 
 extern void
-aarch64_sve_regs_copy_from_reg_buf (const struct reg_buffer_common *reg_buf,
-				    void *buf);
+aarch64_sve_regs_copy_to_reg_buf (int tid, struct reg_buffer_common *reg_buf);
+
+/* Given a thread id TID and a register buffer REG_BUF containing SVE
+   register data, write the SVE data to thread TID.  */
+
+extern void
+aarch64_sve_regs_copy_from_reg_buf (int tid,
+				    struct reg_buffer_common *reg_buf);
 
 #endif /* NAT_AARCH64_SCALABLE_LINUX_PTRACE_H */
diff --git a/gdbserver/linux-aarch64-low.cc b/gdbserver/linux-aarch64-low.cc
index a20a3485ebc..af2e0fd9c10 100644
--- a/gdbserver/linux-aarch64-low.cc
+++ b/gdbserver/linux-aarch64-low.cc
@@ -719,9 +719,18 @@  aarch64_target::low_new_fork (process_info *parent,
 /* Wrapper for aarch64_sve_regs_copy_to_reg_buf.  */
 
 static void
-aarch64_sve_regs_copy_to_regcache (struct regcache *regcache, const void *buf)
+aarch64_sve_regs_copy_to_regcache (struct regcache *regcache,
+				   ATTRIBUTE_UNUSED const void *buf)
 {
-  return aarch64_sve_regs_copy_to_reg_buf (regcache, buf);
+  /* BUF is unused here since we collect the data straight from a ptrace
+     request in aarch64_sve_regs_copy_to_reg_buf, therefore bypassing
+     gdbserver's own call to ptrace.  */
+
+  int tid = lwpid_of (current_thread);
+
+  /* Update the register cache.  aarch64_sve_regs_copy_to_reg_buf handles
+     fetching the NT_ARM_SVE state from thread TID.  */
+  aarch64_sve_regs_copy_to_reg_buf (tid, regcache);
 }
 
 /* Wrapper for aarch64_sve_regs_copy_from_reg_buf.  */
@@ -729,7 +738,16 @@  aarch64_sve_regs_copy_to_regcache (struct regcache *regcache, const void *buf)
 static void
 aarch64_sve_regs_copy_from_regcache (struct regcache *regcache, void *buf)
 {
-  return aarch64_sve_regs_copy_from_reg_buf (regcache, buf);
+  int tid = lwpid_of (current_thread);
+
+  /* Update the thread SVE state.  aarch64_sve_regs_copy_from_reg_buf
+     handles writing the SVE/FPSIMD state back to thread TID.  */
+  aarch64_sve_regs_copy_from_reg_buf (tid, regcache);
+
+  /* We need to return the expected data in BUF, so copy whatever the kernel
+     already has to BUF.  */
+  gdb::byte_vector sve_state = aarch64_fetch_sve_regset (tid);
+  memcpy (buf, sve_state.data (), sve_state.size ());
 }
 
 /* Array containing all the possible register sets for AArch64/Linux.  During