[26/26] gdbserver: allow gradually populating and selectively storing a regcache

Message ID 65c2083374c880b4802a8453dea62b23e2c0dda6.1677582745.git.tankut.baris.aktemur@intel.com
State New
Headers
Series gdbserver: refactor regcache and allow gradually populating |

Commit Message

Aktemur, Tankut Baris Feb. 28, 2023, 11:28 a.m. UTC
  Currently, the regcache can be used by fetching all the registers from
the target.  For some targets, this could be a costly operation
because there is a large number of threads with a large register file
each.  In this patch, we revise the regcache implementation to allow
populating the contents gradually and storing the registers only when
they have updated values.

To this aim, we introduce a new register status: REG_DIRTY.  This
status denotes that a register value has been cached and also updated.
When invalidating the cache, only the dirty registers are stored to
the target.  In a typical debug session, it is more likely that only a
small subset of the register file has changed.  Therefore, selectively
storing the registers on targets with many threads and registers can
save substantial costs, with respect to storing the whole set.

The collect_register function now performs a lazy fetch.  If the
requested register value is not cached yet, it is requested from the
target.

The supply_register function updates the status of the supplied
register as follows: if the register was not available in the cache,
its status becomes REG_VALID, denoting that the value is now cached.
If the register is supplied again, it becomes REG_DIRTY.

The function that supply the whole register set (supply_regblock and
registers_from_string) are also updated to compare the present and new
values of each register, so that we can track the register statuses
(i.e.  dirty or not) properly.

Regression-tested on an X86_64 Linux target using the native-gdbserver
and native-extended-gdbserver board files.
---
 gdbserver/regcache.cc        | 96 ++++++++++++++++++++++++++++++------
 gdbserver/regcache.h         |  6 +++
 gdbsupport/common-regcache.h |  3 ++
 3 files changed, 91 insertions(+), 14 deletions(-)
  

Comments

Simon Marchi Dec. 22, 2023, 4:25 p.m. UTC | #1
On 2023-02-28 06:28, Tankut Baris Aktemur via Gdb-patches wrote:
> Currently, the regcache can be used by fetching all the registers from
> the target.  For some targets, this could be a costly operation
> because there is a large number of threads with a large register file
> each.  In this patch, we revise the regcache implementation to allow
> populating the contents gradually and storing the registers only when
> they have updated values.

When reading the subject (gradually populating) and the commit message
(especially the paragraph above), I expected that the patch would make
fetching registers from the target (and filling up the regcache) more
lazy.  This is the picture I had in mind: the regcache would start with
all REG_UNKNOWN statuses.  When a caller asks for a given register
value, we would fetch that register value from the target, and its
status would become REG_VALID.

However, I see that regcache::fetch still asks the target to fetch all
registers from the start.  Is my understanding of what you're trying to
achieve wrong?

> 
> To this aim, we introduce a new register status: REG_DIRTY.  This
> status denotes that a register value has been cached and also updated.
> When invalidating the cache, only the dirty registers are stored to
> the target.  In a typical debug session, it is more likely that only a
> small subset of the register file has changed.  Therefore, selectively
> storing the registers on targets with many threads and registers can
> save substantial costs, with respect to storing the whole set.

Just curious, can you share some real world experience about those cost
savings?  I'm guessing we're talking about GPU registers here?

> The collect_register function now performs a lazy fetch.  If the
> requested register value is not cached yet, it is requested from the
> target.
> 
> The supply_register function updates the status of the supplied
> register as follows: if the register was not available in the cache,
> its status becomes REG_VALID, denoting that the value is now cached.
> If the register is supplied again, it becomes REG_DIRTY.

I don't understand the logic here.  If a register is in the REG_UNKNOWN
state and I supply a value through regcache->raw_supply, I think it
should go to the REG_DIRTY status.  We don't know what the value of the
register on the target is, it is only safe to assume that it's not the
same value as what was supplied.

> 
> The function that supply the whole register set (supply_regblock and
> registers_from_string) are also updated to compare the present and new
> values of each register, so that we can track the register statuses
> (i.e.  dirty or not) properly.
> 
> Regression-tested on an X86_64 Linux target using the native-gdbserver
> and native-extended-gdbserver board files.
> ---
>  gdbserver/regcache.cc        | 96 ++++++++++++++++++++++++++++++------
>  gdbserver/regcache.h         |  6 +++
>  gdbsupport/common-regcache.h |  3 ++
>  3 files changed, 91 insertions(+), 14 deletions(-)
> 
> diff --git a/gdbserver/regcache.cc b/gdbserver/regcache.cc
> index cfb68774402..cf020985c31 100644
> --- a/gdbserver/regcache.cc
> +++ b/gdbserver/regcache.cc
> @@ -63,6 +63,18 @@ regcache::fetch ()
>        gdb_assert (this->thread != nullptr);
>        switch_to_thread (this->thread);
>  
> +      /* If there are individually-fetched dirty registers, first
> +	 store them, then fetch all.  We prefer this to doing
> +	 individual fetch for each registers, if needed, because it is
> +	 more likely that very few registers are individually-fetched
> +	 at this moment and that fetching all in one go is more
> +	 efficient than fetching each reg one by one.  */
> +      for (int i = 0; i < tdesc->reg_defs.size (); ++i)
> +	{
> +	  if (register_status[i] == REG_DIRTY)
> +	    store_inferior_registers (this, i);
> +	}
> +
>        /* Invalidate all registers, to prevent stale left-overs.  */
>        discard ();
>        fetch_inferior_registers (this, -1);
> @@ -100,12 +112,17 @@ regcache_invalidate_thread (struct thread_info *thread)
>  void
>  regcache::invalidate ()
>  {
> -  if (registers_fetched)
> +  scoped_restore_current_thread restore_thread;
> +  gdb_assert (this->thread != nullptr);
> +  switch_to_thread (this->thread);
> +
> +  /* Store dirty registers individually.  We prefer this to a
> +     store-all, because it is more likely that a small number of
> +     registers have changed.  */
> +  for (int i = 0; i < tdesc->reg_defs.size (); ++i)
>      {
> -      scoped_restore_current_thread restore_thread;
> -      gdb_assert (this->thread != nullptr);
> -      switch_to_thread (this->thread);
> -      store_inferior_registers (this, -1);
> +      if (register_status[i] == REG_DIRTY)
> +	store_inferior_registers (this, i);
>      }

I think there's a design problem here: from what I understand, it's
possible to get in a situation where all registers are REG_UNKNOWN,
except one that is REG_DIRTY.  When you "invalidate" the regcache, we'll
call store_inferior_registers for the register that is REG_DIRTY.
However, linux_process_target::fetch_registers, for instance, will write
all registers in a given regset when asked to store one register
contained in that regset.  So it will end up writing garbage data for
all the registers in that regset that were REG_UNKNOWN.

>  
>    discard ();
> @@ -231,7 +248,8 @@ regcache::registers_to_string (char *buf)
>    unsigned char *regs = registers;
>    for (int i = 0; i < tdesc->reg_defs.size (); ++i)
>      {
> -      if (register_status[i] == REG_VALID)
> +      if (register_status[i] == REG_VALID
> +	  || register_status[i] == REG_DIRTY)
>  	{
>  	  bin2hex (regs, buf, register_size (tdesc, i));
>  	  buf += register_size (tdesc, i) * 2;
> @@ -258,9 +276,12 @@ regcache::registers_from_string (const char *buf)
>        if (len > tdesc->registers_size * 2)
>  	len = tdesc->registers_size * 2;
>      }
> -  hex2bin (buf, registers, len / 2);
> -  /* All register data have been re-written.  Update the statuses.  */
> -  memset (register_status, REG_VALID, tdesc->reg_defs.size ());
> +
> +  unsigned char *new_regs =
> +    (unsigned char *) alloca (tdesc->registers_size);

I would prefer to use a gdb::byte_vector here, instead of alloca.

Simon
  
Aktemur, Tankut Baris June 28, 2024, 12:17 p.m. UTC | #2
Hi Simon,

> -----Original Message-----
> From: Simon Marchi <simark@simark.ca>
> Sent: Friday, December 22, 2023 5:25 PM
> To: Aktemur, Tankut Baris <tankut.baris.aktemur@intel.com>; gdb-patches@sourceware.org
> Subject: Re: [PATCH 26/26] gdbserver: allow gradually populating and selectively storing a
> regcache

Thank you very much for your review.  I just wanted to say that I'd like to address
and reply to your comments, but it hasn't been possible due to other tasks.  Just a
notification that I haven't forgotten this series.

Regards,
-Baris


Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Sean Fennelly, Jeffrey Schneiderman, Tiffany Doon Silva
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
  

Patch

diff --git a/gdbserver/regcache.cc b/gdbserver/regcache.cc
index cfb68774402..cf020985c31 100644
--- a/gdbserver/regcache.cc
+++ b/gdbserver/regcache.cc
@@ -63,6 +63,18 @@  regcache::fetch ()
       gdb_assert (this->thread != nullptr);
       switch_to_thread (this->thread);
 
+      /* If there are individually-fetched dirty registers, first
+	 store them, then fetch all.  We prefer this to doing
+	 individual fetch for each registers, if needed, because it is
+	 more likely that very few registers are individually-fetched
+	 at this moment and that fetching all in one go is more
+	 efficient than fetching each reg one by one.  */
+      for (int i = 0; i < tdesc->reg_defs.size (); ++i)
+	{
+	  if (register_status[i] == REG_DIRTY)
+	    store_inferior_registers (this, i);
+	}
+
       /* Invalidate all registers, to prevent stale left-overs.  */
       discard ();
       fetch_inferior_registers (this, -1);
@@ -100,12 +112,17 @@  regcache_invalidate_thread (struct thread_info *thread)
 void
 regcache::invalidate ()
 {
-  if (registers_fetched)
+  scoped_restore_current_thread restore_thread;
+  gdb_assert (this->thread != nullptr);
+  switch_to_thread (this->thread);
+
+  /* Store dirty registers individually.  We prefer this to a
+     store-all, because it is more likely that a small number of
+     registers have changed.  */
+  for (int i = 0; i < tdesc->reg_defs.size (); ++i)
     {
-      scoped_restore_current_thread restore_thread;
-      gdb_assert (this->thread != nullptr);
-      switch_to_thread (this->thread);
-      store_inferior_registers (this, -1);
+      if (register_status[i] == REG_DIRTY)
+	store_inferior_registers (this, i);
     }
 
   discard ();
@@ -231,7 +248,8 @@  regcache::registers_to_string (char *buf)
   unsigned char *regs = registers;
   for (int i = 0; i < tdesc->reg_defs.size (); ++i)
     {
-      if (register_status[i] == REG_VALID)
+      if (register_status[i] == REG_VALID
+	  || register_status[i] == REG_DIRTY)
 	{
 	  bin2hex (regs, buf, register_size (tdesc, i));
 	  buf += register_size (tdesc, i) * 2;
@@ -258,9 +276,12 @@  regcache::registers_from_string (const char *buf)
       if (len > tdesc->registers_size * 2)
 	len = tdesc->registers_size * 2;
     }
-  hex2bin (buf, registers, len / 2);
-  /* All register data have been re-written.  Update the statuses.  */
-  memset (register_status, REG_VALID, tdesc->reg_defs.size ());
+
+  unsigned char *new_regs =
+    (unsigned char *) alloca (tdesc->registers_size);
+
+  hex2bin (buf, new_regs, len / 2);
+  supply_regblock (new_regs);
 }
 
 /* See regcache.h */
@@ -350,7 +371,7 @@  regcache::raw_supply (int n, const void *buf)
     {
       memcpy (register_data (n), buf, register_size (tdesc, n));
 #ifndef IN_PROCESS_AGENT
-      set_register_status (n, REG_VALID);
+      bump_register_status (n);
 #endif
     }
   else
@@ -370,7 +391,7 @@  supply_register_zeroed (struct regcache *regcache, int n)
   memset (regcache->register_data (n), 0,
 	  register_size (regcache->tdesc, n));
 #ifndef IN_PROCESS_AGENT
-  regcache->set_register_status (n, REG_VALID);
+  regcache->bump_register_status (n);
 #endif
 }
 
@@ -392,11 +413,26 @@  regcache::supply_regblock (const void *buf)
 {
   gdb_assert (buf != nullptr);
 
-  memcpy (registers, buf, tdesc->registers_size);
 #ifndef IN_PROCESS_AGENT
-  for (int i = 0; i < tdesc->reg_defs.size (); i++)
-    set_register_status (i, REG_VALID);
+  /* First, update the statuses.  Mark dirty only those that have
+     changed.  */
+  unsigned char *regs = registers;
+  unsigned char *new_regs = (unsigned char *) buf;
+  for (int i = 0; i < tdesc->reg_defs.size (); ++i)
+    {
+      int size = register_size (tdesc, i);
+      bool first_time = (get_register_status (i) == REG_UNKNOWN);
+      bool valid = (get_register_status (i) == REG_VALID);
+
+      if (first_time
+	  || (valid && (memcmp (new_regs, regs, size) != 0)))
+	bump_register_status (i);
+
+      regs += size;
+      new_regs += size;
+    }
 #endif
+  memcpy (registers, buf, tdesc->registers_size);
 }
 
 #ifndef IN_PROCESS_AGENT
@@ -413,6 +449,15 @@  supply_register_by_name (struct regcache *regcache,
 void
 collect_register (struct regcache *regcache, int n, void *buf)
 {
+#ifndef IN_PROCESS_AGENT
+  if (regcache->get_register_status (n) == REG_UNKNOWN)
+    {
+      /* This register has not been fetched from the target, yet.
+	 Do it now.  */
+      fetch_inferior_registers (regcache, n);
+    }
+#endif
+
   regcache->raw_collect (n, buf);
 }
 
@@ -513,6 +558,29 @@  regcache::set_register_status (int regnum, enum register_status status)
 #endif
 }
 
+void
+regcache::bump_register_status (int regnum)
+{
+#ifndef IN_PROCESS_AGENT
+  if (register_status == nullptr)
+    return;
+#endif
+
+  switch (get_register_status (regnum))
+    {
+    case REG_UNKNOWN:
+      set_register_status (regnum, REG_VALID);
+      break;
+
+    case REG_VALID:
+      set_register_status (regnum, REG_DIRTY);
+      break;
+
+    default:
+      break;
+    }
+}
+
 /* See gdbsupport/common-regcache.h.  */
 
 bool
diff --git a/gdbserver/regcache.h b/gdbserver/regcache.h
index 216044889ec..132709fa71f 100644
--- a/gdbserver/regcache.h
+++ b/gdbserver/regcache.h
@@ -74,6 +74,12 @@  struct regcache : public reg_buffer_common
   /* Set the status of register REGNUM to STATUS.  */
   void set_register_status (int regnum, enum register_status status);
 
+  /* Shift the register status "one level" towards REG_DIRTY.
+     REG_UNKNOWN becomes REG_VALID;
+     REG_VALID becomes REG_DIRTY;
+     REG_DIRTY and REG_UNAVAILABLE stay the same.  */
+  void bump_register_status (int regnum);
+
   /* See gdbsupport/common-regcache.h.  */
   void raw_supply (int regnum, const void *buf) override;
 
diff --git a/gdbsupport/common-regcache.h b/gdbsupport/common-regcache.h
index bf14610f98f..e81238fe7e0 100644
--- a/gdbsupport/common-regcache.h
+++ b/gdbsupport/common-regcache.h
@@ -31,6 +31,9 @@  enum register_status : signed char
     /* The register value is valid and cached.  */
     REG_VALID = 1,
 
+    /* The register value is valid, cached, and has been changed.  */
+    REG_DIRTY = 2,
+
     /* The register value is unavailable.  E.g., we're inspecting a
        traceframe, and this register wasn't collected.  Note that this
        "unavailable" is different from saying the register does not