[PATCHv3] gdb: add support for %V to printf command

Message ID f3cac405a83da4c27506a3ad58f0966f6f8b30fd.1683896221.git.aburgess@redhat.com
State New
Series [PATCHv3] gdb: add support for %V to printf command

Commit Message

Andrew Burgess May 12, 2023, 1:01 p.m. UTC
  In v3:

  - Addressed all Eli's doc feedback.  It was pretty minor, so I don't
    think another doc review is required,

  - Rebased, and fixed a minor merge conflict in the testsuite,

  - No other code changes since v2,

  - Still need to check that Tom's query is satisfied (see V2 email):
    https://sourceware.org/pipermail/gdb-patches/2023-April/199178.html

---

This commit adds a new format for the printf and dprintf commands:
'%V'.  This new format takes any GDB expression and formats it as a
string, just as GDB would for a 'print' command, e.g.:

  (gdb) print a1
  $1 = {2, 4, 6, 8, 10, 12, 14, 16, 18, 20}
  (gdb) printf "%V\n", a1
  {2, 4, 6, 8, 10, 12, 14, 16, 18, 20}
  (gdb)

It is also possible to pass the same options to %V as you might pass
to the print command, e.g.:

  (gdb) print -elements 3 -- a1
  $4 = {2, 4, 6...}
  (gdb) printf "%V[-elements 3]\n", a1
  {2, 4, 6...}
  (gdb)
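
As a rough illustration only (this is not the code from the patch),
the way a '%V' specifier and its optional '[...]' block could be
separated from the rest of a format string can be sketched as below.
The names split_value_piece and value_piece are hypothetical; the
sketch assumes that a '[' with no closing ']' is simply left in place
as literal text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Result of splitting one "%V" specifier out of a format string.
struct value_piece
{
  std::string options;  // Text between '[' and ']', possibly empty.
  std::size_t end;      // Index of the first character after the specifier.
};

// Hypothetical helper, not part of GDB.  POS must index the '%' of a
// "%V" inside FMT.
static value_piece
split_value_piece (const std::string &fmt, std::size_t pos)
{
  // Without an options block the specifier is just the two characters "%V".
  value_piece result { "", pos + 2 };

  if (result.end < fmt.size () && fmt[result.end] == '[')
    {
      // Only consume the '[' when a matching ']' exists.
      std::size_t close = fmt.find (']', result.end);
      if (close != std::string::npos)
        {
          result.options
            = fmt.substr (result.end + 1, close - result.end - 1);
          result.end = close + 1;
        }
    }

  return result;
}
```

With this sketch, split_value_piece ("%V[-elements 3]\n", 0) yields
the options text "-elements 3", and everything from the returned end
index onwards is ordinary format text.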

This new feature would effectively replace an existing feature of GDB,
the $_as_string builtin convenience function.  However, the
$_as_string function has a few problems which this new feature solves:

1. $_as_string doesn't currently work when the inferior is not
running, e.g.:

  (gdb) printf "%s", $_as_string(a1)
  You can't do that without a process to debug.
  (gdb)

The reason for this is that $_as_string returns a value object with
string type.  When we try to print this we call value_as_address,
which ends up trying to push the string into the inferior's address
space.

Clearly we could solve this problem: the string data exists in GDB,
so there's no reason why we have to push it into the inferior.  But
this is an existing problem that would need solving.

2. $_as_string suffers from the fact that C degrades arrays to
pointers, e.g.:

  (gdb) printf "%s\n", $_as_string(a1)
  0x404260 <a1>
  (gdb)

The implementation of $_as_string is passed a gdb.Value object that is
a pointer, it doesn't understand that it's actually an array.  Solving
this would be harder than issue #1 I think.  The whole array to
pointer transformation is part of our expression evaluation.  And in
most cases this is exactly what we want.  It's not clear to me how
we'd (easily) tell GDB that we didn't want this reduction in _some_
cases.  But I'm sure this is solvable if we really wanted to.

3. $_as_string is a gdb.Function sub-class, and as such is passed
gdb.Value objects.  There's no super convenient way to pass formatting
options to $_as_string.  By this I mean that the new %V feature
supports print formatting options.  Ideally, we might want to add this
feature to $_as_string, we might imagine it working something like:

  (gdb) printf "%s\n", $_as_string(a1,
                                   elements = 3,
                                   array_indexes = True)

where the first item is the value to print, while the remaining
options are the print formatting options.  However, this relies on
Python calling syntax, which isn't something that convenience
functions handle.  We could possibly rely on strictly positional
arguments, like:

  (gdb) printf "%s\n", $_as_string(a1, 3, 1)

But that's clearly terrible as there are far more print formatting
options, and if you needed to set the 9th option you'd need to fill in
all the previous options.

And right now, the only way to pass these options to a gdb.Function is
to have GDB first convert them all into gdb.Value objects, which is
really overkill for what we want.

The new %V format solves all these problems: the string is computed
and printed entirely on the GDB side, we are able to print arrays as
actual arrays rather than pointers, and we can pass named format
arguments.

Finally, $_as_string is sold in the manual as allowing users to
print the string representation of flag enums, so given:

  enum flags
    {
      FLAG_A = (1 << 0),
      FLAG_B = (1 << 1),
      FLAG_C = (1 << 2)
    };

  enum flags ff = FLAG_B;

We can:

  (gdb) printf "%s\n", $_as_string(ff)
  FLAG_B

This works just fine with %V too:

  (gdb) printf "%V\n", ff
  FLAG_B

So all functionality of $_as_string is replaced by %V.  I'm not
proposing to remove $_as_string, there might be users currently
depending on it, but I am proposing that we don't push $_as_string in
the documentation.

As %V is a feature of printf, GDB's dprintf breakpoints naturally gain
access to this feature too.  dprintf breakpoints can be operated in
three different styles: 'gdb' (use GDB's printf), 'call' (call a
function in the inferior), or 'agent' (perform the dprintf on the
remote).

The use of '%V' will work just fine when dprintf-style is 'gdb'.

When dprintf-style is 'call' the format string and arguments are
passed to an inferior function (printf by default).  In this case GDB
doesn't prevent use of '%V', but the documentation makes it clear that
support for '%V' will depend on the inferior function being called.

I chose this approach because the current implementation doesn't place
any restrictions on the format string when operating in 'call' style.
That is, the user might already be calling a function that supports
custom print format specifiers (maybe including '%V') so, I claim, it
would be wrong to block use of '%V' in this case.  The documentation
does make it clear that users shouldn't expect this to "just work"
though.

When dprintf-style is 'agent' then GDB does not support the use of
'%V' (right now).  This is handled at the point when GDB tries to
process the format string and send the dprintf command to the remote,
here's an example:

  Reading symbols from /tmp/hello.x...
  (gdb) dprintf call_me, "%V", a1
  Dprintf 1 at 0x401152: file /tmp/hello.c, line 8.
  (gdb) set sysroot /
  (gdb) target remote | gdbserver --once - /tmp/hello.x
  Remote debugging using | gdbserver --once - /tmp/hello.x
  stdin/stdout redirected
  Process /tmp/hello.x created; pid = 3088822
  Remote debugging using stdio
  Reading symbols from /lib64/ld-linux-x86-64.so.2...
  (No debugging symbols found in /lib64/ld-linux-x86-64.so.2)
  0x00007ffff7fd3110 in _start () from /lib64/ld-linux-x86-64.so.2
  (gdb) set dprintf-style agent
  (gdb) c
  Continuing.
  Unrecognized format specifier 'V' in printf
  Command aborted.
  (gdb)

This is exactly how GDB would handle any other invalid format
specifier, for example:

  Reading symbols from /tmp/hello.x...
  (gdb) dprintf call_me, "%Q", a1
  Dprintf 1 at 0x401152: file /tmp/hello.c, line 8.
  (gdb) set sysroot /
  (gdb) target remote | gdbserver --once - /tmp/hello.x
  Remote debugging using | gdbserver --once - /tmp/hello.x
  stdin/stdout redirected
  Process /tmp/hello.x created; pid = 3089193
  Remote debugging using stdio
  Reading symbols from /lib64/ld-linux-x86-64.so.2...
  (No debugging symbols found in /lib64/ld-linux-x86-64.so.2)
  0x00007ffff7fd3110 in _start () from /lib64/ld-linux-x86-64.so.2
  (gdb) set dprintf-style agent
  (gdb) c
  Continuing.
  Unrecognized format specifier 'Q' in printf
  Command aborted.
  (gdb)

The error message isn't the greatest, but improving that can be put
off for another day I hope.
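
To summarise the three dprintf-style cases described above, here is a
small sketch.  It is purely illustrative: the enum and function names
are hypothetical and nothing like this exists in the patch.

```cpp
#include <cassert>

// The three dprintf styles named in the commit message.
enum class dprintf_style { gdb, call, agent };

// How '%V' is treated under each style.
enum class value_format_support
{
  supported,          // GDB's own printf handles '%V'.
  depends_on_callee,  // Passed through; the inferior function decides.
  rejected            // Fails with "Unrecognized format specifier".
};

// Hypothetical helper mapping a style to the behaviour described above.
static value_format_support
value_format_support_for (dprintf_style style)
{
  switch (style)
    {
    case dprintf_style::gdb:
      return value_format_support::supported;
    case dprintf_style::call:
      return value_format_support::depends_on_callee;
    case dprintf_style::agent:
      return value_format_support::rejected;
    }

  // Not reached for valid enumerators; be conservative otherwise.
  return value_format_support::rejected;
}
```

The 'call' case is deliberately not an error: GDB passes the format
string through unchanged, so whether '%V' works is up to the function
being called.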

Reviewed-By: Eli Zaretskii <eliz@gnu.org>
---
 gdb/NEWS                             | 11 ++++++
 gdb/doc/gdb.texinfo                  | 55 +++++++++++++++++++++++++---
 gdb/printcmd.c                       | 30 ++++++++++++++-
 gdb/testsuite/gdb.base/printcmds.c   | 13 +++++++
 gdb/testsuite/gdb.base/printcmds.exp | 28 +++++++++++++-
 gdbsupport/format.cc                 | 26 ++++++++++++-
 gdbsupport/format.h                  |  6 ++-
 7 files changed, 158 insertions(+), 11 deletions(-)


base-commit: a02fcd08ddc5080696248ed7fb4bf50a24763431
  

Comments

Simon Marchi May 12, 2023, 1:47 p.m. UTC | #1
On 5/12/23 09:01, Andrew Burgess via Gdb-patches wrote:
> In v3:
> 
>   - Addressed all Eli's doc feedback.  It was pretty minor, so I don't
>     think another doc review is required,
> 
>   - Rebased, and fixed a minor merge conflict in the testsuite,
> 
>   - No other code changes since v2,
> 
>   - Still need to check that Tom's query is satisfied (see V2 email):
>     https://sourceware.org/pipermail/gdb-patches/2023-April/199178.html

I've skimmed the patch, the idea looks fine to me.  Not sure if we typically
use Acked-By, but I think it's pretty standard:

Acked-By: Simon Marchi <simon.marchi@efficios.com>

Simon
  
Tom Tromey May 26, 2023, 8:44 p.m. UTC | #2
>>>>> "Andrew" == Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:

Andrew>   - Still need to check that Tom's query is satisfied (see V2 email):
Andrew>     https://sourceware.org/pipermail/gdb-patches/2023-April/199178.html

It is.  Thank you for doing this.

Also, it turns out this will be handy for DAP, to implement the
logMessage support.

Tom
  
Andrew Burgess May 30, 2023, 8:56 p.m. UTC | #3
Tom Tromey <tom@tromey.com> writes:

>>>>>> "Andrew" == Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:
>
> Andrew>   - Still need to check that Tom's query is satisfied (see V2 email):
> Andrew>     https://sourceware.org/pipermail/gdb-patches/2023-April/199178.html
>
> It is.  Thank you for doing this.

Excellent.  I went ahead and pushed this.

>
> Also, it turns out this will be handy for DAP, to implement the
> logMessage support.

Glad this helps!

Thanks,
Andrew
  

Patch

diff --git a/gdb/NEWS b/gdb/NEWS
index 6aa0d5171f2..d60a1a3fc01 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -67,6 +67,17 @@ 
     break foo thread 1 task 1
     watch var thread 2 task 3
 
+* The printf command now accepts a '%V' output format which will
+  format an expression just as the 'print' command would.  Print
+  options can be placed within '[...]' after the '%V' to modify how
+  the value is printed.  E.g.:
+    printf "%V", some_array
+    printf "%V[-array-indexes on]", some_array
+  will print the array without, or with array indexes included, just
+  as the array would be printed by the 'print' command.  This
+  functionality is also available for dprintf when dprintf-style is
+  'gdb'.
+
 * New commands
 
 maintenance print record-instruction [ N ]
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 531147f6e6b..e2a33d9acd7 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -5962,18 +5962,29 @@ 
 @table @code
 @item gdb
 @kindex dprintf-style gdb
-Handle the output using the @value{GDBN} @code{printf} command.
+Handle the output using the @value{GDBN} @code{printf} command.  When
+using this style, it is possible to use the @samp{%V} format specifier
+(@pxref{%V Format Specifier}).
 
 @item call
 @kindex dprintf-style call
 Handle the output by calling a function in your program (normally
-@code{printf}).
+@code{printf}).  When using this style the supported format specifiers
+depend entirely on the function being called.
+
+Most of @value{GDBN}'s format specifiers align with those supported by
+the @code{printf} function, however, @value{GDBN}'s @samp{%V} format
+specifier extension is not supported by @code{printf}.  When using
+@samp{call} style dprintf, care should be taken to ensure that only
+format specifiers supported by the output function are used, otherwise
+the results will be undefined.
 
 @item agent
 @kindex dprintf-style agent
-Have the remote debugging agent (such as @code{gdbserver}) handle
-the output itself.  This style is only available for agents that
-support running commands on the target.
+Have the remote debugging agent (such as @code{gdbserver}) handle the
+output itself.  This style is only available for agents that support
+running commands on the target.  This style does not support the
+@samp{%V} format specifier.
 @end table
 
 @item set dprintf-function @var{function}
@@ -13141,6 +13152,10 @@ 
 
 @findex $_as_string@r{, convenience function}
 @item $_as_string(@var{value})
+This convenience function is considered deprecated, and could be
+removed from future versions of @value{GDBN}.  Use the @samp{%V} format
+specifier instead (@pxref{%V Format Specifier}).
+
 Return the string representation of @var{value}.
 
 This function is useful to obtain the textual label (enumerator) of an
@@ -29049,6 +29064,36 @@ 
 printf "D32: %Hf - D64: %Df - D128: %DDf\n",1.2345df,1.2E10dd,1.2E1dl
 @end smallexample
 
+@anchor{%V Format Specifier}
+Additionally, @code{printf} supports a special @samp{%V} output format.
+This format prints the string representation of an expression just as
+@value{GDBN} would produce with the standard @kbd{print} command
+(@pxref{Data, ,Examining Data}):
+
+@smallexample
+(@value{GDBP}) print array
+$1 = @{0, 1, 2, 3, 4, 5@}
+(@value{GDBP}) printf "Array is: %V\n", array
+Array is: @{0, 1, 2, 3, 4, 5@}
+@end smallexample
+
+It is possible to include print options with the @samp{%V} format by
+placing them in @samp{[...]} immediately after the @samp{%V}, like
+this:
+
+@smallexample
+(@value{GDBP}) printf "Array is: %V[-array-indexes on]\n", array
+Array is: @{[0] = 0, [1] = 1, [2] = 2, [3] = 3, [4] = 4, [5] = 5@}
+@end smallexample
+
+If you need to print a literal @samp{[} directly after a @samp{%V}, then
+just include an empty print options list:
+
+@smallexample
+(@value{GDBP}) printf "Array is: %V[][Hello]\n", array
+Array is: @{0, 1, 2, 3, 4, 5@}[Hello]
+@end smallexample
+
 @anchor{eval}
 @kindex eval
 @item eval @var{template}, @var{expressions}@dots{}
diff --git a/gdb/printcmd.c b/gdb/printcmd.c
index 679a24e665a..68c561019de 100644
--- a/gdb/printcmd.c
+++ b/gdb/printcmd.c
@@ -2732,7 +2732,7 @@  ui_printf (const char *arg, struct ui_file *stream)
   if (*s++ != '"')
     error (_("Bad format string, missing '\"'."));
 
-  format_pieces fpieces (&s);
+  format_pieces fpieces (&s, false, true);
 
   if (*s++ != '"')
     error (_("Bad format string, non-terminated '\"'."));
@@ -2874,6 +2874,34 @@  ui_printf (const char *arg, struct ui_file *stream)
 	  case ptr_arg:
 	    printf_pointer (stream, current_substring, val_args[i]);
 	    break;
+	  case value_arg:
+	    {
+	      value_print_options print_opts;
+	      get_user_print_options (&print_opts);
+
+	      if (current_substring[2] == '[')
+		{
+		  std::string args (&current_substring[3],
+				    strlen (&current_substring[3]) - 1);
+
+		  const char *args_ptr = args.c_str ();
+
+		  /* Override global settings with explicit options, if
+		     any.  */
+		  auto group
+		    = make_value_print_options_def_group (&print_opts);
+		  gdb::option::process_options
+		    (&args_ptr, gdb::option::PROCESS_OPTIONS_UNKNOWN_IS_ERROR,
+		     group);
+
+		  if (*args_ptr != '\0')
+		    error (_("unexpected content in print options: %s"),
+			     args_ptr);
+		}
+
+	      print_formatted (val_args[i], 0, &print_opts, stream);
+	    }
+	    break;
 	  case literal_piece:
 	    /* Print a portion of the format string that has no
 	       directives.  Note that this will not include any
diff --git a/gdb/testsuite/gdb.base/printcmds.c b/gdb/testsuite/gdb.base/printcmds.c
index 78291a2803c..fa3a62d6cdd 100644
--- a/gdb/testsuite/gdb.base/printcmds.c
+++ b/gdb/testsuite/gdb.base/printcmds.c
@@ -108,6 +108,7 @@  enum flag_enum
   FE_TWO_LEGACY = 0x02,
 };
 
+enum flag_enum one = FE_ONE;
 enum flag_enum three = (enum flag_enum) (FE_ONE | FE_TWO);
 
 /* Another enum considered as a "flag enum", but with no enumerator with value
@@ -152,6 +153,18 @@  struct some_struct
   }
 };
 
+/* This is used in the printf test.  */
+struct small_struct
+{
+  int a;
+  int b;
+  int c;
+} a_small_struct = {
+  1,
+  2,
+  3
+};
+
 /* The following variables are used for testing byte repeat sequences.
    The variable names are encoded: invalid_XYZ where:
    X = start
diff --git a/gdb/testsuite/gdb.base/printcmds.exp b/gdb/testsuite/gdb.base/printcmds.exp
index db57769c303..dcc6a3f85d2 100644
--- a/gdb/testsuite/gdb.base/printcmds.exp
+++ b/gdb/testsuite/gdb.base/printcmds.exp
@@ -981,6 +981,32 @@  proc test_printf_with_strings {} {
     }
 }
 
+# Test the printf '%V' format.
+proc test_printf_V_format {} {
+    # Enums.
+    gdb_test {printf "%V\n", one} "FE_ONE"
+    gdb_test {printf "%V\n", three} "\\(FE_ONE \\| FE_TWO\\)"
+    gdb_test {printf "%V\n", flag_enum_without_zero} "0"
+    gdb_test {printf "%V\n", three_not_flag} "3"
+
+    # Arrays.
+    gdb_test {printf "%V\n", a1} "\\{2, 4, 6, 8, 10, 12, 14, 16, 18, 20\\}"
+    gdb_test {printf "%V[]\n", a1} "\\{2, 4, 6, 8, 10, 12, 14, 16, 18, 20\\}"
+    gdb_test {printf "%V[][]\n", a1} \
+	"\\{2, 4, 6, 8, 10, 12, 14, 16, 18, 20\\}\\\[\\\]"
+    gdb_test {printf "%V[-elements 3]\n", a1} "\\{2, 4, 6\\.\\.\\.\\}"
+    gdb_test {printf "%V[-elements 3][]\n", a1} \
+	"\\{2, 4, 6\\.\\.\\.\\}\\\[\\\]"
+    gdb_test {printf "%V[-elements 3 -array-indexes on]\n", a1} \
+	"\\{\\\[0\\\] = 2, \\\[1\\\] = 4, \\\[2\\\] = 6\\.\\.\\.\\}"
+
+    # Structures.
+    gdb_test {printf "%V\n", a_small_struct} \
+	"\\{a = 1, b = 2, c = 3\\}"
+    gdb_test {printf "%V[-pretty on]\n", a_small_struct} \
+	"\\{\r\n  a = 1,\r\n  b = 2,\r\n  c = 3\r\n\\}"
+}
+
 proc test_print_symbol {} {
     gdb_test_no_output "set print symbol on"
 
@@ -1110,7 +1136,6 @@  proc test_printf_convenience_var {prefix} {
     }
 }
 
-
 clean_restart
 
 gdb_test "print \$pc" "No registers\\."
@@ -1196,6 +1221,7 @@  test_print_enums
 test_printf
 test_printf_with_dfp
 test_printf_with_strings
+test_printf_V_format
 test_print_symbol
 test_repeat_bytes
 test_radices
diff --git a/gdbsupport/format.cc b/gdbsupport/format.cc
index 19f37ec8e0c..6e5a3cb6603 100644
--- a/gdbsupport/format.cc
+++ b/gdbsupport/format.cc
@@ -20,7 +20,8 @@ 
 #include "common-defs.h"
 #include "format.h"
 
-format_pieces::format_pieces (const char **arg, bool gdb_extensions)
+format_pieces::format_pieces (const char **arg, bool gdb_extensions,
+			      bool value_extension)
 {
   const char *s;
   const char *string;
@@ -44,7 +45,7 @@  format_pieces::format_pieces (const char **arg, bool gdb_extensions)
       char *f = (char *) alloca (strlen (s) + 1);
       string = f;
 
-      while ((gdb_extensions || *s != '"') && *s != '\0')
+      while (*s != '"' && *s != '\0')
 	{
 	  int c = *s++;
 	  switch (c)
@@ -340,6 +341,27 @@  format_pieces::format_pieces (const char **arg, bool gdb_extensions)
 	      bad = 1;
 	    break;
 
+	  case 'V':
+	    if (!value_extension)
+	      error (_("Unrecognized format specifier '%c' in printf"), *f);
+
+	    if (lcount > 1 || seen_h || seen_big_h || seen_big_h
+		|| seen_big_d || seen_double_big_d || seen_size_t
+		|| seen_prec || seen_zero || seen_space || seen_plus)
+	      bad = 1;
+
+	    this_argclass = value_arg;
+
+	    if (f[1] == '[')
+	      {
+		/* Move F forward to the next ']' character if such a
+		   character exists, otherwise leave F unchanged.  */
+		const char *tmp = strchr (f, ']');
+		if (tmp != nullptr)
+		  f = tmp;
+	      }
+	    break;
+
 	  case '*':
 	    error (_("`*' not supported for precision or width in printf"));
 
diff --git a/gdbsupport/format.h b/gdbsupport/format.h
index 342b473c3ed..2af34ab9450 100644
--- a/gdbsupport/format.h
+++ b/gdbsupport/format.h
@@ -41,7 +41,8 @@  enum argclass
     int_arg, long_arg, long_long_arg, size_t_arg, ptr_arg,
     string_arg, wide_string_arg, wide_char_arg,
     double_arg, long_double_arg,
-    dec32float_arg, dec64float_arg, dec128float_arg
+    dec32float_arg, dec64float_arg, dec128float_arg,
+    value_arg
   };
 
 /* A format piece is a section of the format string that may include a
@@ -75,7 +76,8 @@  class format_pieces
 {
 public:
 
-  format_pieces (const char **arg, bool gdb_extensions = false);
+  format_pieces (const char **arg, bool gdb_extensions = false,
+		 bool value_extension = false);
   ~format_pieces () = default;
 
   DISABLE_COPY_AND_ASSIGN (format_pieces);