[PATCHv2] gdb: Rewrite argument handling for user-defined commands

Message ID 20180906232904.13286-1-andrew.burgess@embecosm.com
State New, archived
Headers

Commit Message

Andrew Burgess Sept. 6, 2018, 11:29 p.m. UTC
  Here's a new version of the quoting patch which now uses single and
double quotes for quoting arguments.

I look forward to any feedback.

Eli - I suspect that the documentation changes would need some work,
but you should probably wait to review, as I suspect this patch will
change again before it can be merged.

Thanks,
Andrew

---

This commit rewrites argument passing for user-defined commands.  The
rewrite was inspired after this mailing list thread:

    https://sourceware.org/ml/gdb-patches/2018-08/msg00391.html

The summary is that it was felt that in order to pass arguments that
include whitespace, then single or double quotes should be used for
quoting the argument.

The problem is that currently, the quotes are included in the argument
that is passed into the user-defined command, so passing the argument
"1 + 1" will currently litterally pass "1 + 1" (including the quotes)
to GDB, which is no good if what you want to do is to pass an
expression.

This commit changes how quoting works so that the quotes are NOT now
included in the argument passed.  If the user wants to include quotes,
they would now need to use nested quotes, so "\"abc\"" will pass the
argument "abc".

It is also possible to use single quotes, so '"abc"' will also pass
the argument "abc".

As currently there's no documentation for how quoting works in
user-defined commands this commit adds documentation for the new
behaviour.

The big risk with this commit is that this does change how arguments
are passed to user-defined commands, and this might causes issues for
existing users.

gdb/ChangeLog:

	* cli/cli-script.c (user_args::m_command_line): Remove.
	(user_args::m_args): Change type.
	(user_args::user_args): Rewrite to build arguments into
	std::string.
	(user_args::insert_args): Update to take account of m_args type
	change.

gdb/doc/ChangeLog:

	* gdb.texinfo (Define): Additional documentation about argument
	syntax.

gdb/testsuite/ChangeLog:

	* gdb.base/commands.exp (user_defined_command_arg_quoting): New
	proc, which is added to the list of procs to call.
	* gdb.base/run.c (global_var): Defined global.
---
 gdb/ChangeLog                       |  9 ++++
 gdb/cli/cli-script.c                | 88 ++++++++++++++++++-------------------
 gdb/doc/ChangeLog                   |  5 +++
 gdb/doc/gdb.texinfo                 | 66 +++++++++++++++++++++++++---
 gdb/testsuite/ChangeLog             |  6 +++
 gdb/testsuite/gdb.base/commands.exp | 45 +++++++++++++++++++
 gdb/testsuite/gdb.base/run.c        |  3 ++
 7 files changed, 172 insertions(+), 50 deletions(-)
  

Comments

Eli Zaretskii Sept. 7, 2018, 6:30 a.m. UTC | #1
> From: Andrew Burgess <andrew.burgess@embecosm.com>
> Cc: Philippe Waroquiers <philippe.waroquiers@skynet.be>,	Tom Tromey <tom@tromey.com>,	Andrew Burgess <andrew.burgess@embecosm.com>
> Date: Fri,  7 Sep 2018 00:29:04 +0100
> 
> Eli - I suspect that the documentation changes would need some work,
> but you should probably wait to review, as I suspect this patch will
> change again before it can be merged.

Fine with me, let me know when it's ready.

Thanks.
  
Tom Tromey Sept. 7, 2018, 8:36 p.m. UTC | #2
>>>>> "Andrew" == Andrew Burgess <andrew.burgess@embecosm.com> writes:

Andrew> This commit changes how quoting works so that the quotes are NOT now
Andrew> included in the argument passed.  If the user wants to include quotes,
Andrew> they would now need to use nested quotes, so "\"abc\"" will pass the
Andrew> argument "abc".

Andrew> It is also possible to use single quotes, so '"abc"' will also pass
Andrew> the argument "abc".

Andrew> As currently there's no documentation for how quoting works in
Andrew> user-defined commands this commit adds documentation for the new
Andrew> behaviour.

Andrew> The big risk with this commit is that this does change how arguments
Andrew> are passed to user-defined commands, and this might causes issues for
Andrew> existing users.

I think this change goes against the compatibility approach I discussed
in that earlier thread -- it changes the syntax of a command in a way
that is likely to be used in practice.

In my opinion, the documentation issue in cases like this is not a
strong argument in favor of allowing a change.  That's because users
will often resort to trial-and-error to get gdb to do what they want.
So unless the documentation is very clear, in practice, and especially
over long periods of time, behavior is locked to the implementation.

So, my own inclination is to say no to this patch, though I welcome &
will listen to other responses.

I'd accept a patch adding an option to define, though as I mentioned
earlier, in a case like that I think it's better to design something
very good rather than to try to patch things piecemeal; the latter being
how gdb ended up in this situation in the first place.

Tom
  
Philippe Waroquiers Sept. 8, 2018, 5:35 a.m. UTC | #3
On Fri, 2018-09-07 at 00:29 +0100, Andrew Burgess wrote:
> Here's a new version of the quoting patch which now uses single and
> double quotes for quoting arguments.
> 
> I look forward to any feedback.
> 
> Eli - I suspect that the documentation changes would need some work,
> but you should probably wait to review, as I suspect this patch will
> change again before it can be merged.
> 
> Thanks,
> Andrew
> 
> ---
> 
> This commit rewrites argument passing for user-defined commands.  The
> rewrite was inspired after this mailing list thread:
> 
>     https://sourceware.org/ml/gdb-patches/2018-08/msg00391.html
> 
> The summary is that it was felt that in order to pass arguments that
> include whitespace, then single or double quotes should be used for
> quoting the argument.
Tom felt that we need to support your initial suggestion (parenthesis
quoting) for 'balanced expressions', as parenthesis are used in
some other commands that are evaluating expressions.
I can understand his point of view, see
https://sourceware.org/ml/gdb-patches/2018-09/msg00007.html

> 
> The problem is that currently, the quotes are included in the argument
> that is passed into the user-defined command, so passing the argument
> "1 + 1" will currently litterally pass "1 + 1" (including the quotes)
> to GDB, which is no good if what you want to do is to pass an
> expression.
For this problem, an alternative solution is to have a new
way to expand an argument : 
  $argX expands the argument X with the quotes
  $arguX expands the argument X with the quotes.

That allows to pass a quoted argument containing spaces,
and use it in the user defined command without quotes where needed,
and with quotes where needed : if the user defined command has to call
another command (user defined or a native) that itself needs quoting,
then use $argX, else use $arguX.
In other words, how to handle a quoted arg is decided by the
user command 'developer' (similarly to some native GDB commands).

So, adder command would become
   print $argu1 + $argu2 + $argu3

See https://sourceware.org/ml/gdb-patches/2018-09/msg00005.html
for a patch (only very limited manual testing done) implementing
the arguX approach :

(gdb)     define adder
Type commands for definition of "adder".
End with a line saying just "end".
>       print $arg0 + $arg1 + $arg2
>     end
(gdb) adder '1 + 5' 2 3
No symbol table is loaded.  Use the "file" command.
(gdb)

(gdb) define adder
Redefine command "adder"? (y or n) y
Type commands for definition of "adder".
End with a line saying just "end".
>print $argu0 + $argu1 + $argu2
>end
(gdb) adder '1 + 5' 2 3             
$4 = 11
(gdb) 


> 
> This commit changes how quoting works so that the quotes are NOT now
> included in the argument passed.  If the user wants to include quotes,
> they would now need to use nested quotes, so "\"abc\"" will pass the
> argument "abc".
> 
> It is also possible to use single quotes, so '"abc"' will also pass
> the argument "abc".
> 
> As currently there's no documentation for how quoting works in
> user-defined commands this commit adds documentation for the new
> behaviour.
> 
> The big risk with this commit is that this does change how arguments
> are passed to user-defined commands, and this might causes issues for
> existing users.
Yes, that has the potential to create a lot of backward incompatibility,
which is not the case for the $arguX and/or the parenthesis approach
you suggested initially.

Philippe
  
Andrew Burgess Sept. 8, 2018, 2:33 p.m. UTC | #4
* Philippe Waroquiers <philippe.waroquiers@skynet.be> [2018-09-08 07:35:14 +0200]:

> On Fri, 2018-09-07 at 00:29 +0100, Andrew Burgess wrote:
> > Here's a new version of the quoting patch which now uses single and
> > double quotes for quoting arguments.
> > 
> > I look forward to any feedback.
> > 
> > Eli - I suspect that the documentation changes would need some work,
> > but you should probably wait to review, as I suspect this patch will
> > change again before it can be merged.
> > 
> > Thanks,
> > Andrew
> > 
> > ---
> > 
> > This commit rewrites argument passing for user-defined commands.  The
> > rewrite was inspired after this mailing list thread:
> > 
> >     https://sourceware.org/ml/gdb-patches/2018-08/msg00391.html
> > 
> > The summary is that it was felt that in order to pass arguments that
> > include whitespace, then single or double quotes should be used for
> > quoting the argument.
> Tom felt that we need to support your initial suggestion (parenthesis
> quoting) for 'balanced expressions', as parenthesis are used in
> some other commands that are evaluating expressions.
> I can understand his point of view, see
> https://sourceware.org/ml/gdb-patches/2018-09/msg00007.html
> 
> > 
> > The problem is that currently, the quotes are included in the argument
> > that is passed into the user-defined command, so passing the argument
> > "1 + 1" will currently litterally pass "1 + 1" (including the quotes)
> > to GDB, which is no good if what you want to do is to pass an
> > expression.
> For this problem, an alternative solution is to have a new
> way to expand an argument : 
>   $argX expands the argument X with the quotes
>   $arguX expands the argument X with the quotes.
> 
> That allows to pass a quoted argument containing spaces,
> and use it in the user defined command without quotes where needed,
> and with quotes where needed : if the user defined command has to call
> another command (user defined or a native) that itself needs quoting,
> then use $argX, else use $arguX.
> In other words, how to handle a quoted arg is decided by the
> user command 'developer' (similarly to some native GDB commands).
> 
> So, adder command would become
>    print $argu1 + $argu2 + $argu3
> 
> See https://sourceware.org/ml/gdb-patches/2018-09/msg00005.html
> for a patch (only very limited manual testing done) implementing
> the arguX approach :

Great.  OK, lets go with this approach then.  Tom, do you feel that we
need more is required above and beyond adding argX / arguX as Philippe
has suggested?

Thanks,
Andrew


> 
> (gdb)     define adder
> Type commands for definition of "adder".
> End with a line saying just "end".
> >       print $arg0 + $arg1 + $arg2
> >     end
> (gdb) adder '1 + 5' 2 3
> No symbol table is loaded.  Use the "file" command.
> (gdb)
> 
> (gdb) define adder
> Redefine command "adder"? (y or n) y
> Type commands for definition of "adder".
> End with a line saying just "end".
> >print $argu0 + $argu1 + $argu2
> >end
> (gdb) adder '1 + 5' 2 3             
> $4 = 11
> (gdb) 
> 
> 
> > 
> > This commit changes how quoting works so that the quotes are NOT now
> > included in the argument passed.  If the user wants to include quotes,
> > they would now need to use nested quotes, so "\"abc\"" will pass the
> > argument "abc".
> > 
> > It is also possible to use single quotes, so '"abc"' will also pass
> > the argument "abc".
> > 
> > As currently there's no documentation for how quoting works in
> > user-defined commands this commit adds documentation for the new
> > behaviour.
> > 
> > The big risk with this commit is that this does change how arguments
> > are passed to user-defined commands, and this might causes issues for
> > existing users.
> Yes, that has the potential to create a lot of backward incompatibility,
> which is not the case for the $arguX and/or the parenthesis approach
> you suggested initially.
> 
> Philippe
> 
>
  

Patch

diff --git a/gdb/cli/cli-script.c b/gdb/cli/cli-script.c
index 8496fb85e6f..f2110331765 100644
--- a/gdb/cli/cli-script.c
+++ b/gdb/cli/cli-script.c
@@ -78,12 +78,8 @@  private:
   user_args (const user_args &) =delete;
   user_args &operator= (const user_args &) =delete;
 
-  /* It is necessary to store a copy of the command line to ensure
-     that the arguments are not overwritten before they are used.  */
-  std::string m_command_line;
-
-  /* The arguments.  Each element points inside M_COMMAND_LINE.  */
-  std::vector<gdb::string_view> m_args;
+  /* The arguments.  Parsed from the LINE passed into the constructor.  */
+  std::vector<std::string> m_args;
 };
 
 /* The stack of arguments passed to user defined functions.  We need a
@@ -749,56 +745,58 @@  user_args::user_args (const char *command_line)
   if (command_line == NULL)
     return;
 
-  m_command_line = command_line;
-  p = m_command_line.c_str ();
+  p = command_line;
 
   while (*p)
     {
-      const char *start_arg;
-      int squote = 0;
-      int dquote = 0;
-      int bsquote = 0;
+      std::string arg;
+
+      bool bquote = false;
+      bool squote = false;
+      bool dquote = false;
 
       /* Strip whitespace.  */
-      while (*p == ' ' || *p == '\t')
+      while (isspace (*p))
 	p++;
 
-      /* P now points to an argument.  */
-      start_arg = p;
-
       /* Get to the end of this argument.  */
       while (*p)
 	{
-	  if (((*p == ' ' || *p == '\t')) && !squote && !dquote && !bsquote)
-	    break;
-	  else
-	    {
-	      if (bsquote)
-		bsquote = 0;
-	      else if (*p == '\\')
-		bsquote = 1;
-	      else if (squote)
-		{
-		  if (*p == '\'')
-		    squote = 0;
-		}
-	      else if (dquote)
-		{
-		  if (*p == '"')
-		    dquote = 0;
-		}
-	      else
-		{
-		  if (*p == '\'')
-		    squote = 1;
-		  else if (*p == '"')
-		    dquote = 1;
-		}
-	      p++;
-	    }
+          /* If we find whitespace and we're not inside a single or double
+             quote then we have found the end of this argument.  */
+          if (isspace (*p) && !(squote || dquote))
+            break;
+          else if (bquote)
+            bquote = 0;
+          else
+            {
+              /* If we're inside a single quote and we find another single
+                 quote then this is the end of the argument.  */
+              if (*p == '\'' && !dquote)
+                {
+                  ++p;
+                  squote = !squote;
+                  continue;
+                }
+
+              /* If we're inside a double quote and we find another double
+                 quote then this is the end of the argument.  */
+              if (*p == '"' && !squote)
+                {
+                  ++p;
+                  dquote = !dquote;
+                  continue;
+                }
+
+              if (*p == '\\' && !squote)
+                bquote = true;
+            }
+
+          arg += *p;
+          ++p;
 	}
 
-      m_args.emplace_back (start_arg, p - start_arg);
+      m_args.emplace_back (arg);
     }
 }
 
@@ -863,7 +861,7 @@  user_args::insert_args (const char *line) const
 	    error (_("Missing argument %ld in user function."), i);
 	  else
 	    {
-	      new_line.append (m_args[i].data (), m_args[i].length ());
+	      new_line.append (m_args[i]);
 	      line = tmp;
 	    }
 	}
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index f2d1155b4db..b159df3b217 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -25245,14 +25245,70 @@ 
 
 @noindent
 This defines the command @code{adder}, which prints the sum of
-its three arguments.  Note the arguments are text substitutions, so they may
-reference variables, use complex expressions, or even perform inferior
-functions calls.
+its three arguments.
+
+The arguments to user-defined commands are text substitutions, so they
+may reference variables, use complex expressions, or even perform
+inferior functions calls.  Each argument is separated with whitespace,
+so in the previous example three arguments were passed.  The following
+example also passes three arguments, though the arguments are more
+complex:
+
+@smallexample
+adder 10+1 10+2 10+3
+@end smallexample
+
+@noindent
+However, if whitespace were added around the @code{+} characters, then
+9 arguments would be passed, @code{adder} only uses the first 3 of
+these arguments, and the others would be silently ignored, for example:
+
+@smallexample
+adder 10 + 1 10 + 2 10 + 3
+@end smallexample
+
+Causes @value{GDBN} to try and evaluate the following, which is likely
+invalid:
+
+@smallexample
+print 10 + + + 1
+@end smallexample
+
+@noindent
+Arguments can be quoted with double (@code{"}) or single (@code{'})
+quotes.  These quotes are not passed through as part of the argument,
+so the complex arguments from the previous example can be written as:
+
+@smallexample
+adder '10 + 1' '10 + 2' '10 + 3'
+@end smallexample
+
+@noindent
+As the quotes are not passed through, then the previous example causes
+@value{GDBN} to evaluate:
+
+@smallexample
+print 10 + 1 + 10 + 2 + 10 + 3
+@end smallexample
+
+@noindent
+Outside of quotes, a backslash can be used to pass a quote as part of
+an argument, for example, @code{\'} will pass a single quote as an
+argument, and @code{\"} passes a double quote as an argument.
+
+Within double quotes a backslash can be used to pass a literal double
+quote, so @code{"\"abc\""} will pass the argument @code{"abc"}.
+
+Within single quotes a backslash does not escape a single quote, the
+next single quote always ends a single quoted argument, and
+backslashes within a single quoted argument are passed straight
+through, so @code{'abc\'} will pass the argument @code{abc\}.
 
 @cindex argument count in user-defined commands
 @cindex how many arguments (user-defined commands)
-In addition, @code{$argc} may be used to find out how many arguments have
-been passed.
+@noindent
+Within a user-defined command @code{$argc} may be used to find out how
+many arguments have been passed.
 
 @smallexample
 define adder
diff --git a/gdb/testsuite/gdb.base/commands.exp b/gdb/testsuite/gdb.base/commands.exp
index 52a22bb5ddc..4e398b84a95 100644
--- a/gdb/testsuite/gdb.base/commands.exp
+++ b/gdb/testsuite/gdb.base/commands.exp
@@ -1167,6 +1167,50 @@  proc_with_prefix backslash_in_multi_line_command_test {} {
     gdb_test "print 1" "" "run command"
 }
 
+proc_with_prefix user_defined_command_arg_quoting {} {
+    gdb_test_multiple "define show_args" "define show_args" {
+	-re "End with"  {
+	    pass "define show_args"
+	}
+    }
+
+    # This test should alternate between 0xdeadbeef and 0xfeedface two times.
+    gdb_test \
+	[multi_line_input \
+	     {printf "nargs=%d:", $argc} \
+	     {set $i = 0} \
+	     {while $i < $argc} \
+	     {printf " "} \
+	     {eval "echo '$arg%d'", $i} \
+	     {set $i = $i + 1} \
+	     {end} \
+	     {printf "\n"} \
+	     {end}] \
+	"" \
+	"enter commands"
+
+    gdb_test "show_args 1 2 3" \
+	"nargs=3: '1' '2' '3'"
+
+    gdb_test "show_args 1 '1 + 1' '1 + (1 + 1)'" \
+	"nargs=3: '1' '1 \\+ 1' '1 \\+ \\(1 \\+ 1\\)'"
+
+    gdb_test "show_args '{unsigned long long} &global_var'" \
+	"nargs=1: '{unsigned long long} &global_var'"
+
+    gdb_test "show_args '*((unsigned long long *) &global_var)'" \
+	"nargs=1: '\\*\\(\\(unsigned long long \\*\\) &global_var\\)'"
+
+    gdb_test "show_args '\"' \"'\" \"''\"" \
+	"nargs=3: '\"' ''' ''''"
+
+    gdb_test "show_args \\n a\\'b" \
+	"nargs=2: '\r\n' 'a'b'"
+
+    gdb_test "show_args \"This\\nIs\\nA\\nMulti Line\\nMessage" \
+	"nargs=1: 'This\r\nIs\r\nA\r\nMulti Line\r\nMessage'"
+}
+
 gdbvar_simple_if_test
 gdbvar_simple_while_test
 gdbvar_complex_if_while_test
@@ -1182,6 +1226,7 @@  user_defined_command_case_sensitivity
 user_defined_command_args_eval
 user_defined_command_args_stack_test
 user_defined_command_manyargs_test
+user_defined_command_arg_quoting
 watchpoint_command_test
 test_command_prompt_position
 deprecated_command_test
diff --git a/gdb/testsuite/gdb.base/run.c b/gdb/testsuite/gdb.base/run.c
index 614b018260d..d89bad78bb4 100644
--- a/gdb/testsuite/gdb.base/run.c
+++ b/gdb/testsuite/gdb.base/run.c
@@ -8,6 +8,9 @@ 
 
 #include "../lib/unbuffer_output.c"
 
+/* Used by commands.exp test script.  */
+volatile unsigned long long global_var = 34;
+
 int factorial (int);
 
 int