[2/2] gdb: Allow parenthesis to group arguments to user-defined commands

Message ID eaa6d3a2975194a7ef3a2aa40e335c7986b205d8.1534343840.git.andrew.burgess@embecosm.com
State New, archived
Headers

Commit Message

Andrew Burgess Aug. 15, 2018, 2:39 p.m. UTC
  When calling a user-defined command then currently, arguments are
whitespace separated.  This means that it is impossible to pass a
single argument that contains a whitespace.

The exception to the no whitespace rule is strings, a string argument,
enclosed in double, or single quotes, can contain whitespace.

However, if a user wants to reference, for example, a type name that
contains a space, as in these examples:

   user_command *((unsigned long long *) some_pointer)

   user_command {unsigned long long}some_pointer

then this will not work, as the whitespace between 'unsigned' and
'long', as well as the whitespace between 'long' and 'long', will mean
GDB interprets this as many arguments.

The solution proposed in this patch is to allow parenthesis to be used
to group arguments, so the use could now write:

   user_command (*((unsigned long long *) some_pointer))

   user_command ({unsigned long long}some_pointer)

And GDB will interpret these as a single argument.

gdb/ChangeLog:

	* cli/cli-script.c (user_args::user_args): Allow parenthesis to
	group arguments.

gdb/testsuite/ChangeLog:

	* gdb.base/commands.exp (args_with_whitespace): New proc, which is
	added to the list of procs to call.
	* gdb.base/run.c (global_var): Defined global.

gdb/doc/ChangeLog:

	* gdb.texinfo (Define): Additional documentation about argument
	syntax.
---
 gdb/ChangeLog                       |  5 ++++
 gdb/cli/cli-script.c                |  8 +++++-
 gdb/doc/ChangeLog                   |  5 ++++
 gdb/doc/gdb.texinfo                 | 54 +++++++++++++++++++++++++++++++++----
 gdb/testsuite/ChangeLog             |  6 +++++
 gdb/testsuite/gdb.base/commands.exp | 36 +++++++++++++++++++++++++
 gdb/testsuite/gdb.base/run.c        |  3 +++
 7 files changed, 111 insertions(+), 6 deletions(-)
  

Comments

Eli Zaretskii Aug. 15, 2018, 3:27 p.m. UTC | #1
> From: Andrew Burgess <andrew.burgess@embecosm.com>
> Cc: Andrew Burgess <andrew.burgess@embecosm.com>
> Date: Wed, 15 Aug 2018 15:39:20 +0100
> 
> When calling a user-defined command then currently, arguments are
> whitespace separated.  This means that it is impossible to pass a
> single argument that contains a whitespace.
> 
> The exception to the no whitespace rule is strings, a string argument,
> enclosed in double, or single quotes, can contain whitespace.
> 
> However, if a user wants to reference, for example, a type name that
> contains a space, as in these examples:
> 
>    user_command *((unsigned long long *) some_pointer)
> 
>    user_command {unsigned long long}some_pointer
> 
> then this will not work, as the whitespace between 'unsigned' and
> 'long', as well as the whitespace between 'long' and 'long', will mean
> GDB interprets this as many arguments.
> 
> The solution proposed in this patch is to allow parenthesis to be used
> to group arguments, so the use could now write:
> 
>    user_command (*((unsigned long long *) some_pointer))
> 
>    user_command ({unsigned long long}some_pointer)
> 
> And GDB will interpret these as a single argument.
> 
> gdb/ChangeLog:
> 
> 	* cli/cli-script.c (user_args::user_args): Allow parenthesis to
> 	group arguments.
> 
> gdb/testsuite/ChangeLog:
> 
> 	* gdb.base/commands.exp (args_with_whitespace): New proc, which is
> 	added to the list of procs to call.
> 	* gdb.base/run.c (global_var): Defined global.
> 
> gdb/doc/ChangeLog:
> 
> 	* gdb.texinfo (Define): Additional documentation about argument
> 	syntax.
> ---
>  gdb/ChangeLog                       |  5 ++++
>  gdb/cli/cli-script.c                |  8 +++++-
>  gdb/doc/ChangeLog                   |  5 ++++
>  gdb/doc/gdb.texinfo                 | 54 +++++++++++++++++++++++++++++++++----
>  gdb/testsuite/ChangeLog             |  6 +++++
>  gdb/testsuite/gdb.base/commands.exp | 36 +++++++++++++++++++++++++
>  gdb/testsuite/gdb.base/run.c        |  3 +++
>  7 files changed, 111 insertions(+), 6 deletions(-)

OK for the documentation parts.

Thanks.
  
Philippe Waroquiers Aug. 25, 2018, 7:32 p.m. UTC | #2
Woudn't it be more 'usual' (and maybe simpler) to use a \  for this ?
e.g. like the shell:
ls ab\ cd
ls: cannot access 'ab cd': No such file or directory

Philippe


On Wed, 2018-08-15 at 15:39 +0100, Andrew Burgess wrote:
> When calling a user-defined command then currently, arguments are
> whitespace separated.  This means that it is impossible to pass a
> single argument that contains a whitespace.
> 
> The exception to the no whitespace rule is strings, a string argument,
> enclosed in double, or single quotes, can contain whitespace.
> 
> However, if a user wants to reference, for example, a type name that
> contains a space, as in these examples:
> 
>    user_command *((unsigned long long *) some_pointer)
> 
>    user_command {unsigned long long}some_pointer
> 
> then this will not work, as the whitespace between 'unsigned' and
> 'long', as well as the whitespace between 'long' and 'long', will mean
> GDB interprets this as many arguments.
> 
> The solution proposed in this patch is to allow parenthesis to be used
> to group arguments, so the use could now write:
> 
>    user_command (*((unsigned long long *) some_pointer))
> 
>    user_command ({unsigned long long}some_pointer)
> 
> And GDB will interpret these as a single argument.
> 
> gdb/ChangeLog:
> 
> 	* cli/cli-script.c (user_args::user_args): Allow parenthesis to
> 	group arguments.
> 
> gdb/testsuite/ChangeLog:
> 
> 	* gdb.base/commands.exp (args_with_whitespace): New proc, which is
> 	added to the list of procs to call.
> 	* gdb.base/run.c (global_var): Defined global.
> 
> gdb/doc/ChangeLog:
> 
> 	* gdb.texinfo (Define): Additional documentation about argument
> 	syntax.
> ---
>  gdb/ChangeLog                       |  5 ++++
>  gdb/cli/cli-script.c                |  8 +++++-
>  gdb/doc/ChangeLog                   |  5 ++++
>  gdb/doc/gdb.texinfo                 | 54 +++++++++++++++++++++++++++++++++----
>  gdb/testsuite/ChangeLog             |  6 +++++
>  gdb/testsuite/gdb.base/commands.exp | 36 +++++++++++++++++++++++++
>  gdb/testsuite/gdb.base/run.c        |  3 +++
>  7 files changed, 111 insertions(+), 6 deletions(-)
> 
> diff --git a/gdb/cli/cli-script.c b/gdb/cli/cli-script.c
> index 6f31a400197..fd80ab9fbcf 100644
> --- a/gdb/cli/cli-script.c
> +++ b/gdb/cli/cli-script.c
> @@ -758,6 +758,7 @@ user_args::user_args (const char *command_line)
>        int squote = 0;
>        int dquote = 0;
>        int bsquote = 0;
> +      int pdepth = 0;
>  
>        /* Strip whitespace.  */
>        while (*p == ' ' || *p == '\t')
> @@ -769,7 +770,8 @@ user_args::user_args (const char *command_line)
>        /* Get to the end of this argument.  */
>        while (*p)
>  	{
> -	  if (((*p == ' ' || *p == '\t')) && !squote && !dquote && !bsquote)
> +	  if (((*p == ' ' || *p == '\t'))
> +	      && !squote && !dquote && !bsquote && pdepth == 0)
>  	    break;
>  	  else
>  	    {
> @@ -793,6 +795,10 @@ user_args::user_args (const char *command_line)
>  		    squote = 1;
>  		  else if (*p == '"')
>  		    dquote = 1;
> +		  else if (*p == '(')
> +		    pdepth++;
> +		  else if (*p == ')')
> +		    pdepth--;
>  		}
>  	      p++;
>  	    }
> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
> index 433a2698a92..11cbef97b8f 100644
> --- a/gdb/doc/gdb.texinfo
> +++ b/gdb/doc/gdb.texinfo
> @@ -25219,14 +25219,58 @@
>  
>  @noindent
>  This defines the command @code{adder}, which prints the sum of
> -its three arguments.  Note the arguments are text substitutions, so they may
> -reference variables, use complex expressions, or even perform inferior
> -functions calls.
> +its three arguments.
> +
> +The arguments to user-defined commands are text substitutions, so they
> +may reference variables, use complex expressions, or even perform
> +inferior functions calls.  Each argument is separated with whitespace,
> +so in the previous example three arguments were passed.  The following
> +example also passes three arguments, though the arguments are more
> +complex:
> +
> +@smallexample
> +adder 10+1 10+2 10+3
> +@end smallexample
> +
> +@noindent
> +However, if whitespace were added around the @code{+} characters, then
> +9 arguments would be passed, @code{adder} only uses the first 3 of
> +these arguments, and the others would be silently ignored:
> +
> +@smallexample
> +adder 10 + 1 10 + 2 10 + 3
> +@end smallexample
> +
> +@noindent
> +Parenthesis can be uses to group complex expressions that include
> +whitespace into a single argument, so the previous example can be
> +modified to pass just 3 arguments again, like this:
> +
> +@smallexample
> +adder (10 + 1) (10 + 2) (10 + 3)
> +@end smallexample
> +
> +@noindent
> +The parenthesis are passed through as part of the argument, so the
> +previous example causes @value{GDBN} to evaluate:
> +
> +@smallexample
> +print (10 + 1) + (10 + 2) + (10 + 3)
> +@end smallexample
> +
> +@noindent
> +Nested parenthesis are also allowed within an argument, in the
> +following example 3 arguments are still passed:
> +
> +@smallexample
> +adder (10 + 1) (10 + (1 + 1)) (10 + (1 + (1 + 1)))
> +@end smallexample
>  
>  @cindex argument count in user-defined commands
>  @cindex how many arguments (user-defined commands)
> -In addition, @code{$argc} may be used to find out how many arguments have
> -been passed.
> +@noindent
> +Within a user-defined command @code{$argc} may be used to find out how
> +many arguments have been passed.
>  
>  @smallexample
>  define adder
> diff --git a/gdb/testsuite/gdb.base/commands.exp b/gdb/testsuite/gdb.base/commands.exp
> index 7ce33fdefa9..42ffd6fa0af 100644
> --- a/gdb/testsuite/gdb.base/commands.exp
> +++ b/gdb/testsuite/gdb.base/commands.exp
> @@ -1125,6 +1125,41 @@ proc_with_prefix backslash_in_multi_line_command_test {} {
>      gdb_test "print 1" "" "run command"
>  }
>  
> +proc_with_prefix args_with_whitespace {} {
> +    gdb_test_multiple "define show_args" "define show_args" {
> +	-re "End with"  {
> +	    pass "define show_args"
> +	}
> +    }
> +
> +    # This test should alternate between 0xdeadbeef and 0xfeedface two times.
> +    gdb_test \
> +	[multi_line_input \
> +	     {printf "nargs=%d:", $argc} \
> +	     {set $i = 0} \
> +	     {while $i < $argc} \
> +	     {printf " "} \
> +	     {eval "echo '$arg%d'", $i} \
> +	     {set $i = $i + 1} \
> +	     {end} \
> +	     {printf "\n"} \
> +	     {end}] \
> +	"" \
> +	"enter commands"
> +
> +    gdb_test "show_args 1 2 3" \
> +	"nargs=3: '1' '2' '3'"
> +
> +    gdb_test "show_args 1 (1 + 1) (1 + (1 + 1))" \
> +	"nargs=3: '1' '\\(1 \\+ 1\\)' '\\(1 \\+ \\(1 \\+ 1\\)\\)'"
> +
> +    gdb_test "show_args ({unsigned long long} &global_var)" \
> +	"nargs=1: '\\({unsigned long long} &global_var\\)'"
> +
> +    gdb_test "show_args (*((unsigned long long *) &global_var))" \
> +	"nargs=1: '\\(\\*\\(\\(unsigned long long \\*\\) &global_var\\)\\)'"
> +}
> +
>  gdbvar_simple_if_test
>  gdbvar_simple_while_test
>  gdbvar_complex_if_while_test
> @@ -1154,5 +1189,6 @@ backslash_in_multi_line_command_test
>  define_if_without_arg_test
>  loop_break_test
>  loop_continue_test
> +args_with_whitespace
>  # This one should come last, as it redefines "backtrace".
>  redefine_backtrace_test
> diff --git a/gdb/testsuite/gdb.base/run.c b/gdb/testsuite/gdb.base/run.c
> index 614b018260d..d89bad78bb4 100644
> --- a/gdb/testsuite/gdb.base/run.c
> +++ b/gdb/testsuite/gdb.base/run.c
> @@ -8,6 +8,9 @@
>  
>  #include "../lib/unbuffer_output.c"
>  
> +/* Used by commands.exp test script.  */
> +volatile unsigned long long global_var = 34;
> +
>  int factorial (int);
>  
>  int
  
Philippe Waroquiers Aug. 25, 2018, 8:53 p.m. UTC | #3
On Sat, 2018-08-25 at 21:32 +0200, Philippe Waroquiers wrote:
> Woudn't it be more 'usual' (and maybe simpler) to use a \  for this ?
> e.g. like the shell:
> ls ab\ cd
> ls: cannot access 'ab cd': No such file or directory
> 
> Philippe
See also the RFA (in 'review needed status')
https://sourceware.org/ml/gdb-patches/2018-07/msg00131.html
where I (inconsistently with my comment above but still consistent with the
shell) have used single quotes for such purposes.

Like for the level versus number, we should aim at a single approach
in gdb for such things ...

Philippe

> 
> 
> On Wed, 2018-08-15 at 15:39 +0100, Andrew Burgess wrote:
> > When calling a user-defined command then currently, arguments are
> > whitespace separated.  This means that it is impossible to pass a
> > single argument that contains a whitespace.
> > 
> > The exception to the no whitespace rule is strings, a string argument,
> > enclosed in double, or single quotes, can contain whitespace.
> > 
> > However, if a user wants to reference, for example, a type name that
> > contains a space, as in these examples:
> > 
> >    user_command *((unsigned long long *) some_pointer)
> > 
> >    user_command {unsigned long long}some_pointer
> > 
> > then this will not work, as the whitespace between 'unsigned' and
> > 'long', as well as the whitespace between 'long' and 'long', will mean
> > GDB interprets this as many arguments.
> > 
> > The solution proposed in this patch is to allow parenthesis to be used
> > to group arguments, so the use could now write:
> > 
> >    user_command (*((unsigned long long *) some_pointer))
> > 
> >    user_command ({unsigned long long}some_pointer)
> > 
> > And GDB will interpret these as a single argument.
> > 
> > gdb/ChangeLog:
> > 
> > 	* cli/cli-script.c (user_args::user_args): Allow parenthesis to
> > 	group arguments.
> > 
> > gdb/testsuite/ChangeLog:
> > 
> > 	* gdb.base/commands.exp (args_with_whitespace): New proc, which is
> > 	added to the list of procs to call.
> > 	* gdb.base/run.c (global_var): Defined global.
> > 
> > gdb/doc/ChangeLog:
> > 
> > 	* gdb.texinfo (Define): Additional documentation about argument
> > 	syntax.
> > ---
> >  gdb/ChangeLog                       |  5 ++++
> >  gdb/cli/cli-script.c                |  8 +++++-
> >  gdb/doc/ChangeLog                   |  5 ++++
> >  gdb/doc/gdb.texinfo                 | 54 +++++++++++++++++++++++++++++++++----
> >  gdb/testsuite/ChangeLog             |  6 +++++
> >  gdb/testsuite/gdb.base/commands.exp | 36 +++++++++++++++++++++++++
> >  gdb/testsuite/gdb.base/run.c        |  3 +++
> >  7 files changed, 111 insertions(+), 6 deletions(-)
> > 
> > diff --git a/gdb/cli/cli-script.c b/gdb/cli/cli-script.c
> > index 6f31a400197..fd80ab9fbcf 100644
> > --- a/gdb/cli/cli-script.c
> > +++ b/gdb/cli/cli-script.c
> > @@ -758,6 +758,7 @@ user_args::user_args (const char *command_line)
> >        int squote = 0;
> >        int dquote = 0;
> >        int bsquote = 0;
> > +      int pdepth = 0;
> >  
> >        /* Strip whitespace.  */
> >        while (*p == ' ' || *p == '\t')
> > @@ -769,7 +770,8 @@ user_args::user_args (const char *command_line)
> >        /* Get to the end of this argument.  */
> >        while (*p)
> >  	{
> > -	  if (((*p == ' ' || *p == '\t')) && !squote && !dquote && !bsquote)
> > +	  if (((*p == ' ' || *p == '\t'))
> > +	      && !squote && !dquote && !bsquote && pdepth == 0)
> >  	    break;
> >  	  else
> >  	    {
> > @@ -793,6 +795,10 @@ user_args::user_args (const char *command_line)
> >  		    squote = 1;
> >  		  else if (*p == '"')
> >  		    dquote = 1;
> > +		  else if (*p == '(')
> > +		    pdepth++;
> > +		  else if (*p == ')')
> > +		    pdepth--;
> >  		}
> >  	      p++;
> >  	    }
> > diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
> > index 433a2698a92..11cbef97b8f 100644
> > --- a/gdb/doc/gdb.texinfo
> > +++ b/gdb/doc/gdb.texinfo
> > @@ -25219,14 +25219,58 @@
> >  
> >  @noindent
> >  This defines the command @code{adder}, which prints the sum of
> > -its three arguments.  Note the arguments are text substitutions, so they may
> > -reference variables, use complex expressions, or even perform inferior
> > -functions calls.
> > +its three arguments.
> > +
> > +The arguments to user-defined commands are text substitutions, so they
> > +may reference variables, use complex expressions, or even perform
> > +inferior functions calls.  Each argument is separated with whitespace,
> > +so in the previous example three arguments were passed.  The following
> > +example also passes three arguments, though the arguments are more
> > +complex:
> > +
> > +@smallexample
> > +adder 10+1 10+2 10+3
> > +@end smallexample
> > +
> > +@noindent
> > +However, if whitespace were added around the @code{+} characters, then
> > +9 arguments would be passed, @code{adder} only uses the first 3 of
> > +these arguments, and the others would be silently ignored:
> > +
> > +@smallexample
> > +adder 10 + 1 10 + 2 10 + 3
> > +@end smallexample
> > +
> > +@noindent
> > +Parenthesis can be uses to group complex expressions that include
> > +whitespace into a single argument, so the previous example can be
> > +modified to pass just 3 arguments again, like this:
> > +
> > +@smallexample
> > +adder (10 + 1) (10 + 2) (10 + 3)
> > +@end smallexample
> > +
> > +@noindent
> > +The parenthesis are passed through as part of the argument, so the
> > +previous example causes @value{GDBN} to evaluate:
> > +
> > +@smallexample
> > +print (10 + 1) + (10 + 2) + (10 + 3)
> > +@end smallexample
> > +
> > +@noindent
> > +Nested parenthesis are also allowed within an argument, in the
> > +following example 3 arguments are still passed:
> > +
> > +@smallexample
> > +adder (10 + 1) (10 + (1 + 1)) (10 + (1 + (1 + 1)))
> > +@end smallexample
> >  
> >  @cindex argument count in user-defined commands
> >  @cindex how many arguments (user-defined commands)
> > -In addition, @code{$argc} may be used to find out how many arguments have
> > -been passed.
> > +@noindent
> > +Within a user-defined command @code{$argc} may be used to find out how
> > +many arguments have been passed.
> >  
> >  @smallexample
> >  define adder
> > diff --git a/gdb/testsuite/gdb.base/commands.exp b/gdb/testsuite/gdb.base/commands.exp
> > index 7ce33fdefa9..42ffd6fa0af 100644
> > --- a/gdb/testsuite/gdb.base/commands.exp
> > +++ b/gdb/testsuite/gdb.base/commands.exp
> > @@ -1125,6 +1125,41 @@ proc_with_prefix backslash_in_multi_line_command_test {} {
> >      gdb_test "print 1" "" "run command"
> >  }
> >  
> > +proc_with_prefix args_with_whitespace {} {
> > +    gdb_test_multiple "define show_args" "define show_args" {
> > +	-re "End with"  {
> > +	    pass "define show_args"
> > +	}
> > +    }
> > +
> > +    # This test should alternate between 0xdeadbeef and 0xfeedface two times.
> > +    gdb_test \
> > +	[multi_line_input \
> > +	     {printf "nargs=%d:", $argc} \
> > +	     {set $i = 0} \
> > +	     {while $i < $argc} \
> > +	     {printf " "} \
> > +	     {eval "echo '$arg%d'", $i} \
> > +	     {set $i = $i + 1} \
> > +	     {end} \
> > +	     {printf "\n"} \
> > +	     {end}] \
> > +	"" \
> > +	"enter commands"
> > +
> > +    gdb_test "show_args 1 2 3" \
> > +	"nargs=3: '1' '2' '3'"
> > +
> > +    gdb_test "show_args 1 (1 + 1) (1 + (1 + 1))" \
> > +	"nargs=3: '1' '\\(1 \\+ 1\\)' '\\(1 \\+ \\(1 \\+ 1\\)\\)'"
> > +
> > +    gdb_test "show_args ({unsigned long long} &global_var)" \
> > +	"nargs=1: '\\({unsigned long long} &global_var\\)'"
> > +
> > +    gdb_test "show_args (*((unsigned long long *) &global_var))" \
> > +	"nargs=1: '\\(\\*\\(\\(unsigned long long \\*\\) &global_var\\)\\)'"
> > +}
> > +
> >  gdbvar_simple_if_test
> >  gdbvar_simple_while_test
> >  gdbvar_complex_if_while_test
> > @@ -1154,5 +1189,6 @@ backslash_in_multi_line_command_test
> >  define_if_without_arg_test
> >  loop_break_test
> >  loop_continue_test
> > +args_with_whitespace
> >  # This one should come last, as it redefines "backtrace".
> >  redefine_backtrace_test
> > diff --git a/gdb/testsuite/gdb.base/run.c b/gdb/testsuite/gdb.base/run.c
> > index 614b018260d..d89bad78bb4 100644
> > --- a/gdb/testsuite/gdb.base/run.c
> > +++ b/gdb/testsuite/gdb.base/run.c
> > @@ -8,6 +8,9 @@
> >  
> >  #include "../lib/unbuffer_output.c"
> >  
> > +/* Used by commands.exp test script.  */
> > +volatile unsigned long long global_var = 34;
> > +
> >  int factorial (int);
> >  
> >  int
  
Andrew Burgess Aug. 25, 2018, 10:43 p.m. UTC | #4
* Philippe Waroquiers <philippe.waroquiers@skynet.be> [2018-08-25 22:53:23 +0200]:

> On Sat, 2018-08-25 at 21:32 +0200, Philippe Waroquiers wrote:
> > Woudn't it be more 'usual' (and maybe simpler) to use a \  for this ?
> > e.g. like the shell:
> > ls ab\ cd
> > ls: cannot access 'ab cd': No such file or directory
> > 
> > Philippe
> See also the RFA (in 'review needed status')
> https://sourceware.org/ml/gdb-patches/2018-07/msg00131.html
> where I (inconsistently with my comment above but still consistent with the
> shell) have used single quotes for such purposes.
> 
> Like for the level versus number, we should aim at a single approach
> in gdb for such things ...

The problem here is that user defined commands already have (IMHO)
weird handling of single/double quotes.  So,

  my_command 'ab cd'

will replace $arg0 with literally, 'ac cd', including the quotes.  The
same for double quotes.  Next, there is some strange handling of
backslashes already in place, so,

  my_command ac\ cd

Will replace $arg0 with literally, ac\ cd, including the backslash.

Honestly, if there is a logic behind the argument processing, I'm not
seeing it, and the docs don't explain it.  It feels like an attempt
was made to do something like you suggest, except it all went horribly
wrong...

However, as far as I can tell, the code has been this way for years, a
quick look back to revision d318976c46b92e4d in year 2000, showed the
same argument processing code unchanged, so my concern is, if we
change things too much now we might break existing scripts....

So, my suggestion deliberately avoids using quotes or backslashes as
these are bogged down in the existing code.  And using (...) is fairly
intuitive given GDBs C like expression handling, personally I'd rather
write:

    my_command ({unsigned long long} &global_var)

than:

    my_command {unsigned\ long\ long}\ &global_var

though, I think you also suggested quotes as a possibility, so this
isn't so bad:

    my_command '{unsigned long long} &global_var'

but the C like expression does feel more natural to me.  But I could
live with the quotes to get the functionality I need in.

Additionally, using (...) for grouping didn't feel like it was likely
to break much existing code, I convinced myself that it was unlikely
that someone would write:

   my_command (ab cd)

and actually want '(ab' and 'cd)' to be separate arguments...

The only saving grace here is that the argument processing for user
defined commands is completely separate, and not used anywhere else.
We could make the choice that this is just a legacy corner of GDB that
has its own rules.

In summary I think that the choices are:

  1. Break backward compatibility for use of quote and backslashes,
     and rework argument processing for user defined commands.

  2. Stick with (...)

  3. Define a new scheme that allows most backward compatibility to be
     maintained, but extends use backslash and quote to allow for
     argument grouping.

Happy to hear more thoughts on this topic,

Thanks,
Andrew



> 
> Philippe
> 
> > 
> > 
> > On Wed, 2018-08-15 at 15:39 +0100, Andrew Burgess wrote:
> > > When calling a user-defined command then currently, arguments are
> > > whitespace separated.  This means that it is impossible to pass a
> > > single argument that contains a whitespace.
> > > 
> > > The exception to the no whitespace rule is strings, a string argument,
> > > enclosed in double, or single quotes, can contain whitespace.
> > > 
> > > However, if a user wants to reference, for example, a type name that
> > > contains a space, as in these examples:
> > > 
> > >    user_command *((unsigned long long *) some_pointer)
> > > 
> > >    user_command {unsigned long long}some_pointer
> > > 
> > > then this will not work, as the whitespace between 'unsigned' and
> > > 'long', as well as the whitespace between 'long' and 'long', will mean
> > > GDB interprets this as many arguments.
> > > 
> > > The solution proposed in this patch is to allow parenthesis to be used
> > > to group arguments, so the use could now write:
> > > 
> > >    user_command (*((unsigned long long *) some_pointer))
> > > 
> > >    user_command ({unsigned long long}some_pointer)
> > > 
> > > And GDB will interpret these as a single argument.
> > > 
> > > gdb/ChangeLog:
> > > 
> > > 	* cli/cli-script.c (user_args::user_args): Allow parenthesis to
> > > 	group arguments.
> > > 
> > > gdb/testsuite/ChangeLog:
> > > 
> > > 	* gdb.base/commands.exp (args_with_whitespace): New proc, which is
> > > 	added to the list of procs to call.
> > > 	* gdb.base/run.c (global_var): Defined global.
> > > 
> > > gdb/doc/ChangeLog:
> > > 
> > > 	* gdb.texinfo (Define): Additional documentation about argument
> > > 	syntax.
> > > ---
> > >  gdb/ChangeLog                       |  5 ++++
> > >  gdb/cli/cli-script.c                |  8 +++++-
> > >  gdb/doc/ChangeLog                   |  5 ++++
> > >  gdb/doc/gdb.texinfo                 | 54 +++++++++++++++++++++++++++++++++----
> > >  gdb/testsuite/ChangeLog             |  6 +++++
> > >  gdb/testsuite/gdb.base/commands.exp | 36 +++++++++++++++++++++++++
> > >  gdb/testsuite/gdb.base/run.c        |  3 +++
> > >  7 files changed, 111 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/gdb/cli/cli-script.c b/gdb/cli/cli-script.c
> > > index 6f31a400197..fd80ab9fbcf 100644
> > > --- a/gdb/cli/cli-script.c
> > > +++ b/gdb/cli/cli-script.c
> > > @@ -758,6 +758,7 @@ user_args::user_args (const char *command_line)
> > >        int squote = 0;
> > >        int dquote = 0;
> > >        int bsquote = 0;
> > > +      int pdepth = 0;
> > >  
> > >        /* Strip whitespace.  */
> > >        while (*p == ' ' || *p == '\t')
> > > @@ -769,7 +770,8 @@ user_args::user_args (const char *command_line)
> > >        /* Get to the end of this argument.  */
> > >        while (*p)
> > >  	{
> > > -	  if (((*p == ' ' || *p == '\t')) && !squote && !dquote && !bsquote)
> > > +	  if (((*p == ' ' || *p == '\t'))
> > > +	      && !squote && !dquote && !bsquote && pdepth == 0)
> > >  	    break;
> > >  	  else
> > >  	    {
> > > @@ -793,6 +795,10 @@ user_args::user_args (const char *command_line)
> > >  		    squote = 1;
> > >  		  else if (*p == '"')
> > >  		    dquote = 1;
> > > +		  else if (*p == '(')
> > > +		    pdepth++;
> > > +		  else if (*p == ')')
> > > +		    pdepth--;
> > >  		}
> > >  	      p++;
> > >  	    }
> > > diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
> > > index 433a2698a92..11cbef97b8f 100644
> > > --- a/gdb/doc/gdb.texinfo
> > > +++ b/gdb/doc/gdb.texinfo
> > > @@ -25219,14 +25219,58 @@
> > >  
> > >  @noindent
> > >  This defines the command @code{adder}, which prints the sum of
> > > -its three arguments.  Note the arguments are text substitutions, so they may
> > > -reference variables, use complex expressions, or even perform inferior
> > > -functions calls.
> > > +its three arguments.
> > > +
> > > +The arguments to user-defined commands are text substitutions, so they
> > > +may reference variables, use complex expressions, or even perform
> > > +inferior functions calls.  Each argument is separated with whitespace,
> > > +so in the previous example three arguments were passed.  The following
> > > +example also passes three arguments, though the arguments are more
> > > +complex:
> > > +
> > > +@smallexample
> > > +adder 10+1 10+2 10+3
> > > +@end smallexample
> > > +
> > > +@noindent
> > > +However, if whitespace were added around the @code{+} characters, then
> > > +9 arguments would be passed, @code{adder} only uses the first 3 of
> > > +these arguments, and the others would be silently ignored:
> > > +
> > > +@smallexample
> > > +adder 10 + 1 10 + 2 10 + 3
> > > +@end smallexample
> > > +
> > > +@noindent
> > > +Parenthesis can be uses to group complex expressions that include
> > > +whitespace into a single argument, so the previous example can be
> > > +modified to pass just 3 arguments again, like this:
> > > +
> > > +@smallexample
> > > +adder (10 + 1) (10 + 2) (10 + 3)
> > > +@end smallexample
> > > +
> > > +@noindent
> > > +The parenthesis are passed through as part of the argument, so the
> > > +previous example causes @value{GDBN} to evaluate:
> > > +
> > > +@smallexample
> > > +print (10 + 1) + (10 + 2) + (10 + 3)
> > > +@end smallexample
> > > +
> > > +@noindent
> > > +Nested parenthesis are also allowed within an argument, in the
> > > +following example 3 arguments are still passed:
> > > +
> > > +@smallexample
> > > +adder (10 + 1) (10 + (1 + 1)) (10 + (1 + (1 + 1)))
> > > +@end smallexample
> > >  
> > >  @cindex argument count in user-defined commands
> > >  @cindex how many arguments (user-defined commands)
> > > -In addition, @code{$argc} may be used to find out how many arguments have
> > > -been passed.
> > > +@noindent
> > > +Within a user-defined command @code{$argc} may be used to find out how
> > > +many arguments have been passed.
> > >  
> > >  @smallexample
> > >  define adder
> > > diff --git a/gdb/testsuite/gdb.base/commands.exp b/gdb/testsuite/gdb.base/commands.exp
> > > index 7ce33fdefa9..42ffd6fa0af 100644
> > > --- a/gdb/testsuite/gdb.base/commands.exp
> > > +++ b/gdb/testsuite/gdb.base/commands.exp
> > > @@ -1125,6 +1125,41 @@ proc_with_prefix backslash_in_multi_line_command_test {} {
> > >      gdb_test "print 1" "" "run command"
> > >  }
> > >  
> > > +proc_with_prefix args_with_whitespace {} {
> > > +    gdb_test_multiple "define show_args" "define show_args" {
> > > +	-re "End with"  {
> > > +	    pass "define show_args"
> > > +	}
> > > +    }
> > > +
> > > +    # This test should alternate between 0xdeadbeef and 0xfeedface two times.
> > > +    gdb_test \
> > > +	[multi_line_input \
> > > +	     {printf "nargs=%d:", $argc} \
> > > +	     {set $i = 0} \
> > > +	     {while $i < $argc} \
> > > +	     {printf " "} \
> > > +	     {eval "echo '$arg%d'", $i} \
> > > +	     {set $i = $i + 1} \
> > > +	     {end} \
> > > +	     {printf "\n"} \
> > > +	     {end}] \
> > > +	"" \
> > > +	"enter commands"
> > > +
> > > +    gdb_test "show_args 1 2 3" \
> > > +	"nargs=3: '1' '2' '3'"
> > > +
> > > +    gdb_test "show_args 1 (1 + 1) (1 + (1 + 1))" \
> > > +	"nargs=3: '1' '\\(1 \\+ 1\\)' '\\(1 \\+ \\(1 \\+ 1\\)\\)'"
> > > +
> > > +    gdb_test "show_args ({unsigned long long} &global_var)" \
> > > +	"nargs=1: '\\({unsigned long long} &global_var\\)'"
> > > +
> > > +    gdb_test "show_args (*((unsigned long long *) &global_var))" \
> > > +	"nargs=1: '\\(\\*\\(\\(unsigned long long \\*\\) &global_var\\)\\)'"
> > > +}
> > > +
> > >  gdbvar_simple_if_test
> > >  gdbvar_simple_while_test
> > >  gdbvar_complex_if_while_test
> > > @@ -1154,5 +1189,6 @@ backslash_in_multi_line_command_test
> > >  define_if_without_arg_test
> > >  loop_break_test
> > >  loop_continue_test
> > > +args_with_whitespace
> > >  # This one should come last, as it redefines "backtrace".
> > >  redefine_backtrace_test
> > > diff --git a/gdb/testsuite/gdb.base/run.c b/gdb/testsuite/gdb.base/run.c
> > > index 614b018260d..d89bad78bb4 100644
> > > --- a/gdb/testsuite/gdb.base/run.c
> > > +++ b/gdb/testsuite/gdb.base/run.c
> > > @@ -8,6 +8,9 @@
> > >  
> > >  #include "../lib/unbuffer_output.c"
> > >  
> > > +/* Used by commands.exp test script.  */
> > > +volatile unsigned long long global_var = 34;
> > > +
> > >  int factorial (int);
> > >  
> > >  int
  
Tom Tromey Aug. 28, 2018, 3:53 p.m. UTC | #5
>>>>> "Andrew" == Andrew Burgess <andrew.burgess@embecosm.com> writes:

Andrew> So, my suggestion deliberately avoids using quotes or backslashes as
Andrew> these are bogged down in the existing code.  And using (...) is fairly
Andrew> intuitive given GDBs C like expression handling, personally I'd rather
Andrew> write:
Andrew>     my_command ({unsigned long long} &global_var)
Andrew> than:
Andrew>     my_command {unsigned\ long\ long}\ &global_var

FWIW I tend to agree with your logic here.

User-defined argument parsing is broken (and I think there's at least
one bug in bugzilla about this), but at the same time, making breaking
changes seems unfriendly.  Your approach doesn't seem to be breaking
anything that is likely to be actually used.

Tom
  
Andrew Burgess Aug. 28, 2018, 6:43 p.m. UTC | #6
* Tom Tromey <tom@tromey.com> [2018-08-28 09:53:52 -0600]:

> >>>>> "Andrew" == Andrew Burgess <andrew.burgess@embecosm.com> writes:
> 
> Andrew> So, my suggestion deliberately avoids using quotes or backslashes as
> Andrew> these are bogged down in the existing code.  And using (...) is fairly
> Andrew> intuitive given GDBs C like expression handling, personally I'd rather
> Andrew> write:
> Andrew>     my_command ({unsigned long long} &global_var)
> Andrew> than:
> Andrew>     my_command {unsigned\ long\ long}\ &global_var
> 
> FWIW I tend to agree with your logic here.
> 
> User-defined argument parsing is broken (and I think there's at least
> one bug in bugzilla about this), but at the same time, making breaking
> changes seems unfriendly.  Your approach doesn't seem to be breaking
> anything that is likely to be actually used.

Given that the argument passing for user-defined functions is pretty
self contained we could, conceivably, implement a whole new system and
have a switch to select between them...  the existing code does seem
rather odd.

But ideally, I'd like that to be a project for another day.

Thanks,
Andrew
  
Philippe Waroquiers Aug. 28, 2018, 8:28 p.m. UTC | #7
On Tue, 2018-08-28 at 19:43 +0100, Andrew Burgess wrote:
> * Tom Tromey <tom@tromey.com> [2018-08-28 09:53:52 -0600]:
> 
> > > > > > > "Andrew" == Andrew Burgess <andrew.burgess@embecosm.com> writes:
> > 
> > Andrew> So, my suggestion deliberately avoids using quotes or backslashes as
> > Andrew> these are bogged down in the existing code.  And using (...) is fairly
> > Andrew> intuitive given GDBs C like expression handling, personally I'd rather
> > Andrew> write:
> > Andrew>     my_command ({unsigned long long} &global_var)
> > Andrew> than:
> > Andrew>     my_command {unsigned\ long\ long}\ &global_var
> > 
> > FWIW I tend to agree with your logic here.
> > 
> > User-defined argument parsing is broken (and I think there's at least
> > one bug in bugzilla about this), but at the same time, making breaking
> > changes seems unfriendly.  Your approach doesn't seem to be breaking
> > anything that is likely to be actually used.

I do not think a \ followed by a space will create a lot of incompatibilities.
i
.e. that someone would type
    some_user_defined_command  a\ b
to give 2
different args to the command.

For single quotes:  it is unlikely that someone types something like
   some_user_defined_command 'abc def'
and was expecting the user defined command to receive two args.
What is however a lot more likely is
   some_user_defined_command 'a'
and this command expects to receive a character value.
So, yes, single quotes has more potential to create incompatibilities.

On the downside, quoting with parenthesis is something very peculiar
(is there some other tool/shell/... using this convention ?).

And from a previous discussion with Pedro, he indicated that
some commands in gdb already are using single quotes.
The 'Not sure' below is a quote of Pedro :), replying to a suggestion
I gave to let the user tell explicitly if an arg was quoted or not.
   >   * have a way to (explicitely) quote an argument e.g.
   >       info var  -Qt 'long int'  regexpofsomevars
   >     where -Q indicates that the next "value argument" is
   >     explicitely quoted.

   Not sure we need that -Q.  We can support optional quotes, and
   tell users to add quotes if necessary?  That's what we've done
   since forever in linespecs and expressions, for example.

It is based on this that I used single quotes in the 
  info var/func/arg/local -t TYPEREGEXP NAMEREGEXP
patch.

> Given that the argument passing for user-defined functions is pretty
> self contained we could, conceivably, implement a whole new system and
> have a switch to select between them...  the existing code does seem
> rather odd.
> 
> But ideally, I'd like that to be a project for another day.
And we also have the 'generalised arg parser' prototype that Pedro has
in a corner.

IMO, clearly, what to do is unclear :).

Philippe
  
Andrew Burgess Aug. 28, 2018, 11:29 p.m. UTC | #8
* Philippe Waroquiers <philippe.waroquiers@skynet.be> [2018-08-28 22:28:57 +0200]:

> On Tue, 2018-08-28 at 19:43 +0100, Andrew Burgess wrote:
> > * Tom Tromey <tom@tromey.com> [2018-08-28 09:53:52 -0600]:
> > 
> > > > > > > > "Andrew" == Andrew Burgess <andrew.burgess@embecosm.com> writes:
> > > 
> > > Andrew> So, my suggestion deliberately avoids using quotes or backslashes as
> > > Andrew> these are bogged down in the existing code.  And using (...) is fairly
> > > Andrew> intuitive given GDBs C like expression handling, personally I'd rather
> > > Andrew> write:
> > > Andrew>     my_command ({unsigned long long} &global_var)
> > > Andrew> than:
> > > Andrew>     my_command {unsigned\ long\ long}\ &global_var
> > > 
> > > FWIW I tend to agree with your logic here.
> > > 
> > > User-defined argument parsing is broken (and I think there's at least
> > > one bug in bugzilla about this), but at the same time, making breaking
> > > changes seems unfriendly.  Your approach doesn't seem to be breaking
> > > anything that is likely to be actually used.
> 
> I do not think a \ followed by a space will create a lot of incompatibilities.
> i
> .e. that someone would type
>     some_user_defined_command  a\ b
> to give 2
> different args to the command.

No, and what they get at the moment is a single argument, which is:

    a\ b

including the backslash and the space.  Is this correct, or useful?  I
suspect not, but my instinct in these cases is not to mess with things
without good reason, and I didn't have a good reason.

> 
> For single quotes:  it is unlikely that someone types something like
>    some_user_defined_command 'abc def'
> and was expecting the user defined command to receive two args.

I agree, and what they get right now is:

    'abc def'

including the quotes.

> What is however a lot more likely is
>    some_user_defined_command 'a'
> and this command expects to receive a character value.

Which is more or less what you get:

    'a'

> So, yes, single quotes has more potential to create
> incompatibilities.

I guess that depends on what you're proposing the argument should
evaluate too.

> 
> On the downside, quoting with parenthesis is something very peculiar
> (is there some other tool/shell/... using this convention ?).

I guess I never saw it as quoting, just grouping an expression
together.  I think I see GDB as mostly consuming C like expressions,
which is why using () seemed natural enough. Coincidentally the users
I'm working with initially tried () too (without prompting from me) so
my focus group of 2 all agreed with me ;-)

> 
> And from a previous discussion with Pedro, he indicated that
> some commands in gdb already are using single quotes.
> The 'Not sure' below is a quote of Pedro :), replying to a suggestion
> I gave to let the user tell explicitly if an arg was quoted or not.
>    >   * have a way to (explicitely) quote an argument e.g.
>    >       info var  -Qt 'long int'  regexpofsomevars
>    >     where -Q indicates that the next "value argument" is
>    >     explicitely quoted.
> 
>    Not sure we need that -Q.  We can support optional quotes, and
>    tell users to add quotes if necessary?  That's what we've done
>    since forever in linespecs and expressions, for example.
> 
> It is based on this that I used single quotes in the 
>   info var/func/arg/local -t TYPEREGEXP NAMEREGEXP
> patch.
> 
> > Given that the argument passing for user-defined functions is pretty
> > self contained we could, conceivably, implement a whole new system and
> > have a switch to select between them...  the existing code does seem
> > rather odd.
> > 
> > But ideally, I'd like that to be a project for another day.
> And we also have the 'generalised arg parser' prototype that Pedro has
> in a corner.
> 
> IMO, clearly, what to do is unclear :).

I agree, except I need to figure out some clarity in order to resolve
the real issue I have :)

I think that you're arguing in favour of reworking the argument
passing to user-defined commands.  I'm not entirely sure exactly what
rules you're proposing we adopt though.

Maybe you could help me work through a few examples, in the following
table I've outlined a few examples, and have the following questions:

  [1] What argument(s) would be passed under the new scheme?
  [2] What argument(s) would be passed under the new scheme?
  [3] How would I write {unsigned long} &var to pass this as a single
      argument?  If with quotes, how does this relate to 1 & 2?

  | Input                | Current Arguments(s) | Proposed Argument(s) |
  |----------------------+----------------------+----------------------|
  | a b                  | a                    | a                    |
  |                      | b                    | b                    |
  |----------------------+----------------------+----------------------|
  | "a b"                | "a b"                | [1]                  |
  |----------------------+----------------------+----------------------|
  | 'a b'                | 'a b'                | [2]                  |
  |----------------------+----------------------+----------------------|
  | {unsigned long} &var | {unsigned            |                      |
  |                      | long}                | N/A                  |
  |                      | &var                 |                      |
  |----------------------+----------------------+----------------------|
  | [3]                  |                      |                      |
  |----------------------+----------------------+----------------------|

In general I'm happy to rework this part of GDB, but ideally I'd like
some feedback from a global maintainer that such a change, which might
would break backward compatibility, would be acceptable...

Thanks,
Andrew
  
Tom Tromey Aug. 30, 2018, 2:19 a.m. UTC | #9
>>>>> "Andrew" == Andrew Burgess <andrew.burgess@embecosm.com> writes:

Andrew> In general I'm happy to rework this part of GDB, but ideally I'd like
Andrew> some feedback from a global maintainer that such a change, which might
Andrew> would break backward compatibility, would be acceptable...

My philosophy has generally been to avoid incompatibility when possible
and when there seems to be some likelihood that the behavior is actually
used.

So, for example, I felt ok changing the parsing of "bt" arguments,
because I felt the implementation was simply buggy and that nobody would
be using its weird corner cases.

On the other hand, back in the day I changed the parsing of
"disassemble" and heard complaints for quite some time.  I guess this
informed my view.

Sometimes this leads to bad results, like how users should nearly always
use "watch -location", but it isn't the default.


This is why I think the paren grouping idea is fine: the examples I can
think of all seem useless.


That said, upthread you mentioned the idea of passing a flag to "define"
to change how arguments are parsed.  I would be ok with this as well.
However I think that in this case it would be good to work out the
details beforehand, including looking through bugzilla to see if there
are other unmet needs.

Tom
  
Tom Tromey Aug. 30, 2018, 2:26 a.m. UTC | #10
>>>>> "Philippe" == Philippe Waroquiers <philippe.waroquiers@skynet.be> writes:

Philippe> On the downside, quoting with parenthesis is something very peculiar
Philippe> (is there some other tool/shell/... using this convention ?).

gdb itself does this in some cases.  Often an expression can be
terminated with a ",", but parentheses can prevent this termination.  I
believe this can occur with linespecs as well.

Tom
  
Philippe Waroquiers Aug. 30, 2018, 9:19 p.m. UTC | #11
On Wed, 2018-08-29 at 00:29 +0100, Andrew Burgess wrote:
> 
> Maybe you could help me work through a few examples, in the following
> table I've outlined a few examples, and have the following questions:
> 
>   [1] What argument(s) would be passed under the new scheme?
>   [2] What argument(s) would be passed under the new scheme?
>   [3] How would I write {unsigned long} &var to pass this as a single
>       argument?  If with quotes, how does this relate to 1 & 2?
> 
>   | Input                | Current Arguments(s) | Proposed Argument(s) |
>   |----------------------+----------------------+----------------------|
>   | a b                  | a                    | a                    |
>   |                      | b                    | b                    |
>   |----------------------+----------------------+----------------------|
>   | "a b"                | "a b"                | [1]                  |
>   |----------------------+----------------------+----------------------|
>   | 'a b'                | 'a b'                | [2]                  |
>   |----------------------+----------------------+----------------------|
>   | {unsigned long} &var | {unsigned            |                      |
>   |                      | long}                | N/A                  |
>   |                      | &var                 |                      |
>   |----------------------+----------------------+----------------------|
>   | [3]                  |                      |                      |
>   |----------------------+----------------------+----------------------|
> 
> In general I'm happy to rework this part of GDB, but ideally I'd like
> some feedback from a global maintainer that such a change, which might
> would break backward compatibility, would be acceptable...

Humph, now I understand that user defined commands already handle
specially single and double quotes, as you explained above.
This seems not documented in the gdb documentation, which tells
that 'Note the arguments are text substitutions, so they may
reference variables, use complex expressions, or even perform inferior
functions calls.' I have not found a description of this single and double
quote behaviour.


Maybe the only thing we might need is something like:
   $unquoted_arg1 or something like that to allow to remove the quotes
   inside the user defined command, when desired ?
Or as you suggested, some new flags to the define command, such as
  define some_command -unquote_arg3
would indicate that if arg3 is a single quoted argument as in your table
above, that it must be given without the quote to some_command.

For what concerns the parenthesis based solution, it looks to not work
for some cases.

E.g. if I want to pass one argument (I am using single quotes to show the args
I want in the below examples):
    'a b) (c d'
then I need to use:
    some_user_defined (a b) (c d)
but that is the same notation as if I want to pass 2 arguments
    'a b'  and 'c d'

So, the parenthesis mechanism would need escaping or something like that,
to allow a general grouping. As it stands now, it only works
based on the assumption that the content inside the parenthesis 

Still, it looks strange to me that we introduce another mechanism
on top of the single/double quotes to quote.
And it would be nice if the mechanism used to quote args would
be compatible between user defined commands and native gdb commands.

I have not understood the reference given by Tom that expressions
are terminated with ,  and that parenthesis stops this termination.
Is that described somewhere in the doc ?
The doc (or an example if this is not documented) will help
me to understand.

Philippe
  
Tom Tromey Aug. 31, 2018, 8:59 p.m. UTC | #12
>>>>> "Philippe" == Philippe Waroquiers <philippe.waroquiers@skynet.be> writes:

Philippe> For what concerns the parenthesis based solution, it looks to not work
Philippe> for some cases.

Philippe> E.g. if I want to pass one argument (I am using single quotes to show the args
Philippe> I want in the below examples):
Philippe>     'a b) (c d'
Philippe> then I need to use:
Philippe>     some_user_defined (a b) (c d)
Philippe> but that is the same notation as if I want to pass 2 arguments
Philippe>     'a b'  and 'c d'

Can you think of an actual example where this would be useful?
My feeling is that there isn't one, though I'd be interested in
tricky-but-plausible counter-examples.

This feeling is why I'm generally ok with Andrew's idea.

Philippe> And it would be nice if the mechanism used to quote args would
Philippe> be compatible between user defined commands and native gdb commands.

There is no universal quoting in gdb.  Instead there are 3 common cases,
plus the extras (neglecting MI, which is totally different and not
relevant here):

1. Commands that take an expression.  These are things like "print".
   Expressions are passed to the language parser, but language parsers
   have to follow some gdb rules: optionally terminate parsing at a
   top-level (unparenthesized) comma, and also terminate parsing at "if"
   or some other things (there is a mildly odd case for Ada tasks).

2. Commands that take a linespec.  These can also take an expression via
   the "*" linespec, so linespecs end up following some expression
   rules.  Often there are expressions after the linespec too, like
     break LINESPEC if EXPR
   or
     dprintf LINESPEC, "string", EXPR, EXPR...

3. Commands that use gdb_argv (or libiberty's buildargv, which is the
   same thing).  These respect some simple quoting.  However, the
   quoting is very simple, not like what you might expect from the
   shell, for example I don't think you can quote an embedded quotation
   mark of the same kind (that is, no "this has \"quotes\"").

4. Everything else.  gdb commands are just given a string and some do
   whatever the author liked.

Philippe> I have not understood the reference given by Tom that expressions
Philippe> are terminated with ,  and that parenthesis stops this termination.
Philippe> Is that described somewhere in the doc ?
Philippe> The doc (or an example if this is not documented) will help
Philippe> me to understand.

I think it's largely undocumented, since some of these quirks are just
constraints arising from gdb's implementation choices.

Not having any rhyme or reason to command argument parsing has good and
bad facets.

The good is that the generally DWIM nature of commands means that
usually you don't have to contort yourself to satisfy some parser.
Like, you can "break foo.c:93" or "break function" rather than something
like the old dbx "stop in" / "stop at" split.

The bad of course is that you may sometimes want to quote something and
not have any idea of how to do it: because there's no consistency;
because the gdb_argv idea was not really thought through (that's my
conclusion, I don't know the actual story); or perhaps because you've
just tripped across some command that was implemented in a particularly
bad way.

Now, it would be great to fix this, at least for some subset of things.
I find it hard to see a way forward, though.  Breaking compatibility
(see my post upthread) is unpleasant, unless maybe it is done in a very
dramatic way, coupled with a major version bump and some kind of huge
warning for users -- but that's also very hard to implement and release.

One idea is to add a new standard way to parse arguments, for new
commands.  But of course then gdb just has 5 ways to do it ... :(

Tom
  

Patch

diff --git a/gdb/cli/cli-script.c b/gdb/cli/cli-script.c
index 6f31a400197..fd80ab9fbcf 100644
--- a/gdb/cli/cli-script.c
+++ b/gdb/cli/cli-script.c
@@ -758,6 +758,7 @@  user_args::user_args (const char *command_line)
       int squote = 0;
       int dquote = 0;
       int bsquote = 0;
+      int pdepth = 0;
 
       /* Strip whitespace.  */
       while (*p == ' ' || *p == '\t')
@@ -769,7 +770,8 @@  user_args::user_args (const char *command_line)
       /* Get to the end of this argument.  */
       while (*p)
 	{
-	  if (((*p == ' ' || *p == '\t')) && !squote && !dquote && !bsquote)
+	  if (((*p == ' ' || *p == '\t'))
+	      && !squote && !dquote && !bsquote && pdepth == 0)
 	    break;
 	  else
 	    {
@@ -793,6 +795,10 @@  user_args::user_args (const char *command_line)
 		    squote = 1;
 		  else if (*p == '"')
 		    dquote = 1;
+		  else if (*p == '(')
+		    pdepth++;
+		  else if (*p == ')')
+		    pdepth--;
 		}
 	      p++;
 	    }
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 433a2698a92..11cbef97b8f 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -25219,14 +25219,58 @@ 
 
 @noindent
 This defines the command @code{adder}, which prints the sum of
-its three arguments.  Note the arguments are text substitutions, so they may
-reference variables, use complex expressions, or even perform inferior
-functions calls.
+its three arguments.
+
+The arguments to user-defined commands are text substitutions, so they
+may reference variables, use complex expressions, or even perform
+inferior functions calls.  Each argument is separated with whitespace,
+so in the previous example three arguments were passed.  The following
+example also passes three arguments, though the arguments are more
+complex:
+
+@smallexample
+adder 10+1 10+2 10+3
+@end smallexample
+
+@noindent
+However, if whitespace were added around the @code{+} characters, then
+9 arguments would be passed, @code{adder} only uses the first 3 of
+these arguments, and the others would be silently ignored:
+
+@smallexample
+adder 10 + 1 10 + 2 10 + 3
+@end smallexample
+
+@noindent
+Parenthesis can be uses to group complex expressions that include
+whitespace into a single argument, so the previous example can be
+modified to pass just 3 arguments again, like this:
+
+@smallexample
+adder (10 + 1) (10 + 2) (10 + 3)
+@end smallexample
+
+@noindent
+The parenthesis are passed through as part of the argument, so the
+previous example causes @value{GDBN} to evaluate:
+
+@smallexample
+print (10 + 1) + (10 + 2) + (10 + 3)
+@end smallexample
+
+@noindent
+Nested parenthesis are also allowed within an argument, in the
+following example 3 arguments are still passed:
+
+@smallexample
+adder (10 + 1) (10 + (1 + 1)) (10 + (1 + (1 + 1)))
+@end smallexample
 
 @cindex argument count in user-defined commands
 @cindex how many arguments (user-defined commands)
-In addition, @code{$argc} may be used to find out how many arguments have
-been passed.
+@noindent
+Within a user-defined command @code{$argc} may be used to find out how
+many arguments have been passed.
 
 @smallexample
 define adder
diff --git a/gdb/testsuite/gdb.base/commands.exp b/gdb/testsuite/gdb.base/commands.exp
index 7ce33fdefa9..42ffd6fa0af 100644
--- a/gdb/testsuite/gdb.base/commands.exp
+++ b/gdb/testsuite/gdb.base/commands.exp
@@ -1125,6 +1125,41 @@  proc_with_prefix backslash_in_multi_line_command_test {} {
     gdb_test "print 1" "" "run command"
 }
 
+proc_with_prefix args_with_whitespace {} {
+    gdb_test_multiple "define show_args" "define show_args" {
+	-re "End with"  {
+	    pass "define show_args"
+	}
+    }
+
+    # This test should alternate between 0xdeadbeef and 0xfeedface two times.
+    gdb_test \
+	[multi_line_input \
+	     {printf "nargs=%d:", $argc} \
+	     {set $i = 0} \
+	     {while $i < $argc} \
+	     {printf " "} \
+	     {eval "echo '$arg%d'", $i} \
+	     {set $i = $i + 1} \
+	     {end} \
+	     {printf "\n"} \
+	     {end}] \
+	"" \
+	"enter commands"
+
+    gdb_test "show_args 1 2 3" \
+	"nargs=3: '1' '2' '3'"
+
+    gdb_test "show_args 1 (1 + 1) (1 + (1 + 1))" \
+	"nargs=3: '1' '\\(1 \\+ 1\\)' '\\(1 \\+ \\(1 \\+ 1\\)\\)'"
+
+    gdb_test "show_args ({unsigned long long} &global_var)" \
+	"nargs=1: '\\({unsigned long long} &global_var\\)'"
+
+    gdb_test "show_args (*((unsigned long long *) &global_var))" \
+	"nargs=1: '\\(\\*\\(\\(unsigned long long \\*\\) &global_var\\)\\)'"
+}
+
 gdbvar_simple_if_test
 gdbvar_simple_while_test
 gdbvar_complex_if_while_test
@@ -1154,5 +1189,6 @@  backslash_in_multi_line_command_test
 define_if_without_arg_test
 loop_break_test
 loop_continue_test
+args_with_whitespace
 # This one should come last, as it redefines "backtrace".
 redefine_backtrace_test
diff --git a/gdb/testsuite/gdb.base/run.c b/gdb/testsuite/gdb.base/run.c
index 614b018260d..d89bad78bb4 100644
--- a/gdb/testsuite/gdb.base/run.c
+++ b/gdb/testsuite/gdb.base/run.c
@@ -8,6 +8,9 @@ 
 
 #include "../lib/unbuffer_output.c"
 
+/* Used by commands.exp test script.  */
+volatile unsigned long long global_var = 34;
+
 int factorial (int);
 
 int