gdb: Support some escaping of args with startup-with-shell being off

Message ID 72227f1c5a2e350ca70b2151d1b91306a0261bdc.1736860317.git.aburgess@redhat.com
State New
Headers
Series gdb: Support some escaping of args with startup-with-shell being off |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm success Test passed
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Test passed

Commit Message

Andrew Burgess Jan. 14, 2025, 1:14 p.m. UTC
  From: Michael Weghorn <m.weghorn@posteo.de>

This patch was originally posted as part of this series:

  https://inbox.sourceware.org/gdb-patches/cover.1730731085.git.aburgess@redhat.com

But it's not really fair to ask people to review a 14 patch series.
So instead I plan to split the series up into smaller pieces and post
them in turn.

This first patch changes a single GDB files, plus some tests.

Thanks,
Andrew

--

I (Andrew Burgess) have taken this patch from this series:

  https://inbox.sourceware.org/gdb-patches/20211022071933.3478427-1-m.weghorn@posteo.de/

I started off reviewing that series, but wanted to explore some
alternative strategies for solving the problems this series addresses.
However, this patch I think is super useful, so I've taken it mostly
as it was in the original series.

I have made a few minor cleanups, and I've also added some more tests.
Any bugs should be considered mine (Andrew's), but I've left the
original author (Michael Weghorn) in place as the GDB side changes are
mostly their work.

The function execv_argv::init_for_no_shell (gdb/nat/fork-inferior.c),
is passed a single string ALLARGS containing all of the inferior
arguments, and contains some custom code for splitting this argument
string into a vector of separate arguments.  This function is used
when startup-with-shell is off (which is not the default).

The algorithm in this function was just splitting on whitespace
characters, and ignoring any quoting, so for example:

    (gdb) set startup-with-shell off
    (gdb) set args "first arg" second_arg

would result in three arguments ("first), (arg"), and (second_arg)
being passed to the inferior (the parenthesis are not part of the
parsed arguments).

This commit replaces this custom argument splitting with a use of the
existing gdb_argv class (which uses the libiberty buildargv function).
This does a better job of supporting quoting and escaping, so for the
example given above we now pass two arguments (first arg)
and (second_arg), which is certainly what I would have expected as a
GDB user.

This commit changes the 'execv_argv' class accordingly and drops the
optimization to have all the 'char *' in 'm_argv' point to a single
string rather than allocating a separate string for each arg.  This is
needed because we are now going to be stripping some escaping from the
arguments, for example:

    (gdb) set startup-with-shell off
    (gdb) set args "literal \$"

In this case we will pass the single argument (literal $) to the
inferior, the escaping backslash will be removed.  This might seem
strange as usually the backslash would be stripped by the shell, and
now we have no shell.  However, I think the consistent behaviour is a
good thing; whether we start with a shell or not the escaping will be
removed.

Using gdb_argv will mean that quote characters are also stripped.  If
we consider the first example again:

    (gdb) set startup-with-shell off
    (gdb) set args "first arg" second_arg

This is now going to pass (first arg) and (second_arg), the quotes
have been removed.  If the user did want the original behaviour then
they are going to have to now do this:

    (gdb) set startup-with-shell off
    (gdb) set args \"first arg\" second_arg

or they could do this:

    (gdb) set startup-with-shell off
    (gdb) set args '"first' 'arg"' second_arg

This commit also extends the three tests that cover inferior argument
passing to cover the case where 'startup-with-shell' is off.  All of
these new tests pass for native targets, but there are still problems
when using remote targets.

The remote target problems arise because of how escaping is handled
while passing arguments to remote targets.  I have a larger series
that aims to address this issue:

  https://inbox.sourceware.org/gdb-patches/cover.1730731085.git.aburgess@redhat.com

This patch was originally part of that series, but getting a 14 patch
series reviewed is not easy, so I've pulled this patch out on its own
for now, and the new tests are (rather crudely) disabled for remote
targets.

My hope is to work through my 14 patch series posting all of the
patches in smaller groups, which will hopefully make reviewing
easier.  As more of that series gets merged, the remote argument
handling will improve, before, eventually, no tests will need to be
disabled.

Co-Authored-By: Andrew Burgess <aburgess@redhat.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28392

Tested-By: Guinevere Larsen <guinevere@redhat.com>
---
 gdb/nat/fork-inferior.c                       |  84 +++-------
 gdb/testsuite/gdb.base/args.exp               |  64 +++++---
 gdb/testsuite/gdb.base/inferior-args.exp      | 105 ++++++++++--
 gdb/testsuite/gdb.base/startup-with-shell.exp | 149 ++++++++++++++----
 4 files changed, 272 insertions(+), 130 deletions(-)


base-commit: 127f733f88717bdb52f03c12c0c2d240bbc892e3
  

Comments

Keith Seitz Jan. 19, 2025, 11:55 p.m. UTC | #1
On 1/14/25 5:14 AM, Andrew Burgess wrote:
> From: Michael Weghorn <m.weghorn@posteo.de>
> 
> This patch was originally posted as part of this series:
> 
>    https://inbox.sourceware.org/gdb-patches/cover.1730731085.git.aburgess@redhat.com
> 
> But it's not really fair to ask people to review a 14 patch series.
> So instead I plan to split the series up into smaller pieces and post
> them in turn.
> 
> This first patch changes a single GDB files, plus some tests.

I did review your other series, and I had no specific comments about
this particular patch. [I apologize my patch-reviewing etiquette was/is
still in its infancy. I wrote in reply to the cover letter, "Feel free
to add my Reviewed-by."]

Just one lament (not asking for any changes).

> diff --git a/gdb/nat/fork-inferior.c b/gdb/nat/fork-inferior.c
> index 41765b102bc..65cfab83c98 100644
> --- a/gdb/nat/fork-inferior.c
> +++ b/gdb/nat/fork-inferior.c
> @@ -69,66 +75,28 @@ class execv_argv
>   		       const std::string &allargs,
>   		       const char *shell_file);
>   
> -  /* The argument vector built.  Holds non-owning pointers.  Elements
> -     either point to the strings passed to the execv_argv ctor, or
> -     inside M_STORAGE.  */
> -  std::vector<const char *> m_argv;
> -
> -  /* Storage.  In the no-shell case, this contains a copy of the
> -     arguments passed to the ctor, split by '\0'.  In the shell case,
> -     this contains the quoted shell command.  I.e., SHELL_COMMAND in
> -     {"$SHELL" "-c", SHELL_COMMAND, NULL}.  */
> -  std::string m_storage;
> +  /* The argument vector.  This owns the strings within it.  */
> +  std::vector<char *> m_argv;
>   };

Nowadays, I am not a fan of containers "owning" simple pointers like
this. I certainly understand why you've followed along here, but I
lament that ownership isn't more explicit. That is, I hope we someday
make a push to use, e.g., unique ptrs to convey ownership in the code
instead of in a comment.

In any case, certainly nothing that we aren't all already dealing with.

This still looks like a step in the right direction.

Reviewed-By: Keith Seitz <keiths@redhat.com>

Keith
  
Andrew Burgess Jan. 27, 2025, 5:04 p.m. UTC | #2
Keith Seitz <keiths@redhat.com> writes:

> On 1/14/25 5:14 AM, Andrew Burgess wrote:
>> From: Michael Weghorn <m.weghorn@posteo.de>
>> 
>> This patch was originally posted as part of this series:
>> 
>>    https://inbox.sourceware.org/gdb-patches/cover.1730731085.git.aburgess@redhat.com
>> 
>> But it's not really fair to ask people to review a 14 patch series.
>> So instead I plan to split the series up into smaller pieces and post
>> them in turn.
>> 
>> This first patch changes a single GDB files, plus some tests.
>
> I did review your other series, and I had no specific comments about
> this particular patch. [I apologize my patch-reviewing etiquette was/is
> still in its infancy. I wrote in reply to the cover letter, "Feel free
> to add my Reviewed-by."]
>
> Just one lament (not asking for any changes).
>
>> diff --git a/gdb/nat/fork-inferior.c b/gdb/nat/fork-inferior.c
>> index 41765b102bc..65cfab83c98 100644
>> --- a/gdb/nat/fork-inferior.c
>> +++ b/gdb/nat/fork-inferior.c
>> @@ -69,66 +75,28 @@ class execv_argv
>>   		       const std::string &allargs,
>>   		       const char *shell_file);
>>   
>> -  /* The argument vector built.  Holds non-owning pointers.  Elements
>> -     either point to the strings passed to the execv_argv ctor, or
>> -     inside M_STORAGE.  */
>> -  std::vector<const char *> m_argv;
>> -
>> -  /* Storage.  In the no-shell case, this contains a copy of the
>> -     arguments passed to the ctor, split by '\0'.  In the shell case,
>> -     this contains the quoted shell command.  I.e., SHELL_COMMAND in
>> -     {"$SHELL" "-c", SHELL_COMMAND, NULL}.  */
>> -  std::string m_storage;
>> +  /* The argument vector.  This owns the strings within it.  */
>> +  std::vector<char *> m_argv;
>>   };
>
> Nowadays, I am not a fan of containers "owning" simple pointers like
> this. I certainly understand why you've followed along here, but I
> lament that ownership isn't more explicit. That is, I hope we someday
> make a push to use, e.g., unique ptrs to convey ownership in the code
> instead of in a comment.
>
> In any case, certainly nothing that we aren't all already dealing with.

I'd disagree with you here.  We shouldn't be adding more of these cases
just because "there are already similar cases in the code".  We should
call out incorrect uses of raw pointers during review.

But, predictably, I feel this case is different.  We pass the result of
m_argv.data() through to the execve call.  I don't believe this is
valid:

  void some_func (char **data);

  void other_func ()
  {
    std::vector<std::unique_ptr<char>> vec;
    vec.push_back (strdup ("aaaa"));
    vec.push_back (strdup ("bbbb"));
    vec.push_back (strdup ("cccc"));

    some_func (vec.data ());
  }

So, if we change the vector's element type, we'd need to first build
another vector of type 'char *' which can be used in the call to
some_func (or execve in the real GDB code).

One thing I think we maybe could do to clean this code up, is move the
execv_argv::init_for_no_shell and execv_argv::init_for_shell functions
out of the class, and make execv_argv ONLY be wrapper around the vector
of strings.  I think good C++ practice is for a class to either manage a
resource, or to do algorithmic work, but not a mix of the two, as we do
here.

Looking at it, if we had such a class we could use it to improve the
gdbserver code a little, so I'll take a pass at that, then return to
this patch.

Thanks,
Andrew

>
> This still looks like a step in the right direction.
>
> Reviewed-By: Keith Seitz <keiths@redhat.com>
>
> Keith
  

Patch

diff --git a/gdb/nat/fork-inferior.c b/gdb/nat/fork-inferior.c
index 41765b102bc..65cfab83c98 100644
--- a/gdb/nat/fork-inferior.c
+++ b/gdb/nat/fork-inferior.c
@@ -27,6 +27,7 @@ 
 #include "gdbsupport/signals-state-save-restore.h"
 #include "gdbsupport/gdb_tilde_expand.h"
 #include "gdbsupport/gdb_signals.h"
+#include "gdbsupport/buildargv.h"
 #include <vector>
 
 extern char **environ;
@@ -42,6 +43,11 @@  class execv_argv
   execv_argv (const char *exec_file, const std::string &allargs,
 	      const char *shell_file);
 
+  ~execv_argv ()
+  {
+    free_vector_argv (m_argv);
+  }
+
   /* Return a pointer to the built argv, in the type expected by
      execv.  The result is (only) valid for as long as this execv_argv
      object is live.  We return a "char **" because that's the type
@@ -50,7 +56,7 @@  class execv_argv
      strings to which the array point.  */
   char **argv ()
   {
-    return const_cast<char **> (&m_argv[0]);
+    return m_argv.data ();
   }
 
 private:
@@ -69,66 +75,28 @@  class execv_argv
 		       const std::string &allargs,
 		       const char *shell_file);
 
-  /* The argument vector built.  Holds non-owning pointers.  Elements
-     either point to the strings passed to the execv_argv ctor, or
-     inside M_STORAGE.  */
-  std::vector<const char *> m_argv;
-
-  /* Storage.  In the no-shell case, this contains a copy of the
-     arguments passed to the ctor, split by '\0'.  In the shell case,
-     this contains the quoted shell command.  I.e., SHELL_COMMAND in
-     {"$SHELL" "-c", SHELL_COMMAND, NULL}.  */
-  std::string m_storage;
+  /* The argument vector.  This owns the strings within it.  */
+  std::vector<char *> m_argv;
 };
 
-/* Create argument vector for straight call to execvp.  Breaks up
-   ALLARGS into an argument vector suitable for passing to execvp and
-   stores it in M_ARGV.  E.g., on "run a b c d" this routine would get
-   as input the string "a b c d", and as output it would fill in
-   M_ARGV with the four arguments "a", "b", "c", "d".  Each argument
-   in M_ARGV points to a substring of a copy of ALLARGS stored in
-   M_STORAGE.  */
+/* Create argument vector for straight call to execvp.  Breaks up ALLARGS
+   into an argument vector suitable for passing to execvp and stores it in
+   M_ARGV.  EXEC_FILE is the executable to be run.
+
+   E.g., if EXEC_FILE is "foo", and the user does "run a b c d" then
+   ALLARGS would be "a b c d", and this function would fill M_ARGV with
+   give arguments "foo", "a", "b", "c", and "d".  */
 
 void
 execv_argv::init_for_no_shell (const char *exec_file,
 			       const std::string &allargs)
 {
+  m_argv.push_back (xstrdup (exec_file));
 
-  /* Save/work with a copy stored in our storage.  The pointers pushed
-     to M_ARGV point directly into M_STORAGE, which is modified in
-     place with the necessary NULL terminators.  This avoids N heap
-     allocations and string dups when 1 is sufficient.  */
-  std::string &args_copy = m_storage = allargs;
+  gdb_argv argv (allargs.c_str ());
 
-  m_argv.push_back (exec_file);
-
-  for (size_t cur_pos = 0; cur_pos < args_copy.size ();)
-    {
-      /* Skip whitespace-like chars.  */
-      std::size_t pos = args_copy.find_first_not_of (" \t\n", cur_pos);
-
-      if (pos != std::string::npos)
-	cur_pos = pos;
-
-      /* Find the position of the next separator.  */
-      std::size_t next_sep = args_copy.find_first_of (" \t\n", cur_pos);
-
-      if (next_sep == std::string::npos)
-	{
-	  /* No separator found, which means this is the last
-	     argument.  */
-	  next_sep = args_copy.size ();
-	}
-      else
-	{
-	  /* Replace the separator with a terminator.  */
-	  args_copy[next_sep++] = '\0';
-	}
-
-      m_argv.push_back (&args_copy[cur_pos]);
-
-      cur_pos = next_sep;
-    }
+  for (const auto &a : argv)
+    m_argv.push_back (xstrdup (a));
 
   /* NULL-terminate the vector.  */
   m_argv.push_back (NULL);
@@ -182,11 +150,7 @@  execv_argv::init_for_shell (const char *exec_file,
   /* We're going to call a shell.  */
   bool escape_bang = escape_bang_in_quoted_argument (shell_file);
 
-  /* We need to build a new shell command string, and make argv point
-     to it.  So build it in the storage.  */
-  std::string &shell_command = m_storage;
-
-  shell_command = "exec ";
+  std::string shell_command = "exec ";
 
   /* Add any exec wrapper.  That may be a program name with arguments,
      so the user must handle quoting.  */
@@ -256,9 +220,9 @@  execv_argv::init_for_shell (const char *exec_file,
      "-c" says to interpret the next arg as a shell command to
      execute, and this command is "exec <target-program> <args>".  */
   m_argv.reserve (4);
-  m_argv.push_back (shell_file);
-  m_argv.push_back ("-c");
-  m_argv.push_back (shell_command.c_str ());
+  m_argv.push_back (xstrdup (shell_file));
+  m_argv.push_back (xstrdup ("-c"));
+  m_argv.push_back (xstrdup (shell_command.c_str ()));
   m_argv.push_back (NULL);
 }
 
diff --git a/gdb/testsuite/gdb.base/args.exp b/gdb/testsuite/gdb.base/args.exp
index f9235f2f7b0..34d722a7941 100644
--- a/gdb/testsuite/gdb.base/args.exp
+++ b/gdb/testsuite/gdb.base/args.exp
@@ -29,30 +29,54 @@  if {[build_executable $testfile.exp $testfile $srcfile] == -1} {
     return -1
 }
 
+set startup_with_shell_modes { "on" }
+if {![gdb_protocol_is_remote]} {
+    lappend startup_with_shell_modes "off"
+} else {
+    # Some of these tests will not work when using the remote protocol
+    # due to bug PR gdb/28392.
+    unsupported "gdbserver 'startup-with-shell off' broken PR gdb/28392"
+}
+
 # NAME is the name to use for the tests and ARGLIST is the list of
 # arguments that are passed to GDB when it is started.
+#
+# The optional RE_LIST is the list of patterns to check the arguments
+# against, these patterns should match ARGLIST.  If the arguments are
+# expected to show up unmodified in the test output then RE_LIST can
+# be dropped, and this proc will reuse ARGLIST.
+
+proc args_test { name arglist {re_list {}} } {
+
+    # If RE_LIST is not supplied then we can reuse ARGLIST, this
+    # implies that the arguments will appear unmodified in the test
+    # output.
+    if {[llength $re_list] == 0} {
+	set re_list $arglist
+    }
 
-proc args_test { name arglist } {
-    save_vars { ::GDBFLAGS } {
-	set ::GDBFLAGS "$::GDBFLAGS --args $::binfile $arglist"
+    foreach_with_prefix startup_with_shell $::startup_with_shell_modes {
+	save_vars { ::GDBFLAGS } {
+	    set ::GDBFLAGS "$::GDBFLAGS --args $::binfile $arglist"
 
-	clean_restart $::binfile
+	    clean_restart $::binfile
 
-	runto_main
-	gdb_breakpoint [gdb_get_line_number "set breakpoint here"]
-	gdb_continue_to_breakpoint "breakpoint for $name"
+	    gdb_test_no_output "set startup-with-shell ${startup_with_shell}" \
+		"set startup-with-shell for $name"
 
-	set expected_len [expr 1 + [llength $arglist]]
-	gdb_test "print argc" "\\\$$::decimal = $expected_len" "argc for $name"
+	    runto_main
+	    gdb_breakpoint [gdb_get_line_number "set breakpoint here"]
+	    gdb_continue_to_breakpoint "breakpoint for $name"
 
-	set i 1
-	foreach arg $arglist {
-	    if { $arg eq "\n" } {
-		set arg {\\n}
+	    set expected_len [expr 1 + [llength $re_list]]
+	    gdb_test "print argc" "\\\$$::decimal = $expected_len" "argc for $name"
+
+	    set i 1
+	    foreach arg $re_list {
+		gdb_test "print argv\[$i\]" "\\\$$::decimal = $::hex \"$arg\"" \
+		    "argv\[$i\] for $name"
+		set i [expr $i + 1]
 	    }
-	    gdb_test "print argv\[$i\]" "\\\$$::decimal = $::hex \"$arg\"" \
-		"argv\[$i\] for $name"
-	    set i [expr $i + 1]
 	}
     }
 }
@@ -78,6 +102,10 @@  args_test "two empty with single quotes" {{1} {''} {''} {3}}
 
 # Try with arguments containing literal newlines.
 
-args_test "one newline" {{1} "\n" {3}}
+args_test "one newline" {{1} "\n" {3}} {1 \\\\n 3}
+
+args_test "two newlines" {{1} "\n" "\n" {3}} {1 \\\\n \\\\n 3}
+
+args_test "lone single quote" {{1} \' {3}}
 
-args_test "two newlines" {{1} "\n" "\n" {3}}
+args_test "lone double quote" {{1} \" {3}} {1 \\\\\" 3}
diff --git a/gdb/testsuite/gdb.base/inferior-args.exp b/gdb/testsuite/gdb.base/inferior-args.exp
index b165fb83839..79b73e61b33 100644
--- a/gdb/testsuite/gdb.base/inferior-args.exp
+++ b/gdb/testsuite/gdb.base/inferior-args.exp
@@ -25,16 +25,27 @@  if {[build_executable "failed to prepare" $testfile $srcfile \
     return
 }
 
-proc do_test { method } {
+# STARTUP_WITH_SHELL is either 'on' or 'off' and determines if the
+# inferior is started under a shell or not.  INFERIOR_ARGS is the list
+# of inferior arguments.  EXPECTED_RESULTS is the list of expected
+# results, one for each argument.
+#
+# When STUB_SUITABLE is true this test is suitable for use with
+# gdbserver, i.e. INFERIOR_ARGS can be passed through to
+# gdbserver_start via gdb_run_cmd.  Some of the weird quoting used in
+# some of the tests doesn't seem to play well with gdbserver_start.
+# This is a TCL issue, not a gdbserver issue.  Manually testing with
+# gdbserver shows no problems.  It's just that when we try to invoke
+# gdbserver from TCL the argument quoting gets messed up.  For tests
+# that are problematic, STUB_SUITABLE is false.
+proc do_test { method startup_with_shell inferior_args expected_results \
+		   stub_suitable } {
     global binfile hex
 
-    # The second arg is an empty string on purpose.  The last argument
-    # must be the empty argument -- we once had a bug where that
-    # wouldn't work!
-    set inferior_args { "first arg" "" "third-arg" "'" "\"" " " "" }
-
     clean_restart $binfile
 
+    gdb_test_no_output "set startup-with-shell $startup_with_shell"
+
     if { $method == "start" } {
 	# The start command does not make sense for a stub.
 	if { [use_gdb_stub] } {
@@ -80,6 +91,10 @@  proc do_test { method } {
 	    return -1
 	}
 
+	if { [use_gdb_stub] && !$stub_suitable } {
+	    return
+	}
+
 	# The run command does not make sense for a stub, but GDB_RUN_CMD
 	# does the right thing when the target is a stub (start the stub,
 	# connect to it, and "continue").
@@ -110,18 +125,74 @@  proc do_test { method } {
 	error "invalid method $method"
     }
 
+    set argc [expr [llength $expected_results] + 1]
+
     # Now that we are stopped at main, inspect argc/argv.
-    gdb_test "print argc" " = 8"
-    gdb_test "print argv\[0\]" " = $hex \".*\""
-    gdb_test "print argv\[1\]" " = $hex \"first arg\""
-    gdb_test "print argv\[2\]" " = $hex \"\""
-    gdb_test "print argv\[3\]" " = $hex \"third-arg\""
-    gdb_test "print argv\[4\]" " = $hex \"'\""
-    gdb_test "print argv\[5\]" " = $hex \"\\\\\"\""
-    gdb_test "print argv\[6\]" " = $hex \" \""
-    gdb_test "print argv\[7\]" " = $hex \"\""
+    gdb_test "print argc" " = $argc"
+    gdb_test "print argv\[0\]" " = $hex \"\[^\r\n\]+\""
+    for { set i 1 } { $i < $argc } { incr i } {
+	set idx [expr $i - 1]
+	gdb_test "print argv\[$i\]" " = [lindex $expected_results $idx]"
+    }
+}
+
+set test_desc_list []
+
+# test one
+# --------
+#
+# The second arg is an empty string on purpose.  The last argument
+# must be the empty argument -- we once had a bug where that wouldn't
+# work!
+lappend test_desc_list [list "test one" \
+			    true \
+			    { "first arg" "" "third-arg" "'" "\"" " " "" } \
+			    [list "$hex \"first arg\"" \
+				 "$hex \"\"" \
+				 "$hex \"third-arg\"" \
+				 "$hex \"'\"" \
+				 "$hex \"\\\\\"\"" \
+				 "$hex \" \"" \
+				 "$hex \"\"" ]]
+
+# test two
+# --------
+#
+# The argument being passed here is '"', that is a single double quote
+# contained within single quotes.
+#
+# I build the test descriptor using this mess of code to avoid having
+# unbalanced quotes, which messes up indentation and syntax
+# highlighting within (at least) emacs.  The 'format' of ascii code 34
+# gives us the double quote character.  Then I have to jump through
+# the rest of this mess in order to avoid TCL escaping the quote for
+# me.  It's super important that what we send to GDB is '"' not '\"'.
+set item [list "test two" false]
+set cmd [format "lappend item \{ '%c' '\\%c' \}" 34 34]
+eval $cmd
+set bs "\\\\"
+lappend item [list "$hex \"$bs\"\"" "$hex \"$bs$bs$bs\"\""]
+lappend test_desc_list $item
+
+set startup_with_shell_modes { "on" }
+if {![gdb_protocol_is_remote]} {
+    lappend startup_with_shell_modes "off"
+} else {
+    # Due to PR gdb/28392 gdbserver doesn't currently support having
+    # startup-with-shell off, and then attempting to pass arguments
+    # containing whitespace.
+    unsupported "bug gdb/28392: gdbserver doesn't support this"
 }
 
-foreach_with_prefix method { "start" "starti" "run" "set args" } {
-    do_test $method
+
+foreach desc $test_desc_list {
+    lassign $desc name stub_suitable args re_list
+    with_test_prefix $name {
+	foreach_with_prefix set_method { "start" "starti" "run" "set args" } {
+	    foreach_with_prefix startup_with_shell $startup_with_shell_modes {
+		do_test $set_method $startup_with_shell $args $re_list \
+		    $stub_suitable
+	    }
+	}
+    }
 }
diff --git a/gdb/testsuite/gdb.base/startup-with-shell.exp b/gdb/testsuite/gdb.base/startup-with-shell.exp
index 87a755904f6..495c43eeaee 100644
--- a/gdb/testsuite/gdb.base/startup-with-shell.exp
+++ b/gdb/testsuite/gdb.base/startup-with-shell.exp
@@ -43,55 +43,134 @@  proc initial_setup_simple { startup_with_shell run_args } {
     clean_restart $binfile
 
     gdb_test_no_output "set startup-with-shell $startup_with_shell"
+    gdb_test_no_output "set print characters unlimited"
 
     gdb_test_no_output "set args $run_args" \
 	"set args \$run_args"
 
-    set test "inferior started"
-    if { [runto_main] } {
-	pass $test
-    } else {
-	fail $test
+    return [runto_main]
+}
+
+# Start GDB, set the inferior arguments to ARGS, and then run to main.
+# Once at main, read the first argument from the inferior and compare
+# it to ON_RE if startup-with-shell is on, otherwise compare to
+# OFF_RE.
+#
+# If PROBLEMATIC_ON is true then when startup-with-shell is on we
+# expect the comparison to fail, so setup an xfail.
+#
+# If PROBLEMATIC_OFF is true then when startup-with-shell is off we
+# expect the comparison to fail, so setup an xfail.
+#
+# TESTNAME is a string used in the test names.
+proc run_test { args on_re off_re testname { problematic_on false } \
+		    { problematic_off false } } {
+    foreach startup_with_shell { "on" "off" } {
+	with_test_prefix "$testname, startup_with_shell: ${startup_with_shell}" {
+	    if {![initial_setup_simple $startup_with_shell $args]} {
+		return -1
+	    }
+
+	    if { $startup_with_shell } {
+		set re $on_re
+		set problematic $problematic_on
+	    } else {
+		set re $off_re
+		set problematic $problematic_off
+	    }
+
+	    if { $problematic } {
+		setup_xfail "*-*-*" gdb/28392
+	    }
+
+	    gdb_test "print argv\[1\]" "\\\$$::decimal = $::hex $re" $testname
+	}
     }
 }
 
+# This is like the run_test proc except that RE is used as the
+# expected argument regexp when startup-with-shell is both on and off.
+# For the other arguments, see run_test.
+proc run_test_same { args re testname { problematic_on false } \
+			 { problematic_off false } } {
+    run_test $args $re $re $testname $problematic_on $problematic_off
+}
+
+# The regexp to match a single '\' character.
+set bs "\\\\"
+
 # Are we using 'remote' or 'extended-remote' protocol?
 set is_remote_p [gdb_protocol_is_remote]
 
 ## Run the actual tests
 
-with_test_prefix "startup_with_shell = on; run_args = *.unique-extension" {
-    initial_setup_simple "on" "$unique_file_dir/*.unique-extension"
-    gdb_test_no_output "set print characters unlimited"
-    if { $is_remote_p } {
-	setup_xfail "*-*-*" gdb/28392
-    }
-    gdb_test "print argv\[1\]" "\\\$$decimal = $hex \"$unique_file\"" \
-	"first argument expanded"
-}
+run_test "$unique_file_dir/*.unique-extension" \
+    "\"$unique_file\"" \
+    "\"$unique_file_dir/\\\*\.unique-extension\"" \
+    "arg is glob" \
+    $is_remote_p
 
-with_test_prefix "startup_with_shell = off; run_args = *.unique-extension" {
-    initial_setup_simple "off" "$unique_file_dir/*.unique-extension"
-    gdb_test_no_output "set print characters unlimited"
-    gdb_test "print argv\[1\]" "\\\$$decimal = $hex \"$unique_file_dir/\\\*\.unique-extension\"" \
-	"first argument not expanded"
-}
+run_test_same "$unique_file_dir/\\*.unique-extension" \
+    "\"$unique_file_dir/\\\*\.unique-extension\"" \
+    "arg is escaped glob"
 
-with_test_prefix "startup_with_shell = on; run_args = \$TEST" {
+save_vars { env(TEST) } {
     set env(TEST) "1234"
-    initial_setup_simple "on" "\$TEST"
-    if { $is_remote_p } {
-	setup_xfail "*-*-*" gdb/28392
-    }
-    gdb_test "print argv\[1\]" "\\\$$decimal = $hex \"1234\"" \
-	"testing first argument"
-    unset env(TEST)
+    run_test "\$TEST" \
+	"\"1234\"" \
+	"\"\\\$TEST\"" \
+	"arg is shell variable" \
+	$is_remote_p
+
+    run_test_same "\\\$TEST" \
+	"\"\\\$TEST\"" \
+	"arg is escaped shell variable"
 }
 
-with_test_prefix "startup_with_shell = off; run_args = \$TEST" {
-    set env(TEST) "1234"
-    initial_setup_simple "off" "\$TEST"
-    gdb_test "print argv\[1\]" "\\\$$decimal = $hex \"\\\$TEST\"" \
-	"testing first argument"
-    unset env(TEST)
-}
+run_test_same "\"\\a\"" \
+    "\"${bs}${bs}a\"" \
+    "retain backslash in double quote arg" \
+    false $is_remote_p
+
+run_test_same "'\\a'" \
+    "\"${bs}${bs}a\"" \
+    "retain backslash in single quote arg" \
+    false $is_remote_p
+
+run_test_same "\"\\\$\"" \
+    "\"\\\$\"" \
+    "'\$' can be escaped in double quote arg"
+
+run_test_same "'\\\$'" \
+    "\"${bs}${bs}\\\$\"" \
+    "'\$' is not escaped in single quote arg" \
+    false $is_remote_p
+
+run_test_same "\"\\`\"" \
+    "\"\\`\"" \
+    "'`' can be escaped in double quote arg"
+
+run_test_same "'\\`'" \
+    "\"${bs}${bs}`\"" \
+    "'`' is not escaped in single quote arg" \
+    false $is_remote_p
+
+run_test_same "\"\\\"\"" \
+    "\"${bs}\"\"" \
+    "'\"' can be escaped in double quote arg" \
+    false $is_remote_p
+
+run_test_same "'\\\"'" \
+    "\"${bs}${bs}${bs}\"\"" \
+    "'\"' is not escaped in single quote arg" \
+    false $is_remote_p
+
+run_test_same "\"\\\\\"" \
+    "\"${bs}${bs}\"" \
+    "'\\' can be escaped in double quote arg" \
+    false $is_remote_p
+
+run_test_same "'\\\\'" \
+    "\"${bs}${bs}${bs}${bs}\"" \
+    "'\\' is not escaped in single quote arg" \
+    false $is_remote_p