[04/16] gdb: remove the !startup_with_shell path from construct_inferior_arguments

Message ID f5cc99ef257169bc40a70f4d2aacaade515cc652.1704809585.git.aburgess@redhat.com
State New
Series Inferior argument (inc for remote targets) changes

Commit Message

Andrew Burgess Jan. 9, 2024, 2:26 p.m. UTC
  In the previous commit nat/fork-inferior.c was updated such that when
we are starting an inferior without a shell we now remove escape
characters.  The benefits of this are explained in the previous
commit, but having made that change we can now make an additional
change.

Currently, in construct_inferior_arguments, when startup_with_shell is
false we construct the inferior argument string differently than when
startup_with_shell is true; when true we escape special shell
characters, while when false we don't.

This commit removes this special handling, and instead we now apply
escaping in all cases.  This is fine because, thanks to the previous
commit, the escaping will be correctly removed when we call into
nat/fork-inferior.c.

For GDB's native targets construct_inferior_arguments is reached via
two code paths; first when GDB starts and we combine arguments from
the command line, and second when the Python API is used to set the
arguments from a sequence.

Now, it might seem like removing this code path is a bad thing for
native targets.  This path allowed a "neat" trick to work around a
limitation of GDB's command line argument processing.  Consider this:

  $ gdb --args /tmp/exec '$FOO'
  (gdb) show args
  Argument list to give program being debugged when it is started is "\$FOO".

Notice that the argument has become \$FOO, the '$' is now escaped.
This is because, by quoting the argument in the shell command that
started GDB, GDB was passed a literal $FOO with no quotes.  In order
to ensure that the inferior sees this same value, GDB then added the
extra escape character.  When GDB starts the inferior with a shell we
pass \$FOO, which results in the inferior seeing a literal $FOO.
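
The escaping step described above can be sketched in isolation.  This
is a simplified model of the non-MinGW branch of
construct_inferior_arguments, not GDB's actual interface: the name
escape_for_posix_shell and the use of std::vector<std::string> are
assumptions made purely for illustration.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not a GDB function): join ARGS into one string,
// backslash-escaping every character that is special to a typical Unix
// shell, in the same way as the non-MinGW code in the patch.
std::string
escape_for_posix_shell (const std::vector<std::string> &args)
{
  // Same character set the patch uses for Unix shells.
  static const char special[] = "\"!#$&*()\\|[]{}<>?'`~^; \t\n";
  std::string result;

  for (std::size_t i = 0; i < args.size (); ++i)
    {
      if (i > 0)
        result += ' ';

      if (args[i].empty ())
        {
          // An empty argument is written as '' so it does not vanish.
          result += "''";
          continue;
        }

      for (char c : args[i])
        {
          if (c == '\n')
            {
              // A backslash-newline pair disappears in a shell, so the
              // newline is wrapped in single quotes instead.
              result += "'\n'";
            }
          else
            {
              if (std::strchr (special, c) != nullptr)
                result += '\\';
              result += c;
            }
        }
    }

  return result;
}
```

Under this model, passing the single argument $FOO yields \$FOO, which
matches the "show args" output above.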

But what if the user _actually_ wanted the shell GDB uses to start
the inferior to expand $FOO?  From the GDB prompt we can just do:

  (gdb) set args $FOO
  (gdb) show args
  Argument list to give program being debugged when it is started is "$FOO".

And now the inferior will see the shell expanded version of $FOO.
But there's no obvious way to achieve this from the GDB command line,
except with this trick:

  $ gdb -eiex 'set startup-with-shell off' --args /tmp/exec '$FOO'
  (gdb) show args
  Argument list to give program being debugged when it is started is "$FOO".
  (gdb) show startup-with-shell
  Use of shell to start subprocesses is off.

Now the $FOO is not escaped, but GDB is no longer using a shell to
start the inferior.  However, we can extend our command line like this:

  $ gdb -eiex 'set startup-with-shell off' \
        -ex 'set startup-with-shell on' \
        --args /tmp/exec '$FOO'
  (gdb) show args
  Argument list to give program being debugged when it is started is "$FOO".
  (gdb) show startup-with-shell
  Use of shell to start subprocesses is on.

We use an early-initialisation command line option to disable
startup-with-shell; this is done before command line argument
processing.  Then a normal initialisation option turns
startup-with-shell back on after GDB has processed the command line
arguments!

Is this useful?  Yes, absolutely.  Is this a good user experience?
Absolutely not.  And a later patch in this series is going to add a
new command line option to GDB (and gdbserver) that will allow users
to achieve the same result (this trick doesn't work in gdbserver as
there's no early-initialisation there).  So, the fact that I plan to
remove the ability to do this from GDB is not going to be a problem
once this complete series is merged.

Now, for remote targets the impact of this change is greater, and is
only a good thing.  When arguments arrive in gdbserver from GDB we use
construct_inferior_arguments to build the argument string.  After the
previous commit we know that calling nat/fork-inferior.c will always
remove one "level" of escapes; either the shell gdbserver spawns will
remove the escapes, or nat/fork-inferior.c will manually remove one
level of escapes.

What this means is that, if we don't add a level of escapes when
building the arguments in construct_inferior_arguments, then
nat/fork-inferior.c will strip a level of escapes that was never
added, and the inferior will not see the arguments the user supplied.
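
The round trip can be illustrated with a deliberately simplified
model.  Both functions below are hypothetical and are not taken from
GDB: add_escape_level stands in for construct_inferior_arguments, and
strip_escape_level stands in for whatever removes one level of escapes
(a shell, or nat/fork-inferior.c).  Quote and newline handling are
ignored to keep the sketch short.

```cpp
#include <cstring>
#include <string>

// Hypothetical model: add one level of escapes to a single argument by
// placing a backslash before each shell-special character.
std::string
add_escape_level (const std::string &arg)
{
  static const char special[] = "\"!#$&*()\\|[]{}<>?'`~^; \t";
  std::string out;

  for (char c : arg)
    {
      if (std::strchr (special, c) != nullptr)
        out += '\\';
      out += c;
    }

  return out;
}

// Hypothetical model: remove one level of escapes by dropping each
// backslash and keeping the character that follows it.
std::string
strip_escape_level (const std::string &text)
{
  std::string out;

  for (std::size_t i = 0; i < text.size (); ++i)
    {
      // Skip the backslash itself; the protected character is copied.
      if (text[i] == '\\' && i + 1 < text.size ())
        ++i;
      out += text[i];
    }

  return out;
}
```

In this model strip_escape_level (add_escape_level (arg)) returns arg
unchanged, whereas calling strip_escape_level directly on an argument
such as a\b gives ab: the backslash the user wanted is lost, which is
the failure mode described above.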

After this commit a whole set of tests that were added as xfail in the
previous commit are now passing.

A change similar to this one can be found in this series:

  https://inbox.sourceware.org/gdb-patches/20211022071933.3478427-1-m.weghorn@posteo.de/

which I reviewed before writing this patch.  I don't think there's any
one patch in that series that exactly corresponds with this patch
though, so I've listed the author of the original series as co-author
on this patch.

Co-Authored-By: Michael Weghorn <m.weghorn@posteo.de>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28392
---
 gdb/testsuite/gdb.base/args.exp               |  12 +-
 gdb/testsuite/gdb.base/inferior-args.exp      |  49 ++++++--
 gdb/testsuite/gdb.base/startup-with-shell.exp |  37 ++----
 gdbsupport/common-inferior.cc                 | 108 +++++++-----------
 4 files changed, 91 insertions(+), 115 deletions(-)
  

Comments

Keith Seitz Jan. 21, 2024, 3:56 a.m. UTC | #1
On 1/9/24 06:26, Andrew Burgess wrote:
> diff --git a/gdbsupport/common-inferior.cc b/gdbsupport/common-inferior.cc
> index 55149ec1f13..076ddc73d51 100644
> --- a/gdbsupport/common-inferior.cc
> +++ b/gdbsupport/common-inferior.cc
> @@ -32,92 +32,66 @@ construct_inferior_arguments (gdb::array_view<char * const> argv)
>   {
>     std::string result;
>   
> -  if (startup_with_shell)
> -    {
>   #ifdef __MINGW32__
> -      /* This holds all the characters considered special to the
> -	 Windows shells.  */
> -      static const char special[] = "\"!&*|[]{}<>?`~^=;, \t\n";
> -      static const char quote = '"';
> +  /* This holds all the characters considered special to the
> +     Windows shells.  */
> +  static const char special[] = "\"!&*|[]{}<>?`~^=;, \t\n";
> +  static const char quote = '"';
>   #else
> -      /* This holds all the characters considered special to the
> -	 typical Unix shells.  We include `^' because the SunOS
> -	 /bin/sh treats it as a synonym for `|'.  */
> -      static const char special[] = "\"!#$&*()\\|[]{}<>?'`~^; \t\n";
> -      static const char quote = '\'';
> +  /* This holds all the characters considered special to the
> +     typical Unix shells.  We include `^' because the SunOS
> +     /bin/sh treats it as a synonym for `|'.  */
> +  static const char special[] = "\"!#$&*()\\|[]{}<>?'`~^; \t\n";
> +  static const char quote = '\'';
>   #endif
> -      for (int i = 0; i < argv.size (); ++i)
> +  for (int i = 0; i < argv.size (); ++i)
> +    {
> +      if (i > 0)
> +	result += ' ';
> +
> +      /* Need to handle empty arguments specially.  */
> +      if (argv[i][0] == '\0')
>   	{
> -	  if (i > 0)
> -	    result += ' ';
> +	  result += quote;
> +	  result += quote;
> +	}
> +      else
> +	{
> +#ifdef __MINGW32__
> +	  bool quoted = false;
>   
> -	  /* Need to handle empty arguments specially.  */
> -	  if (argv[i][0] == '\0')
> +	  if (strpbrk (argv[i], special))
>   	    {
> -	      result += quote;
> +	      quoted = true;
>   	      result += quote;
>   	    }
> -	  else
> +#endif
> +	  for (char *cp = argv[i]; *cp; ++cp)

Re: "*cp": Didn't we move to explicit checks some time ago?

Keith

>   	    {
> -#ifdef __MINGW32__
> -	      bool quoted = false;
> -
> -	      if (strpbrk (argv[i], special))
> +	      if (*cp == '\n')
>   		{
> -		  quoted = true;
> +		  /* A newline cannot be quoted with a backslash (it
> +		     just disappears), only by putting it inside
> +		     quotes.  */
> +		  result += quote;
> +		  result += '\n';
>   		  result += quote;
>   		}
> -#endif
> -	      for (char *cp = argv[i]; *cp; ++cp)
> +	      else
>   		{
> -		  if (*cp == '\n')
> -		    {
> -		      /* A newline cannot be quoted with a backslash (it
> -			 just disappears), only by putting it inside
> -			 quotes.  */
> -		      result += quote;
> -		      result += '\n';
> -		      result += quote;
> -		    }
> -		  else
> -		    {
>   #ifdef __MINGW32__
> -		      if (*cp == quote)
> +		  if (*cp == quote)
>   #else
> -		      if (strchr (special, *cp) != NULL)
> +		    if (strchr (special, *cp) != NULL)
>   #endif
> -			result += '\\';
> -		      result += *cp;
> -		    }
> +		      result += '\\';
> +		  result += *cp;
>   		}
> +	    }
>   #ifdef __MINGW32__
> -	      if (quoted)
> -		result += quote;
> +	  if (quoted)
> +	    result += quote;
>   #endif
> -	    }
> -	}
> -    }
> -  else
> -    {
> -      /* In this case we can't handle arguments that contain spaces,
> -	 tabs, or newlines -- see breakup_args().  */
> -      for (char *arg : argv)
> -	{
> -	  char *cp = strchr (arg, ' ');
> -	  if (cp == NULL)
> -	    cp = strchr (arg, '\t');
> -	  if (cp == NULL)
> -	    cp = strchr (arg, '\n');
> -	  if (cp != NULL)
> -	    error (_("can't handle command-line "
> -		     "argument containing whitespace"));
> -	}
> -
> -      for (int i = 0; i < argv.size (); ++i)
> -	{
> -	  if (i > 0)
> -	    result += " ";
> -	  result += argv[i];
>   	}
>       }
>
  

Patch

diff --git a/gdb/testsuite/gdb.base/args.exp b/gdb/testsuite/gdb.base/args.exp
index f97f1089d69..8b0047999bf 100644
--- a/gdb/testsuite/gdb.base/args.exp
+++ b/gdb/testsuite/gdb.base/args.exp
@@ -29,16 +29,6 @@  if {[build_executable $testfile.exp $testfile $srcfile] == -1} {
     return -1
 }
 
-set startup_with_shell_modes { "on" }
-if {!([target_info gdb_protocol] == "remote"
-      || [target_info gdb_protocol] == "extended-remote")} {
-    lappend startup_with_shell_modes "off"
-} else {
-    # Some of these tests will not work when using the remote protocol
-    # due to bug PR gdb/28392.
-    unsupported "gdbserver 'startup-with-shell off' broken PR gdb/28392"
-}
-
 # NAME is the name to use for the tests and ARGLIST is the list of
 # arguments that are passed to GDB when it is started.
 #
@@ -56,7 +46,7 @@  proc args_test { name arglist {re_list {}} } {
 	set re_list $arglist
     }
 
-    foreach_with_prefix startup_with_shell $::startup_with_shell_modes {
+    foreach_with_prefix startup_with_shell { on off } {
 	save_vars { ::GDBFLAGS } {
 	    set ::GDBFLAGS "$::GDBFLAGS --args $::binfile $arglist"
 
diff --git a/gdb/testsuite/gdb.base/inferior-args.exp b/gdb/testsuite/gdb.base/inferior-args.exp
index bffbcf1862d..4b51b657326 100644
--- a/gdb/testsuite/gdb.base/inferior-args.exp
+++ b/gdb/testsuite/gdb.base/inferior-args.exp
@@ -174,23 +174,48 @@  set bs "\\\\"
 lappend item [list "$hex \"$bs\"\"" "$hex \"$bs$bs$bs\"\""]
 lappend test_desc_list $item
 
-set startup_with_shell_modes { "on" }
-if {!([target_info gdb_protocol] == "remote"
-       || [target_info gdb_protocol] == "extended-remote")} {
-    lappend startup_with_shell_modes "off"
-} else {
-    # Due to PR gdb/28392 gdbserver doesn't currently support having
-    # startup-with-shell off, and then attempting to pass arguments
-    # containing whitespace.
-    unsupported "bug gdb/28392: gdbserver doesn't support this"
-}
-
+# test three
+# ----------
+#
+# This test focuses on sending special shell characters within a
+# double quote argument, and each special character is prefixed with a
+# backslash.
+#
+# In a POSIX shell, within a double quoted argument, only $ (dollar),
+# ` (backtick), " (double quote), \ (backslash), and newline can be
+# escaped.  All other backslash characters are literal backslashes.
+#
+# As with the previous test, the double quotes are lost when the
+# arguments are sent through gdbserver_start, as such, this test isn't
+# going to work when using the native-gdbserver board, hence we set
+# the second argument to 'false'.
+lappend test_desc_list [list "test three" \
+			    false \
+			    { "\&" "\<" "\#" "\^" "\>" "\$" "\`" } \
+			    [list "$hex \"\\\\\\\\&\"" \
+				 "$hex \"\\\\\\\\<\"" \
+				 "$hex \"\\\\\\\\#\"" \
+				 "$hex \"\\\\\\\\\\^\"" \
+				 "$hex \"\\\\\\\\>\"" \
+				 "$hex \"\\\$\"" \
+				 "$hex \"`\""]]
+
+# test four
+# ---------
+#
+# This test passes two arguments, a single and double quote, each
+# escaped with a backslash.
+lappend test_desc_list [list "test four" \
+			    true \
+			    { \' \" } \
+			    [list "$hex \"'\"" \
+				 "$hex \"\\\\\"\""]]
 
 foreach desc $test_desc_list {
     lassign $desc name stub_suitable args re_list
     with_test_prefix $name {
 	foreach_with_prefix set_method { "start" "starti" "run" "set args" } {
-	    foreach_with_prefix startup_with_shell $startup_with_shell_modes {
+	    foreach_with_prefix startup_with_shell { on off } {
 		do_test $set_method $startup_with_shell $args $re_list \
 		    $stub_suitable
 	    }
diff --git a/gdb/testsuite/gdb.base/startup-with-shell.exp b/gdb/testsuite/gdb.base/startup-with-shell.exp
index 62bb5c9c882..0424b20de3a 100644
--- a/gdb/testsuite/gdb.base/startup-with-shell.exp
+++ b/gdb/testsuite/gdb.base/startup-with-shell.exp
@@ -58,12 +58,8 @@  proc initial_setup_simple { startup_with_shell run_args } {
 # If PROBLEMATIC_ON is true then when startup-with-shell is on we
 # expect the comparison to fail, so setup an xfail.
 #
-# If PROBLEMATIC_OFF is true then when startup-with-shell is off we
-# expect the comparison to fail, so setup an xfail.
-#
 # TESTNAME is a string used in the test names.
-proc run_test { args on_re off_re testname { problematic_on false } \
-		    { problematic_off false } } {
+proc run_test { args on_re off_re testname { problematic_on false } } {
     foreach startup_with_shell { "on" "off" } {
 	with_test_prefix "$testname, startup_with_shell: ${startup_with_shell}" {
 	    if {![initial_setup_simple $startup_with_shell $args]} {
@@ -75,7 +71,7 @@  proc run_test { args on_re off_re testname { problematic_on false } \
 		set problematic $problematic_on
 	    } else {
 		set re $off_re
-		set problematic $problematic_off
+		set problematic false
 	    }
 
 	    if { $problematic } {
@@ -90,9 +86,8 @@  proc run_test { args on_re off_re testname { problematic_on false } \
 # This is like the run_test proc except that RE is used as the
 # expected argument regexp when startup-with-shell is both on and off.
 # For the other arguments, see run_test.
-proc run_test_same { args re testname { problematic_on false } \
-			 { problematic_off false } } {
-    run_test $args $re $re $testname $problematic_on $problematic_off
+proc run_test_same { args re testname } {
+    run_test $args $re $re $testname
 }
 
 # The regexp to match a single '\' character.
@@ -131,13 +126,11 @@  save_vars { env(TEST) } {
 
 run_test_same "\"\\a\"" \
     "\"${bs}${bs}a\"" \
-    "retain backslash in double quote arg" \
-    false $is_remote_p
+    "retain backslash in double quote arg"
 
 run_test_same "'\\a'" \
     "\"${bs}${bs}a\"" \
-    "retain backslash in single quote arg" \
-    false $is_remote_p
+    "retain backslash in single quote arg"
 
 run_test_same "\"\\\$\"" \
     "\"\\\$\"" \
@@ -145,8 +138,7 @@  run_test_same "\"\\\$\"" \
 
 run_test_same "'\\\$'" \
     "\"${bs}${bs}\\\$\"" \
-    "'\$' is not escaped in single quote arg" \
-    false $is_remote_p
+    "'\$' is not escaped in single quote arg"
 
 run_test_same "\"\\`\"" \
     "\"\\`\"" \
@@ -154,25 +146,20 @@  run_test_same "\"\\`\"" \
 
 run_test_same "'\\`'" \
     "\"${bs}${bs}`\"" \
-    "'`' is not escaped in single quote arg" \
-    false $is_remote_p
+    "'`' is not escaped in single quote arg"
 
 run_test_same "\"\\\"\"" \
     "\"${bs}\"\"" \
-    "'\"' can be escaped in double quote arg" \
-    false $is_remote_p
+    "'\"' can be escaped in double quote arg"
 
 run_test_same "'\\\"'" \
     "\"${bs}${bs}${bs}\"\"" \
-    "'\"' is not escaped in single quote arg" \
-    false $is_remote_p
+    "'\"' is not escaped in single quote arg"
 
 run_test_same "\"\\\\\"" \
     "\"${bs}${bs}\"" \
-    "'\\' can be escaped in double quote arg" \
-    false $is_remote_p
+    "'\\' can be escaped in double quote arg"
 
 run_test_same "'\\\\'" \
     "\"${bs}${bs}${bs}${bs}\"" \
-    "'\\' is not escaped in single quote arg" \
-    false $is_remote_p
+    "'\\' is not escaped in single quote arg"
diff --git a/gdbsupport/common-inferior.cc b/gdbsupport/common-inferior.cc
index 55149ec1f13..076ddc73d51 100644
--- a/gdbsupport/common-inferior.cc
+++ b/gdbsupport/common-inferior.cc
@@ -32,92 +32,66 @@  construct_inferior_arguments (gdb::array_view<char * const> argv)
 {
   std::string result;
 
-  if (startup_with_shell)
-    {
 #ifdef __MINGW32__
-      /* This holds all the characters considered special to the
-	 Windows shells.  */
-      static const char special[] = "\"!&*|[]{}<>?`~^=;, \t\n";
-      static const char quote = '"';
+  /* This holds all the characters considered special to the
+     Windows shells.  */
+  static const char special[] = "\"!&*|[]{}<>?`~^=;, \t\n";
+  static const char quote = '"';
 #else
-      /* This holds all the characters considered special to the
-	 typical Unix shells.  We include `^' because the SunOS
-	 /bin/sh treats it as a synonym for `|'.  */
-      static const char special[] = "\"!#$&*()\\|[]{}<>?'`~^; \t\n";
-      static const char quote = '\'';
+  /* This holds all the characters considered special to the
+     typical Unix shells.  We include `^' because the SunOS
+     /bin/sh treats it as a synonym for `|'.  */
+  static const char special[] = "\"!#$&*()\\|[]{}<>?'`~^; \t\n";
+  static const char quote = '\'';
 #endif
-      for (int i = 0; i < argv.size (); ++i)
+  for (int i = 0; i < argv.size (); ++i)
+    {
+      if (i > 0)
+	result += ' ';
+
+      /* Need to handle empty arguments specially.  */
+      if (argv[i][0] == '\0')
 	{
-	  if (i > 0)
-	    result += ' ';
+	  result += quote;
+	  result += quote;
+	}
+      else
+	{
+#ifdef __MINGW32__
+	  bool quoted = false;
 
-	  /* Need to handle empty arguments specially.  */
-	  if (argv[i][0] == '\0')
+	  if (strpbrk (argv[i], special))
 	    {
-	      result += quote;
+	      quoted = true;
 	      result += quote;
 	    }
-	  else
+#endif
+	  for (char *cp = argv[i]; *cp; ++cp)
 	    {
-#ifdef __MINGW32__
-	      bool quoted = false;
-
-	      if (strpbrk (argv[i], special))
+	      if (*cp == '\n')
 		{
-		  quoted = true;
+		  /* A newline cannot be quoted with a backslash (it
+		     just disappears), only by putting it inside
+		     quotes.  */
+		  result += quote;
+		  result += '\n';
 		  result += quote;
 		}
-#endif
-	      for (char *cp = argv[i]; *cp; ++cp)
+	      else
 		{
-		  if (*cp == '\n')
-		    {
-		      /* A newline cannot be quoted with a backslash (it
-			 just disappears), only by putting it inside
-			 quotes.  */
-		      result += quote;
-		      result += '\n';
-		      result += quote;
-		    }
-		  else
-		    {
 #ifdef __MINGW32__
-		      if (*cp == quote)
+		  if (*cp == quote)
 #else
-		      if (strchr (special, *cp) != NULL)
+		    if (strchr (special, *cp) != NULL)
 #endif
-			result += '\\';
-		      result += *cp;
-		    }
+		      result += '\\';
+		  result += *cp;
 		}
+	    }
 #ifdef __MINGW32__
-	      if (quoted)
-		result += quote;
+	  if (quoted)
+	    result += quote;
 #endif
-	    }
-	}
-    }
-  else
-    {
-      /* In this case we can't handle arguments that contain spaces,
-	 tabs, or newlines -- see breakup_args().  */
-      for (char *arg : argv)
-	{
-	  char *cp = strchr (arg, ' ');
-	  if (cp == NULL)
-	    cp = strchr (arg, '\t');
-	  if (cp == NULL)
-	    cp = strchr (arg, '\n');
-	  if (cp != NULL)
-	    error (_("can't handle command-line "
-		     "argument containing whitespace"));
-	}
-
-      for (int i = 0; i < argv.size (); ++i)
-	{
-	  if (i > 0)
-	    result += " ";
-	  result += argv[i];
 	}
     }