fork-child.c: Avoid unnecessary heap-allocation / string copying (Re: [PATCH v5 3/5] C++-fy and prepare for sharing fork_inferior)

Message ID 01d7a597-ac60-9764-142b-bcaefd269271@redhat.com
State New, archived
Headers

Commit Message

Pedro Alves April 13, 2017, 3:42 a.m. UTC
  On 04/12/2017 11:26 PM, Sergio Durigan Junior wrote:
> On Wednesday, April 12 2017, Pedro Alves wrote:
> 
>> On 04/12/2017 06:04 AM, Sergio Durigan Junior wrote:
>>
>>>>>>  static void
>>>>>> -breakup_args (char *scratch, char **argv)
>>>>>> +breakup_args (const std::string &scratch, std::vector<char *> &argv)
>>>>>>  {
>>>>>
>>>>> ...
>>>>>
>>>>>> +
>>>>>> +      std::string arg = scratch.substr (cur_pos, next_sep - cur_pos);
>>>>>> +
>>>>>
>>>>> This creates a temporary string (heap allocates) ...
>>>>>
>>>>>> +      argv.push_back (xstrdup (arg.c_str ()));
>>>>>
>>>>> ... and here you create yet another copy.
>>>>>
>>>>> You should be able to avoid it by using e.g., savestring:
>>>>>
>>>>>     char *arg = savestring (scratch.c_str () + cur_pos, next_sep - cur_pos);
>>>>>     argv.push_back (arg);
>>>>
>>>> Fair enough.  I had my mind on "C++-only mode" when writing this code.
>>
>> Yup, in C++, it's good to keep unnecessary temporaries
>> and hidden heap allocations in mind.  Actually, now that I look a bit
>> deeper, I think we can avoid a "premature pessimization" here, keeping
>> the same level of clarity.
>>
>> I think it'd be good to push this patch below.  WDYT?
> 
> Hm, I don't know.  I mean, I understand where you're coming from and
> what you're trying to achieve, but somehow I thought the old code
> (especially the part that modified the argument string in place, using
> \0's as separators) was somewhat hacky.  Even though the new code uses
> more memory, it it also more readable, at least IMNSHO.  And also more
> const-correct, since we can make breakup_args_for_exec receive a const
> as the first argument.

There's nothing wrong with writing to a string in place.  A std::string
is just a _container_ of chars with convenience methods.  It's
been designed to support embedded \0s.  And it's not a problem for a
function to take a copy of a value if it needs to modify it!
const-correct refers to when functions that _don't need to
modify an argument nevertheless declare non-const reference/pointer.

Effectively you can't avoid _some_ creating _some_ new string(s). 
But instead of taking one single copy (O(1)), the current code is making
many (N) tiny copies.  Trips to the generic allocator / malloc should
be one of the first things to avoid if easy to avoid, as a principle.

The patch I had posted is exactly equivalent to the one below, except
the state/storage in this one is moved to a class.  Maybe you'd
find this clearer?  (It'd need some cleaning up for comments
further at least, but it's getting quite late here...)

From 689a881839f9d4aeb36695676856db2eaf8e97ab Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Thu, 13 Apr 2017 04:13:22 +0100
Subject: [PATCH] fork-child.c: Avoid unnecessary heap-allocation / string
 copying

The previous patch converted the argv building from an
alloca-allocated array of non-owning arg pointers, to a std::vector of
owning pointers, which results in N string dups, with N being the
number of arguments in the vector, and then requires manually
releasing the pointers owned by the vector.

This patch makes the vector hold non-owning pointers, and avoids the
string dups, by doing one single string copy of the arguments upfront,
and replacing separators with NULL terminators in place, like we used
to.

With this, there's no need to remember to call free_vector_argv either.

gdb/ChangeLog:
yyyy-mm-dd  Pedro Alves  <palves@redhat.com>

	* fork-child.c (breakup_args): Make 'scratch' non-const.
	Replace separators with NULL terminators in place.  Change
	type of vector.  (fork_inferior): The argument vector now
	holds non-owning pointers.  Don't strdup strings into the
	vector.  Remove free_vector_argv call.
---
 gdb/fork-child.c | 278 ++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 172 insertions(+), 106 deletions(-)
  

Comments

Sergio Durigan Junior April 13, 2017, 4:31 a.m. UTC | #1
On Wednesday, April 12 2017, Pedro Alves wrote:

> On 04/12/2017 11:26 PM, Sergio Durigan Junior wrote:
>> On Wednesday, April 12 2017, Pedro Alves wrote:
>> 
>>> On 04/12/2017 06:04 AM, Sergio Durigan Junior wrote:
>>>
>>>>>>>  static void
>>>>>>> -breakup_args (char *scratch, char **argv)
>>>>>>> +breakup_args (const std::string &scratch, std::vector<char *> &argv)
>>>>>>>  {
>>>>>>
>>>>>> ...
>>>>>>
>>>>>>> +
>>>>>>> +      std::string arg = scratch.substr (cur_pos, next_sep - cur_pos);
>>>>>>> +
>>>>>>
>>>>>> This creates a temporary string (heap allocates) ...
>>>>>>
>>>>>>> +      argv.push_back (xstrdup (arg.c_str ()));
>>>>>>
>>>>>> ... and here you create yet another copy.
>>>>>>
>>>>>> You should be able to avoid it by using e.g., savestring:
>>>>>>
>>>>>>     char *arg = savestring (scratch.c_str () + cur_pos, next_sep - cur_pos);
>>>>>>     argv.push_back (arg);
>>>>>
>>>>> Fair enough.  I had my mind on "C++-only mode" when writing this code.
>>>
>>> Yup, in C++, it's good to keep unnecessary temporaries
>>> and hidden heap allocations in mind.  Actually, now that I look a bit
>>> deeper, I think we can avoid a "premature pessimization" here, keeping
>>> the same level of clarity.
>>>
>>> I think it'd be good to push this patch below.  WDYT?
>> 
>> Hm, I don't know.  I mean, I understand where you're coming from and
>> what you're trying to achieve, but somehow I thought the old code
>> (especially the part that modified the argument string in place, using
>> \0's as separators) was somewhat hacky.  Even though the new code uses
>> more memory, it it also more readable, at least IMNSHO.  And also more
>> const-correct, since we can make breakup_args_for_exec receive a const
>> as the first argument.
>
> There's nothing wrong with writing to a string in place.  A std::string
> is just a _container_ of chars with convenience methods.  It's
> been designed to support embedded \0s.  And it's not a problem for a
> function to take a copy of a value if it needs to modify it!
> const-correct refers to when functions that _don't need to
> modify an argument nevertheless declare non-const reference/pointer.

Sorry, maybe I should have made myself clearer.  I did not mean to say
that there was anything actually wrong with modifying a string in-place;
I just meant that it seems somewhat hacky and more error-prone to me
than separate the string into substrings.

I totally understand that a std::string is just a container, and that it
is OK to modify it if needed.  But we're not talking about a generic
string manipulation here; we're talking about inserting \0's on the
string in order to split it.  Anyway, I totally understand this is valid
code.

About the const-correctness comment: you're right, I should have said
that the original function is better IMHO because it doesn't modify its
arguments.

> Effectively you can't avoid _some_ creating _some_ new string(s). 
> But instead of taking one single copy (O(1)), the current code is making
> many (N) tiny copies.  Trips to the generic allocator / malloc should
> be one of the first things to avoid if easy to avoid, as a principle.

Yep, I am aware that the code is more efficient using less strings, but
I thought "OK, how many args will we be dealing with here, usually?",
and I decided that it wasn't worth.

> The patch I had posted is exactly equivalent to the one below, except
> the state/storage in this one is moved to a class.  Maybe you'd
> find this clearer?  (It'd need some cleaning up for comments
> further at least, but it's getting quite late here...)

I liked your patch.  There are some things I'd like to comment, but
overall it is good and I think it does a nice job of separating things
on fork_inferior.  As I said before, I won't really make a fuss about
this approach if you think it is the right way to go; my comments were
solely based on my personal taste.

> From 689a881839f9d4aeb36695676856db2eaf8e97ab Mon Sep 17 00:00:00 2001
> From: Pedro Alves <palves@redhat.com>
> Date: Thu, 13 Apr 2017 04:13:22 +0100
> Subject: [PATCH] fork-child.c: Avoid unnecessary heap-allocation / string
>  copying
>
> The previous patch converted the argv building from an
> alloca-allocated array of non-owning arg pointers, to a std::vector of
> owning pointers, which results in N string dups, with N being the
> number of arguments in the vector, and then requires manually
> releasing the pointers owned by the vector.
>
> This patch makes the vector hold non-owning pointers, and avoids the
> string dups, by doing one single string copy of the arguments upfront,
> and replacing separators with NULL terminators in place, like we used
> to.
>
> With this, there's no need to remember to call free_vector_argv either.
>
> gdb/ChangeLog:
> yyyy-mm-dd  Pedro Alves  <palves@redhat.com>
>
> 	* fork-child.c (breakup_args): Make 'scratch' non-const.
> 	Replace separators with NULL terminators in place.  Change
> 	type of vector.  (fork_inferior): The argument vector now
> 	holds non-owning pointers.  Don't strdup strings into the
> 	vector.  Remove free_vector_argv call.
> ---
>  gdb/fork-child.c | 278 ++++++++++++++++++++++++++++++++++---------------------
>  1 file changed, 172 insertions(+), 106 deletions(-)
>
> diff --git a/gdb/fork-child.c b/gdb/fork-child.c
> index 6b7386e..91986f7 100644
> --- a/gdb/fork-child.c
> +++ b/gdb/fork-child.c
> @@ -43,38 +43,96 @@ extern char **environ;
>  
>  static char *exec_wrapper;
>  
> -/* Break up SCRATCH into an argument vector suitable for passing to
> -   execvp and store it in ARGV.  E.g., on "run a b c d" this routine
> +/* Build the argument vector for execv.  */
> +
> +class execv_argv_builder
> +{
> +public:
> +  /* EXEC_FILE is the file to run.  ALLARGS is a string containing the
> +     arguments to the program.  If starting with a shell, SHELL_FILE
> +     is the shell to run.  Otherwise, SHELL_FILE is NULL.  */
> +  execv_argv_builder (const char *exec_file,
> +		      const std::string &allargs,
> +		      const char *shell_file);
> +
> +  /* Return a pointer to the built argv, in the type expected by
> +     execv.  */
> +  char **argv ()
> +  {
> +    /* It is guaranteed that the exec functions do not modify the
> +       arguments, but they nevertheless expect "char **", so that's
> +       the type we return.  */
> +    return const_cast<char **> (&m_argv[0]);
> +  }
> +
> +private:
> +  void build_no_shell (const char *exec_file,
> +		       const std::string &allargs);
> +
> +  void build_shell (const char *exec_file,
> +		    const std::string &allargs,
> +		    const char *shell_file);

Missing comments on the functions.

> +
> +  /* The argument vector built.  Holds non-owning pointers.  */
> +  std::vector<const char *> m_argv;

OOC, I notice you're prefixing a lot (if not all) of your class
variables with "m_".  I understand this is a code convention to signal
when a variable is a member of a class, but are we following this
officially?  Because (again!) I'm not really fond of this :-).

> +  /* Storage.  Arguments in C_ARGV point inside these.  */
> +
> +  /* For the !shell case.  */
> +  std::string m_broke_up_args;

If we're going this way, then I'd like to see this comment expanded to
explain how the args are split, i.e., using \0 as the delimiter.

> +
> +  /* For the shell case.  */
> +  std::string m_shell_command;

> +};
> +
> +/* Break up allargs into an argument vector suitable for passing to
> +   execvp and store it in M_ARGV.  E.g., on "run a b c d" this routine
>     would get as input the string "a b c d", and as output it would
> -   fill in ARGV with the four arguments "a", "b", "c", "d".  */
> +   fill in M_ARGV with the four arguments "a", "b", "c", "d".  */
>  
> -static void
> -breakup_args (const std::string &scratch, std::vector<char *> &argv)
> +void
> +execv_argv_builder::build_no_shell (const char *exec_file,
> +				    const std::string &allargs)
>  {
> -  for (size_t cur_pos = 0; cur_pos < scratch.size ();)
> +  /* We're going to call execvp.  Create argument vector.  */
> +
> +  /* Save/work with a copy.  The pointers pushed to M_ARGV point
> +     directly into M_BROKE_UP_ARGS, which is modified in place with
> +     the necessary NULL terminators.  */
> +  m_broke_up_args = allargs;
> +
> +  m_argv.push_back (exec_file);
> +
> +  for (size_t cur_pos = 0; cur_pos < m_broke_up_args.size ();)
>      {
>        /* Skip whitespace-like chars.  */
> -      std::size_t pos = scratch.find_first_not_of (" \t\n", cur_pos);
> +      std::size_t pos = m_broke_up_args.find_first_not_of (" \t\n", cur_pos);
>  
>        if (pos != std::string::npos)
>  	cur_pos = pos;
>  
>        /* Find the position of the next separator.  */
> -      std::size_t next_sep = scratch.find_first_of (" \t\n", cur_pos);
> +      std::size_t next_sep = m_broke_up_args.find_first_of (" \t\n", cur_pos);
>  
> -      /* No separator found, which means this is the last
> -	 argument.  */
>        if (next_sep == std::string::npos)
> -	next_sep = scratch.size ();
> +	{
> +	  /* No separator found, which means this is the last
> +	     argument.  */
> +	  next_sep = m_broke_up_args.size ();
> +	}
> +      else
> +	{
> +	  /* Replace the separator with a terminator.  */
> +	  m_broke_up_args[next_sep++] = '\0';
> +	}
>  
> -      char *arg = savestring (scratch.c_str () + cur_pos, next_sep - cur_pos);
> -      argv.push_back (arg);
> +      m_argv.push_back (&m_broke_up_args[cur_pos]);
>  
>        cur_pos = next_sep;
>      }
>  
>    /* NULL-terminate the vector.  */
> -  argv.push_back (NULL);
> +  m_argv.push_back (NULL);
>  }
>  
>  /* When executing a command under the given shell, return non-zero if
> @@ -101,6 +159,101 @@ escape_bang_in_quoted_argument (const char *shell_file)
>    return 0;
>  }

Missing a comment here explaining the function below (or pointing to the
right place).

> +execv_argv_builder::execv_argv_builder (const char *exec_file,
> +					const std::string &allargs,
> +					const char *shell_file)
> +{
> +  if (shell_file == NULL)
> +    build_no_shell (exec_file, allargs);
> +  else
> +    build_shell (exec_file, allargs, shell_file);
> +}

And here.

> +
> +void
> +execv_argv_builder::build_shell (const char *exec_file,
> +				 const std::string &allargs,
> +				 const char *shell_file)
> +{
> +  /* We're going to call a shell.  */
> +  const char *p;
> +  int need_to_quote;
> +  const int escape_bang = escape_bang_in_quoted_argument (shell_file);
> +
> +  m_shell_command = "exec ";
> +
> +  /* Add any exec wrapper.  That may be a program name with arguments,
> +     so the user must handle quoting.  */
> +  if (exec_wrapper)
> +    {
> +      m_shell_command += exec_wrapper;
> +      m_shell_command += ' ';
> +    }
> +
> +  /* Now add exec_file, quoting as necessary.  */
> +
> +  /* Quoting in this style is said to work with all shells.  But csh
> +     on IRIX 4.0.1 can't deal with it.  So we only quote it if we need
> +     to.  */
> +  p = exec_file;
> +  while (1)
> +    {
> +      switch (*p)
> +	{
> +	case '\'':
> +	case '!':
> +	case '"':
> +	case '(':
> +	case ')':
> +	case '$':
> +	case '&':
> +	case ';':
> +	case '<':
> +	case '>':
> +	case ' ':
> +	case '\n':
> +	case '\t':
> +	  need_to_quote = 1;
> +	  goto end_scan;
> +
> +	case '\0':
> +	  need_to_quote = 0;
> +	  goto end_scan;
> +
> +	default:
> +	  break;
> +	}
> +      ++p;
> +    }
> + end_scan:
> +  if (need_to_quote)
> +    {
> +      m_shell_command += '\'';
> +      for (p = exec_file; *p != '\0'; ++p)
> +	{
> +	  if (*p == '\'')
> +	    m_shell_command += "'\\''";
> +	  else if (*p == '!' && escape_bang)
> +	    m_shell_command += "\\!";
> +	  else
> +	    m_shell_command += *p;
> +	}
> +      m_shell_command += '\'';
> +    }
> +  else
> +    m_shell_command += exec_file;
> +
> +  m_shell_command += " " + allargs;
> +
> +  /* If we decided above to start up with a shell, we exec the shell,
> +     "-c" says to interpret the next arg as a shell command to
> +     execute, and this command is "exec <target-program> <args>".  */
> +  m_argv.reserve (4);
> +  m_argv.push_back (shell_file);
> +  m_argv.push_back ("-c");
> +  m_argv.push_back (m_shell_command.c_str ());
> +  m_argv.push_back (NULL);
> +}
> +
>  /* See inferior.h.  */
>  
>  void
> @@ -155,7 +308,6 @@ fork_inferior (const char *exec_file_arg, const std::string &allargs,
>    static const char *exec_file;
>    char **save_our_env;
>    int shell = 0;
> -  std::vector<char *> argv;

I believe you can also remove the 'static' keyword from the 'shell_file'
declaration.

>    const char *inferior_io_terminal = get_inferior_io_terminal ();
>    struct inferior *inf;
>    int i;
> @@ -183,95 +335,9 @@ fork_inferior (const char *exec_file_arg, const std::string &allargs,
>        shell = 1;
>      }
>  
> -  if (!shell)
> -    {
> -      /* We're going to call execvp.  Create argument vector.  */
> -      argv.push_back (xstrdup (exec_file));
> -      breakup_args (allargs, argv);
> -    }
> -  else
> -    {
> -      /* We're going to call a shell.  */
> -      std::string shell_command;
> -      const char *p;
> -      int need_to_quote;
> -      const int escape_bang = escape_bang_in_quoted_argument (shell_file);
> -
> -      shell_command = std::string ("exec ");
> -
> -      /* Add any exec wrapper.  That may be a program name with arguments, so
> -	 the user must handle quoting.  */
> -      if (exec_wrapper)
> -	{
> -	  shell_command += exec_wrapper;
> -	  shell_command += ' ';
> -	}
> -
> -      /* Now add exec_file, quoting as necessary.  */
> -
> -      /* Quoting in this style is said to work with all shells.  But
> -         csh on IRIX 4.0.1 can't deal with it.  So we only quote it if
> -         we need to.  */
> -      p = exec_file;
> -      while (1)
> -	{
> -	  switch (*p)
> -	    {
> -	    case '\'':
> -	    case '!':
> -	    case '"':
> -	    case '(':
> -	    case ')':
> -	    case '$':
> -	    case '&':
> -	    case ';':
> -	    case '<':
> -	    case '>':
> -	    case ' ':
> -	    case '\n':
> -	    case '\t':
> -	      need_to_quote = 1;
> -	      goto end_scan;
> -
> -	    case '\0':
> -	      need_to_quote = 0;
> -	      goto end_scan;
> -
> -	    default:
> -	      break;
> -	    }
> -	  ++p;
> -	}
> -    end_scan:
> -      if (need_to_quote)
> -	{
> -	  shell_command += '\'';
> -	  for (p = exec_file; *p != '\0'; ++p)
> -	    {
> -	      if (*p == '\'')
> -		shell_command += "'\\''";
> -	      else if (*p == '!' && escape_bang)
> -		shell_command += "\\!";
> -	      else
> -		shell_command += *p;
> -	    }
> -	  shell_command += '\'';
> -	}
> -      else
> -	shell_command += exec_file;
> -
> -      shell_command += " " + allargs;
> -
> -      /* If we decided above to start up with a shell, we exec the
> -	 shell, "-c" says to interpret the next arg as a shell command
> -	 to execute, and this command is "exec <target-program>
> -	 <args>".  We xstrdup all the strings here because they will
> -	 be free'd later in the code.  */
> -      argv.push_back (xstrdup (shell_file));
> -      argv.push_back (xstrdup ("-c"));
> -      argv.push_back (xstrdup (shell_command.c_str ()));
> -      argv.push_back (NULL);
> -    }
> +  /* Build the argument vector.  */
> +  execv_argv_builder argv_builder (exec_file, allargs,
> +				   shell ? shell_file : NULL);

Good code cleanup!

BTW, it is possible to simplify this part and remove the "shell"
variable completely, using only "shell_file".

>  
>    /* Retain a copy of our environment variables, since the child will
>       replace the value of environ and if we're vforked, we have to
> @@ -376,6 +442,8 @@ fork_inferior (const char *exec_file_arg, const std::string &allargs,
>           path to find $SHELL.  Rich Pixley says so, and I agree.  */
>        environ = env;
>  
> +      char **argv = argv_builder.argv ();
> +
>        if (exec_fun != NULL)
>          (*exec_fun) (argv[0], &argv[0], env);
>        else
> @@ -393,8 +461,6 @@ fork_inferior (const char *exec_file_arg, const std::string &allargs,
>        _exit (0177);
>      }
>  
> -  free_vector_argv (argv);
> -
>    /* Restore our environment in case a vforked child clob'd it.  */
>    environ = save_our_env;
>  
> -- 
> 2.5.5

Otherwise, LGTM.

Thanks for doing this,
  

Patch

diff --git a/gdb/fork-child.c b/gdb/fork-child.c
index 6b7386e..91986f7 100644
--- a/gdb/fork-child.c
+++ b/gdb/fork-child.c
@@ -43,38 +43,96 @@  extern char **environ;
 
 static char *exec_wrapper;
 
-/* Break up SCRATCH into an argument vector suitable for passing to
-   execvp and store it in ARGV.  E.g., on "run a b c d" this routine
+/* Build the argument vector for execv.  */
+
+class execv_argv_builder
+{
+public:
+  /* EXEC_FILE is the file to run.  ALLARGS is a string containing the
+     arguments to the program.  If starting with a shell, SHELL_FILE
+     is the shell to run.  Otherwise, SHELL_FILE is NULL.  */
+  execv_argv_builder (const char *exec_file,
+		      const std::string &allargs,
+		      const char *shell_file);
+
+  /* Return a pointer to the built argv, in the type expected by
+     execv.  */
+  char **argv ()
+  {
+    /* It is guaranteed that the exec functions do not modify the
+       arguments, but they nevertheless expect "char **", so that's
+       the type we return.  */
+    return const_cast<char **> (&m_argv[0]);
+  }
+
+private:
+  void build_no_shell (const char *exec_file,
+		       const std::string &allargs);
+
+  void build_shell (const char *exec_file,
+		    const std::string &allargs,
+		    const char *shell_file);
+
+  /* The argument vector built.  Holds non-owning pointers.  */
+  std::vector<const char *> m_argv;
+
+  /* Storage.  Arguments in C_ARGV point inside these.  */
+
+  /* For the !shell case.  */
+  std::string m_broke_up_args;
+
+  /* For the shell case.  */
+  std::string m_shell_command;
+};
+
+/* Break up allargs into an argument vector suitable for passing to
+   execvp and store it in M_ARGV.  E.g., on "run a b c d" this routine
    would get as input the string "a b c d", and as output it would
-   fill in ARGV with the four arguments "a", "b", "c", "d".  */
+   fill in M_ARGV with the four arguments "a", "b", "c", "d".  */
 
-static void
-breakup_args (const std::string &scratch, std::vector<char *> &argv)
+void
+execv_argv_builder::build_no_shell (const char *exec_file,
+				    const std::string &allargs)
 {
-  for (size_t cur_pos = 0; cur_pos < scratch.size ();)
+  /* We're going to call execvp.  Create argument vector.  */
+
+  /* Save/work with a copy.  The pointers pushed to M_ARGV point
+     directly into M_BROKE_UP_ARGS, which is modified in place with
+     the necessary NULL terminators.  */
+  m_broke_up_args = allargs;
+
+  m_argv.push_back (exec_file);
+
+  for (size_t cur_pos = 0; cur_pos < m_broke_up_args.size ();)
     {
       /* Skip whitespace-like chars.  */
-      std::size_t pos = scratch.find_first_not_of (" \t\n", cur_pos);
+      std::size_t pos = m_broke_up_args.find_first_not_of (" \t\n", cur_pos);
 
       if (pos != std::string::npos)
 	cur_pos = pos;
 
       /* Find the position of the next separator.  */
-      std::size_t next_sep = scratch.find_first_of (" \t\n", cur_pos);
+      std::size_t next_sep = m_broke_up_args.find_first_of (" \t\n", cur_pos);
 
-      /* No separator found, which means this is the last
-	 argument.  */
       if (next_sep == std::string::npos)
-	next_sep = scratch.size ();
+	{
+	  /* No separator found, which means this is the last
+	     argument.  */
+	  next_sep = m_broke_up_args.size ();
+	}
+      else
+	{
+	  /* Replace the separator with a terminator.  */
+	  m_broke_up_args[next_sep++] = '\0';
+	}
 
-      char *arg = savestring (scratch.c_str () + cur_pos, next_sep - cur_pos);
-      argv.push_back (arg);
+      m_argv.push_back (&m_broke_up_args[cur_pos]);
 
       cur_pos = next_sep;
     }
 
   /* NULL-terminate the vector.  */
-  argv.push_back (NULL);
+  m_argv.push_back (NULL);
 }
 
 /* When executing a command under the given shell, return non-zero if
@@ -101,6 +159,101 @@  escape_bang_in_quoted_argument (const char *shell_file)
   return 0;
 }
 
+execv_argv_builder::execv_argv_builder (const char *exec_file,
+					const std::string &allargs,
+					const char *shell_file)
+{
+  if (shell_file == NULL)
+    build_no_shell (exec_file, allargs);
+  else
+    build_shell (exec_file, allargs, shell_file);
+}
+
+void
+execv_argv_builder::build_shell (const char *exec_file,
+				 const std::string &allargs,
+				 const char *shell_file)
+{
+  /* We're going to call a shell.  */
+  const char *p;
+  int need_to_quote;
+  const int escape_bang = escape_bang_in_quoted_argument (shell_file);
+
+  m_shell_command = "exec ";
+
+  /* Add any exec wrapper.  That may be a program name with arguments,
+     so the user must handle quoting.  */
+  if (exec_wrapper)
+    {
+      m_shell_command += exec_wrapper;
+      m_shell_command += ' ';
+    }
+
+  /* Now add exec_file, quoting as necessary.  */
+
+  /* Quoting in this style is said to work with all shells.  But csh
+     on IRIX 4.0.1 can't deal with it.  So we only quote it if we need
+     to.  */
+  p = exec_file;
+  while (1)
+    {
+      switch (*p)
+	{
+	case '\'':
+	case '!':
+	case '"':
+	case '(':
+	case ')':
+	case '$':
+	case '&':
+	case ';':
+	case '<':
+	case '>':
+	case ' ':
+	case '\n':
+	case '\t':
+	  need_to_quote = 1;
+	  goto end_scan;
+
+	case '\0':
+	  need_to_quote = 0;
+	  goto end_scan;
+
+	default:
+	  break;
+	}
+      ++p;
+    }
+ end_scan:
+  if (need_to_quote)
+    {
+      m_shell_command += '\'';
+      for (p = exec_file; *p != '\0'; ++p)
+	{
+	  if (*p == '\'')
+	    m_shell_command += "'\\''";
+	  else if (*p == '!' && escape_bang)
+	    m_shell_command += "\\!";
+	  else
+	    m_shell_command += *p;
+	}
+      m_shell_command += '\'';
+    }
+  else
+    m_shell_command += exec_file;
+
+  m_shell_command += " " + allargs;
+
+  /* If we decided above to start up with a shell, we exec the shell,
+     "-c" says to interpret the next arg as a shell command to
+     execute, and this command is "exec <target-program> <args>".  */
+  m_argv.reserve (4);
+  m_argv.push_back (shell_file);
+  m_argv.push_back ("-c");
+  m_argv.push_back (m_shell_command.c_str ());
+  m_argv.push_back (NULL);
+}
+
 /* See inferior.h.  */
 
 void
@@ -155,7 +308,6 @@  fork_inferior (const char *exec_file_arg, const std::string &allargs,
   static const char *exec_file;
   char **save_our_env;
   int shell = 0;
-  std::vector<char *> argv;
   const char *inferior_io_terminal = get_inferior_io_terminal ();
   struct inferior *inf;
   int i;
@@ -183,95 +335,9 @@  fork_inferior (const char *exec_file_arg, const std::string &allargs,
       shell = 1;
     }
 
-  if (!shell)
-    {
-      /* We're going to call execvp.  Create argument vector.  */
-      argv.push_back (xstrdup (exec_file));
-      breakup_args (allargs, argv);
-    }
-  else
-    {
-      /* We're going to call a shell.  */
-      std::string shell_command;
-      const char *p;
-      int need_to_quote;
-      const int escape_bang = escape_bang_in_quoted_argument (shell_file);
-
-      shell_command = std::string ("exec ");
-
-      /* Add any exec wrapper.  That may be a program name with arguments, so
-	 the user must handle quoting.  */
-      if (exec_wrapper)
-	{
-	  shell_command += exec_wrapper;
-	  shell_command += ' ';
-	}
-
-      /* Now add exec_file, quoting as necessary.  */
-
-      /* Quoting in this style is said to work with all shells.  But
-         csh on IRIX 4.0.1 can't deal with it.  So we only quote it if
-         we need to.  */
-      p = exec_file;
-      while (1)
-	{
-	  switch (*p)
-	    {
-	    case '\'':
-	    case '!':
-	    case '"':
-	    case '(':
-	    case ')':
-	    case '$':
-	    case '&':
-	    case ';':
-	    case '<':
-	    case '>':
-	    case ' ':
-	    case '\n':
-	    case '\t':
-	      need_to_quote = 1;
-	      goto end_scan;
-
-	    case '\0':
-	      need_to_quote = 0;
-	      goto end_scan;
-
-	    default:
-	      break;
-	    }
-	  ++p;
-	}
-    end_scan:
-      if (need_to_quote)
-	{
-	  shell_command += '\'';
-	  for (p = exec_file; *p != '\0'; ++p)
-	    {
-	      if (*p == '\'')
-		shell_command += "'\\''";
-	      else if (*p == '!' && escape_bang)
-		shell_command += "\\!";
-	      else
-		shell_command += *p;
-	    }
-	  shell_command += '\'';
-	}
-      else
-	shell_command += exec_file;
-
-      shell_command += " " + allargs;
-
-      /* If we decided above to start up with a shell, we exec the
-	 shell, "-c" says to interpret the next arg as a shell command
-	 to execute, and this command is "exec <target-program>
-	 <args>".  We xstrdup all the strings here because they will
-	 be free'd later in the code.  */
-      argv.push_back (xstrdup (shell_file));
-      argv.push_back (xstrdup ("-c"));
-      argv.push_back (xstrdup (shell_command.c_str ()));
-      argv.push_back (NULL);
-    }
+  /* Build the argument vector.  */
+  execv_argv_builder argv_builder (exec_file, allargs,
+				   shell ? shell_file : NULL);
 
   /* Retain a copy of our environment variables, since the child will
      replace the value of environ and if we're vforked, we have to
@@ -376,6 +442,8 @@  fork_inferior (const char *exec_file_arg, const std::string &allargs,
          path to find $SHELL.  Rich Pixley says so, and I agree.  */
       environ = env;
 
+      char **argv = argv_builder.argv ();
+
       if (exec_fun != NULL)
         (*exec_fun) (argv[0], &argv[0], env);
       else
@@ -393,8 +461,6 @@  fork_inferior (const char *exec_file_arg, const std::string &allargs,
       _exit (0177);
     }
 
-  free_vector_argv (argv);
-
   /* Restore our environment in case a vforked child clob'd it.  */
   environ = save_our_env;