[PATCHv2] gdb/debuginfod: cleanup debuginfod earlier

Message ID a6d031964ed8cba24035c3062f2b5f91fd0a6faf.1686231184.git.aburgess@redhat.com
State New
Headers
Series [PATCHv2] gdb/debuginfod: cleanup debuginfod earlier |

Commit Message

Andrew Burgess June 8, 2023, 1:33 p.m. UTC
  In V2:

  - Complete rewrite inline with Simon's suggestion.

--

A GDB crash was discovered on Fedora GDB that was tracked back to an
issue with the way that debuginfod is cleaned up.

The bug was reported on Fedora 37, 38, and 39.  Here are the steps to
reproduce:

1. The file /etc/ssl/openssl.cnf contains the following lines:

   [provider_sect]
   default = default_sect
   ##legacy = legacy_sect
   ##
   [default_sect]
   activate = 1

   ##[legacy_sect]
   ##activate = 1

   The bug will occur when the '##' characters are removed so that the
   lines in question look like this:

   [provider_sect]
   default = default_sect
   legacy = legacy_sect

   [default_sect]
   activate = 1

   [legacy_sect]
   activate = 1

2. Clean up any existing debuginfod cache data:

   > rm -rf $HOME/.cache/debuginfod_client

3. Run GDB:

   > gdb -nx -q -iex 'set trace-commands on' \
                -iex 'set debuginfod enabled on' \
                -iex 'set confirm off' \
		-ex 'start' -ex 'quit' /bin/ls
   +set debuginfod enabled on
   +set confirm off
   Reading symbols from /bin/ls...
   Downloading separate debug info for /usr/bin/ls
   ... snip ...
   Temporary breakpoint 1, main (argc=1, argv=0x7fffffffde38) at ../src/ls.c:1646
   1646    {
   +quit

   Fatal signal: Segmentation fault
   ----- Backtrace -----
   ... snip ...

So GDB ends up crashing during exit.

What's happening is that when debuginfod is initialised
debuginfod_begin is called (this is in the debuginfod library), this
in turn sets up libcurl, which makes use of openssl.  Somewhere during
this setup process an at_exit function is registered to cleanup some
state.

Back in GDB the debuginfod_client object is managed using this code:

  /* Deleter for a debuginfod_client.  */

  struct debuginfod_client_deleter
  {
    void operator() (debuginfod_client *c)
    {
      debuginfod_end (c);
    }
  };

  using debuginfod_client_up
    = std::unique_ptr<debuginfod_client, debuginfod_client_deleter>;

And then a global debuginfod_client_up is created to hold a pointer to
the debuginfod_client object.  As a global this will be cleaned up
using the standard C++ global object destructor mechanism, which is
run after the at_exit handlers.

However, it is expected that when debuginfod_end is called the
debuginfod_client object will still be in a usable state, that is, we
don't expect the at_exit handlers to have run and started cleaning up
the library state.

To fix this issue we need to ensure that debuginfod_end is called
before the at_exit handlers have a chance to run.

This commit removes the debuginfod_client_up type, and instead has GDB
hold a raw pointer to the debuginfod_client object.  We then make use
of GDB's make_final_cleanup to register a function that will call
debuginfod_end.

As GDB's final cleanups are called before exit is called, this means
that debuginfod_end will be called before the at_exit handlers are
called, and the crash identified above is resolved.

It's not obvious how this issue can easily be tested for. The bug does
not appear to manifest when using a local debuginfod server, so we'd
need to setup something more involved.  For now I'm proposing this
patch without any associated tests.

Co-Authored-By: Mark Wielaard <mark@klomp.org>
Co-Authored-By: Simon Marchi <simark@simark.ca>
---
 gdb/debuginfod-support.c | 47 +++++++++++++++++++++++++---------------
 1 file changed, 29 insertions(+), 18 deletions(-)


base-commit: baab375361c365afee2577c94cbbd3fdd443d6da
  

Comments

Mark Wielaard June 8, 2023, 3:09 p.m. UTC | #1
Hi Andrew,

On Thu, 2023-06-08 at 14:33 +0100, Andrew Burgess wrote:
> To fix this issue we need to ensure that debuginfod_end is called
> before the at_exit handlers have a chance to run.
> 
> This commit removes the debuginfod_client_up type, and instead has GDB
> hold a raw pointer to the debuginfod_client object.  We then make use
> of GDB's make_final_cleanup to register a function that will call
> debuginfod_end.
> 
> As GDB's final cleanups are called before exit is called, this means
> that debuginfod_end will be called before the at_exit handlers are
> called, and the crash identified above is resolved.

The nice thing is that the code got simpler.

> It's not obvious how this issue can easily be tested for. The bug does
> not appear to manifest when using a local debuginfod server, so we'd
> need to setup something more involved.  For now I'm proposing this
> patch without any associated tests.

Agreed. This really is a somewhat obscure bug that only triggers for a
legacy openssl backend in libcurl that is shutdown "too early".

> Co-Authored-By: Mark Wielaard <mark@klomp.org>

That is somewhat generous, I don't think any of what I suggested was
left in this new (and better) version :)


Code looks good to me.

Thanks,

Mark
  
Aaron Merey June 8, 2023, 9:08 p.m. UTC | #2
On Thu, Jun 8, 2023 at 9:34 AM Andrew Burgess <aburgess@redhat.com> wrote:
>
> It's not obvious how this issue can easily be tested for. The bug does
> not appear to manifest when using a local debuginfod server, so we'd
> need to setup something more involved.  For now I'm proposing this
> patch without any associated tests.

LGTM. Tested manually on F37.

Thanks,
Aaron
  
Tom Tromey June 9, 2023, 2:24 p.m. UTC | #3
>>>>> "Andrew" == Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:

Andrew> In V2:
Andrew>   - Complete rewrite inline with Simon's suggestion.

Thanks for doing this.  FWIW this looks good to me.
Reviewed-By: Tom Tromey <tom@tromey.com>

Tom
  
Andrew Burgess June 9, 2023, 2:38 p.m. UTC | #4
Andrew Burgess <aburgess@redhat.com> writes:

> In V2:
>
>   - Complete rewrite inline with Simon's suggestion.
>

Thanks for all the reviews.  I've now pushed this patch.

Thanks,
Andrew


> --
>
> A GDB crash was discovered on Fedora GDB that was tracked back to an
> issue with the way that debuginfod is cleaned up.
>
> The bug was reported on Fedora 37, 38, and 39.  Here are the steps to
> reproduce:
>
> 1. The file /etc/ssl/openssl.cnf contains the following lines:
>
>    [provider_sect]
>    default = default_sect
>    ##legacy = legacy_sect
>    ##
>    [default_sect]
>    activate = 1
>
>    ##[legacy_sect]
>    ##activate = 1
>
>    The bug will occur when the '##' characters are removed so that the
>    lines in question look like this:
>
>    [provider_sect]
>    default = default_sect
>    legacy = legacy_sect
>
>    [default_sect]
>    activate = 1
>
>    [legacy_sect]
>    activate = 1
>
> 2. Clean up any existing debuginfod cache data:
>
>    > rm -rf $HOME/.cache/debuginfod_client
>
> 3. Run GDB:
>
>    > gdb -nx -q -iex 'set trace-commands on' \
>                 -iex 'set debuginfod enabled on' \
>                 -iex 'set confirm off' \
> 		-ex 'start' -ex 'quit' /bin/ls
>    +set debuginfod enabled on
>    +set confirm off
>    Reading symbols from /bin/ls...
>    Downloading separate debug info for /usr/bin/ls
>    ... snip ...
>    Temporary breakpoint 1, main (argc=1, argv=0x7fffffffde38) at ../src/ls.c:1646
>    1646    {
>    +quit
>
>    Fatal signal: Segmentation fault
>    ----- Backtrace -----
>    ... snip ...
>
> So GDB ends up crashing during exit.
>
> What's happening is that when debuginfod is initialised
> debuginfod_begin is called (this is in the debuginfod library), this
> in turn sets up libcurl, which makes use of openssl.  Somewhere during
> this setup process an at_exit function is registered to cleanup some
> state.
>
> Back in GDB the debuginfod_client object is managed using this code:
>
>   /* Deleter for a debuginfod_client.  */
>
>   struct debuginfod_client_deleter
>   {
>     void operator() (debuginfod_client *c)
>     {
>       debuginfod_end (c);
>     }
>   };
>
>   using debuginfod_client_up
>     = std::unique_ptr<debuginfod_client, debuginfod_client_deleter>;
>
> And then a global debuginfod_client_up is created to hold a pointer to
> the debuginfod_client object.  As a global this will be cleaned up
> using the standard C++ global object destructor mechanism, which is
> run after the at_exit handlers.
>
> However, it is expected that when debuginfod_end is called the
> debuginfod_client object will still be in a usable state, that is, we
> don't expect the at_exit handlers to have run and started cleaning up
> the library state.
>
> To fix this issue we need to ensure that debuginfod_end is called
> before the at_exit handlers have a chance to run.
>
> This commit removes the debuginfod_client_up type, and instead has GDB
> hold a raw pointer to the debuginfod_client object.  We then make use
> of GDB's make_final_cleanup to register a function that will call
> debuginfod_end.
>
> As GDB's final cleanups are called before exit is called, this means
> that debuginfod_end will be called before the at_exit handlers are
> called, and the crash identified above is resolved.
>
> It's not obvious how this issue can easily be tested for. The bug does
> not appear to manifest when using a local debuginfod server, so we'd
> need to setup something more involved.  For now I'm proposing this
> patch without any associated tests.
>
> Co-Authored-By: Mark Wielaard <mark@klomp.org>
> Co-Authored-By: Simon Marchi <simark@simark.ca>
> ---
>  gdb/debuginfod-support.c | 47 +++++++++++++++++++++++++---------------
>  1 file changed, 29 insertions(+), 18 deletions(-)
>
> diff --git a/gdb/debuginfod-support.c b/gdb/debuginfod-support.c
> index 5853f420a18..a41a4c95785 100644
> --- a/gdb/debuginfod-support.c
> +++ b/gdb/debuginfod-support.c
> @@ -96,20 +96,6 @@ struct user_data
>    ui_out::progress_update progress;
>  };
>  
> -/* Deleter for a debuginfod_client.  */
> -
> -struct debuginfod_client_deleter
> -{
> -  void operator() (debuginfod_client *c)
> -  {
> -    debuginfod_end (c);
> -  }
> -};
> -
> -using debuginfod_client_up
> -  = std::unique_ptr<debuginfod_client, debuginfod_client_deleter>;
> -
> -
>  /* Convert SIZE into a unit suitable for use with progress updates.
>     SIZE should in given in bytes and will be converted into KB, MB, GB
>     or remain unchanged. UNIT will be set to "B", "KB", "MB" or "GB"
> @@ -180,20 +166,45 @@ progressfn (debuginfod_client *c, long cur, long total)
>    return 0;
>  }
>  
> +/* Cleanup ARG, which is a debuginfod_client pointer.  */
> +
> +static void
> +cleanup_debuginfod_client (void *arg)
> +{
> +  debuginfod_client *client = static_cast<debuginfod_client *> (arg);
> +  debuginfod_end (client);
> +}
> +
> +/* Return a pointer to the single global debuginfod_client, initialising it
> +   first if needed.  */
> +
>  static debuginfod_client *
>  get_debuginfod_client ()
>  {
> -  static debuginfod_client_up global_client;
> +  static debuginfod_client *global_client = nullptr;
>  
>    if (global_client == nullptr)
>      {
> -      global_client.reset (debuginfod_begin ());
> +      global_client = debuginfod_begin ();
>  
>        if (global_client != nullptr)
> -	debuginfod_set_progressfn (global_client.get (), progressfn);
> +	{
> +	  /* It is important that we cleanup the debuginfod_client object
> +	     before calling exit.  Some of the libraries used by debuginfod
> +	     make use of at_exit handlers to perform cleanup.
> +
> +	     If we wrapped the debuginfod_client in a unique_ptr and relied
> +	     on its destructor to cleanup then this would be run as part of
> +	     the global C++ object destructors, which is after the at_exit
> +	     handlers, which is too late.
> +
> +	     So instead, we make use of GDB's final cleanup mechanism.  */
> +	  make_final_cleanup (cleanup_debuginfod_client, global_client);
> +	  debuginfod_set_progressfn (global_client, progressfn);
> +	}
>      }
>  
> -  return global_client.get ();
> +  return global_client;
>  }
>  
>  /* Check if debuginfod is enabled.  If configured to do so, ask the user
>
> base-commit: baab375361c365afee2577c94cbbd3fdd443d6da
> -- 
> 2.25.4
  

Patch

diff --git a/gdb/debuginfod-support.c b/gdb/debuginfod-support.c
index 5853f420a18..a41a4c95785 100644
--- a/gdb/debuginfod-support.c
+++ b/gdb/debuginfod-support.c
@@ -96,20 +96,6 @@  struct user_data
   ui_out::progress_update progress;
 };
 
-/* Deleter for a debuginfod_client.  */
-
-struct debuginfod_client_deleter
-{
-  void operator() (debuginfod_client *c)
-  {
-    debuginfod_end (c);
-  }
-};
-
-using debuginfod_client_up
-  = std::unique_ptr<debuginfod_client, debuginfod_client_deleter>;
-
-
 /* Convert SIZE into a unit suitable for use with progress updates.
    SIZE should in given in bytes and will be converted into KB, MB, GB
    or remain unchanged. UNIT will be set to "B", "KB", "MB" or "GB"
@@ -180,20 +166,45 @@  progressfn (debuginfod_client *c, long cur, long total)
   return 0;
 }
 
+/* Cleanup ARG, which is a debuginfod_client pointer.  */
+
+static void
+cleanup_debuginfod_client (void *arg)
+{
+  debuginfod_client *client = static_cast<debuginfod_client *> (arg);
+  debuginfod_end (client);
+}
+
+/* Return a pointer to the single global debuginfod_client, initialising it
+   first if needed.  */
+
 static debuginfod_client *
 get_debuginfod_client ()
 {
-  static debuginfod_client_up global_client;
+  static debuginfod_client *global_client = nullptr;
 
   if (global_client == nullptr)
     {
-      global_client.reset (debuginfod_begin ());
+      global_client = debuginfod_begin ();
 
       if (global_client != nullptr)
-	debuginfod_set_progressfn (global_client.get (), progressfn);
+	{
+	  /* It is important that we cleanup the debuginfod_client object
+	     before calling exit.  Some of the libraries used by debuginfod
+	     make use of at_exit handlers to perform cleanup.
+
+	     If we wrapped the debuginfod_client in a unique_ptr and relied
+	     on its destructor to cleanup then this would be run as part of
+	     the global C++ object destructors, which is after the at_exit
+	     handlers, which is too late.
+
+	     So instead, we make use of GDB's final cleanup mechanism.  */
+	  make_final_cleanup (cleanup_debuginfod_client, global_client);
+	  debuginfod_set_progressfn (global_client, progressfn);
+	}
     }
 
-  return global_client.get ();
+  return global_client;
 }
 
 /* Check if debuginfod is enabled.  If configured to do so, ask the user