[RFC] Support debuginfo and source file fetching via debuginfo server

Message ID CAJDtP-S6DpxEGG=T01h5_O8nbYYme1+ErJhcfJEomHBm2OObKQ@mail.gmail.com
State New, archived
Headers

Commit Message

Aaron Merey Nov. 5, 2019, 1:59 a.m. UTC
  On Mon, Nov 4, 2019 at 11:26 AM Frank Ch. Eigler <fche@redhat.com> wrote:
>
> Hi -
>
> > I haven't followed this thread closely, so it might be an obvious "no", but
> > I was wondering if the library could offer to install a callback that is call
> > relatively often, allowing GDB to interrupt the operation [...]
>
> That could also work, at the cost of extending the API.  Will play with it.
>
> - FChE

Attached is the updated debuginfo and source lookup code, using the
callback idea that Simon suggested. Downloads are now cancelled upon
SIGINT and the callback can be modified with code that prints progress
messages (the a and b parameters in the callback represent the
numerator and denominator of the download's completion fraction).
  

Comments

Simon Marchi Nov. 5, 2019, 5:27 a.m. UTC | #1
On 2019-11-04 8:59 p.m., Aaron Merey wrote:
> On Mon, Nov 4, 2019 at 11:26 AM Frank Ch. Eigler <fche@redhat.com> wrote:
>>
>> Hi -
>>
>>> I haven't followed this thread closely, so it might be an obvious "no", but
>>> I was wondering if the library could offer to install a callback that is call
>>> relatively often, allowing GDB to interrupt the operation [...]
>>
>> That could also work, at the cost of extending the API.  Will play with it.
>>
>> - FChE
> 
> Attached is the updated debuginfo and source lookup code, using the
> callback idea that Simon suggested. Downloads are now cancelled upon
> SIGINT and the callback can be modified with code that prints progress
> messages (the a and b parameters in the callback represent the
> numerator and denominator of the download's completion fraction).

Cool, thanks!

Maybe a user_data void pointer would be useful, to be able to get some
context in the callback?

I am noticing that debuginfod calls don't have any "context" or "handle"
parameters, which could suggest that the state of the library (if there
is any) is kept as global variables.  Is it safe to use the library to
do lookups in parallel, from different threads?  It is possible that
GDB or another program will want to do that some day.

Simon
  
Simon Marchi Nov. 5, 2019, 5:35 a.m. UTC | #2
On 2019-11-04 8:59 p.m., Aaron Merey wrote:
> Attached is the updated debuginfo and source lookup code, using the
> callback idea that Simon suggested. Downloads are now cancelled upon
> SIGINT and the callback can be modified with code that prints progress
> messages (the a and b parameters in the callback represent the
> numerator and denominator of the download's completion fraction).

I took a peek at the debuginfod code and have one quick suggestion.  It would
help to understand what is what if the debuginfod was subdivided in "client" and
"server" directories, and perhaps a "common" directory, if there's any code
to share between them.

Simon
  
Frank Ch. Eigler Nov. 5, 2019, 10:25 a.m. UTC | #3
Hi -

> Maybe a user_data void pointer would be useful, to be able to get some
> context in the callback?

In your case, with a single-threaded app with C++ lambda captured
variables, you wouldn't need the library to do that.  In a
multithreaded app's case, __thread variables and the promise to call
those functions back from the calling thread should again let one get
context if required.

> I am noticing that debuginfod calls don't have any "context" or "handle"
> parameters, which could suggest that the state of the library (if there
> is any) is kept as global variables.  [...]

It turns out it's the opposite: except for this callback function
pointer, it's practically all local variables therefore multithread-safe.

- FChE
  
Simon Marchi Nov. 5, 2019, 3:19 p.m. UTC | #4
On 2019-11-05 5:25 a.m., Frank Ch. Eigler wrote:
> Hi -
> 
>> Maybe a user_data void pointer would be useful, to be able to get some
>> context in the callback?
> 
> In your case, with a single-threaded app with C++ lambda captured
> variables, you wouldn't need the library to do that.  In a
> multithreaded app's case, __thread variables and the promise to call
> those functions back from the calling thread should again let one get
> context if required.

Hmm does it actually work to capture some variables in the C++ lambda?

Given that the callback is declared as a simple function pointer:

  64 typedef int (*debuginfod_progressfn_t)(long a, long b);
  65 debuginfod_progressfn_t debuginfod_set_progressfn(debuginfod_progressfn_t fn);

I'm not sure that we can pass a lambda that captures stuff.  Just like this program:

    static void func(void (*callback)(void)) { callback(); }
    int main() { int a; func([a] { }); return 0; }

Generates this error:

    test.cpp: In function ‘int main()’:
    test.cpp:2:33: error: cannot convert ‘main()::<lambda()>’ to ‘void (*)()’
        2 | int main() { int a; func([a] { }); return 0; }
          |                                 ^
          |                                 |
          |                                 main()::<lambda()>
    test.cpp:1:25: note:   initializing argument 1 of ‘void func(void (*)())’
        1 | static void func(void (*callback)(void)) { callback(); }
          |                  ~~~~~~~^~~~~~~~~~~~~~~

>> I am noticing that debuginfod calls don't have any "context" or "handle"
>> parameters, which could suggest that the state of the library (if there
>> is any) is kept as global variables.  [...]

I think that providing a void pointer for context is pretty standard thing
for C libraries that accept callbacks.  I would use __thread and the promise
to call the function back in the same thread to "retro-fit" some thread-safety
in an existing API that doesn't provide a context pointer, and that can't be
changed.  But if the API is new, I think the context pointer solution is more
obvious for the user.

> It turns out it's the opposite: except for this callback function
> pointer, it's practically all local variables therefore multithread-safe.

Ok, well I think it would be a good reason to adopt an API of the style:

    debuginfod_client *client = debuginfod_create_client ();
    debuginfod_set_progressfn (client, my_progress_cb);

and where you pass `client` to all further calls.  That client object would
keep the callback for those calls, but also any other state that you might
want to keep across in the future.  Again, this is easy to add in the
beginning, but hard to retro-fit later on.

Simon
  
Frank Ch. Eigler Nov. 5, 2019, 3:50 p.m. UTC | #5
Hi, Simon -

> [...]
> I think that providing a void pointer for context is pretty standard thing
> for C libraries that accept callbacks.  I would use __thread and the promise
> to call the function back in the same thread to "retro-fit" some thread-safety
> in an existing API that doesn't provide a context pointer, 

Remember that this is a batch file downloading tool.  We expect it to
be interrupted if a user gets impatient, in which case such a signal
would affect all downloads.  In the absence of a plausible user story
for the need for fine control over multiple different debuginfo
transfers in progress, I believe something simple is enough for now.


> Ok, well I think it would be a good reason to adopt an API of the
> style: [...]  Again, this is easy to add in the beginning, but hard
> to retro-fit later on.

Retro-fitting it later on is not that bad, when/if actual use cases
appear.  The default 'client context' can be the current global one.
A hypothetical future api can pass void* context parameters; we can
have two callback function types.  It being simple now does not
preclude necessary complexity later.  I appreciate the ideas, really,
but I am about as wary of overengineering as underengineering.

- FChE
  

Patch

 gdb/elfread.c | 36 +++++++++++++++++++++++++++++++++++-
 gdb/source.c  | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 82 insertions(+), 1 deletion(-)

diff --git a/gdb/elfread.c b/gdb/elfread.c
index 226e3f09d3..e24bca610f 100644
--- a/gdb/elfread.c
+++ b/gdb/elfread.c
@@ -48,7 +48,11 @@ 
 #include "auxv.h"
 #include "mdebugread.h"
 #include "ctfread.h"
+#include "gdbsupport/scoped_fd.h"
 #include "gdbsupport/gdb_string_view.h"
+#if HAVE_LIBDEBUGINFOD
+#include <elfutils/debuginfod.h>
+#endif
 
 /* Forward declarations.  */
 extern const struct sym_fns elf_sym_fns_gdb_index;
@@ -1311,8 +1315,38 @@  elf_symfile_read (struct objfile *objfile, symfile_add_flags symfile_flags)
 	  symbol_file_add_separate (debug_bfd.get (), debugfile.c_str (),
 				    symfile_flags, objfile);
 	}
-	else
+      else
+        {
 	  has_dwarf2 = false;
+
+#if HAVE_LIBDEBUGINFOD
+          const struct bfd_build_id *build_id;
+          char *symfile_path;
+
+          build_id = build_id_bfd_get (objfile->obfd);
+
+          /* Allow debuginfod to abort the download if SIGINT is raised.  */
+          debuginfod_set_progressfn (
+            [] (long a, long b) { return 1 ? check_quit_flag () : 0; }
+          );
+
+          /* Query debuginfod servers for symfile.  */
+          scoped_fd fd (debuginfod_find_debuginfo (build_id->data,
+                                                   build_id->size,
+                                                   &symfile_path));
+
+          if (fd.get () >= 0)
+	     {
+              /* file successfully retrieved from server.  */
+              gdb_bfd_ref_ptr debug_bfd (symfile_bfd_open (symfile_path));
+
+              symbol_file_add_separate (debug_bfd.get (), symfile_path,
+                                        symfile_flags, objfile);
+              xfree (symfile_path);
+              has_dwarf2 = true;
+            }
+#endif /* LIBDEBUGINFOD */
+        }
     }
 
   /* Read the CTF section only if there is no DWARF info.  */
diff --git a/gdb/source.c b/gdb/source.c
index 9f53d654f3..4e8a6558b7 100644
--- a/gdb/source.c
+++ b/gdb/source.c
@@ -30,6 +30,10 @@ 
 
 #include <sys/types.h>
 #include <fcntl.h>
+#include "build-id.h"
+#ifdef HAVE_LIBDEBUGINFOD
+#include <elfutils/debuginfod.h>
+#endif
 #include "gdbcore.h"
 #include "gdb_regex.h"
 #include "symfile.h"
@@ -1127,6 +1131,49 @@  open_source_file (struct symtab *s)
   s->fullname = NULL;
   scoped_fd fd = find_and_open_source (s->filename, SYMTAB_DIRNAME (s),
 				       &fullname);
+
+#if HAVE_LIBDEBUGINFOD
+  if (fd.get () < 0 && SYMTAB_COMPUNIT (s) != NULL)
+    {
+      const struct bfd_build_id *build_id;
+      const objfile *ofp = COMPUNIT_OBJFILE (SYMTAB_COMPUNIT (s));
+
+      std::string srcpath;
+      if (IS_DIR_SEPARATOR (s->filename[0]))
+        srcpath = s->filename;
+      else
+        {
+          srcpath = SYMTAB_DIRNAME (s);
+          srcpath += SLASH_STRING;
+          srcpath += s->filename;
+        }
+
+      build_id = build_id_bfd_get (ofp->obfd);
+
+      /* Query debuginfod for the source file.  */
+      if (build_id != NULL)
+        {
+          char *name_in_cache;
+
+          /* Allow debuginfod to abort the download if SIGINT is raised.  */
+          debuginfod_set_progressfn (
+            [] (long a, long b) { return 1 ? check_quit_flag () : 0; }
+          );
+
+          scoped_fd src_fd (debuginfod_find_source (build_id->data,
+                                                    build_id->size,
+                                                    srcpath.c_str (),
+                                                    &name_in_cache));
+
+          if (src_fd.get () >= 0)
+            fullname.reset (name_in_cache);
+
+          s->fullname = fullname.release ();
+          return src_fd;
+        }
+    }
+#endif /* HAVE_LIBDEBUGINFOD */
+
   s->fullname = fullname.release ();
   return fd;
 }