[2/4] python support for fetching separate debug files: have_debug_info

Message ID yjt2d28ho9cw.fsf@ruffy.mtv.corp.google.com
State New, archived

Commit Message

Doug Evans Nov. 20, 2014, 9:22 p.m. UTC
  This patch provides an API call to determine whether debug information
is present.  It's based on the same test that gdb internally uses
to decide whether to look for separate debug files.

2014-11-20  Doug Evans  <dje@google.com>

	* NEWS: Mention gdb.Objfile.have_debug_info.
	* python/py-objfile.c (objfpy_get_have_debug_info): New function.
	(objfile_getset): Add "have_debug_info".

	doc/
	* python.texi (Objfiles In Python): Document Objfile.have_debug_info.

	testsuite/
	* gdb.python/py-objfile.exp: Add tests for objfile.have_debug_info.
  

Comments

Eli Zaretskii Nov. 21, 2014, 7:41 a.m. UTC | #1
> From: Doug Evans <dje@google.com>
> Date: Thu, 20 Nov 2014 13:22:39 -0800
> 
> This patch provides an API call to determine whether debug information
> is present.  It's based on the same test that gdb internally uses
> to decide whether to look for separate debug files.

OK for the documentation parts.

Btw, I wonder why this is useful, given this caveat:

> +Note that a program compiled without @samp{-g} may still have some debug
> +information, e.g., from the @code{C} runtime.  Thus a value of @code{True}
> +for this attribute does not mean that debug information is present for
> +every source file in the program.  It only means that debug information
> +is present for at least one source file.

If this attribute cannot be relied upon, why is it a good idea to
expose it to Python?
  
Doug Evans Nov. 21, 2014, 5:33 p.m. UTC | #2
On Thu, Nov 20, 2014 at 11:41 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Doug Evans <dje@google.com>
>> Date: Thu, 20 Nov 2014 13:22:39 -0800
>>
>> This patch provides an API call to determine whether debug information
>> is present.  It's based on the same test that gdb internally uses
>> to decide whether to look for separate debug files.
>
> OK for the documentation parts.
>
> Btw, I wonder why this is useful, given this caveat:
>
>> +Note that a program compiled without @samp{-g} may still have some debug
>> +information, e.g., from the @code{C} runtime.  Thus a value of @code{True}
>> +for this attribute does not mean that debug information is present for
>> +every source file in the program.  It only means that debug information
>> +is present for at least one source file.
>
> If this attribute cannot be relied upon, why is it a good idea to
> expose it to Python?

It's a good question.
I thought about the name for this attribute for a non-insignificant
amount of time.
The problem that needs to be solved is for Python code to be able to tell
whether to spend time fetching separate debug files, as the latter can take
a significant amount of time.  Also, a program may use a large number of
shared libraries and the user may wish (or not wish) debug info to be
fetched for each one.  So we want, IMO, a simple and cheap initial
test for whether we need to fetch debug files.

For the use-case in question,  another way to look at the attribute is
"Has debug info been stripped or not?".
Thus one thought I had was naming it "is_stripped_debug".
The goal being to better specify the intent.  But it still has the
same problem: even if a binary didn't have debug info stripped that
doesn't mean that the desired pieces of it were compiled with -g.
Thus I went with have_debug_info.
[has_debug_info?]
I got to the point of not wanting to over-think this.

It turns out that in practice, for the use case in which this
attribute will be put, the answer it gives works well enough.
Plus providing the ability to be consistent with what gdb itself does
has value: being inconsistent can lead to confusion.

I can imagine some use-cases wanting more intelligent tests for
determining whether to put in the effort to fetch a separate debug
file  (e.g., is debug info for a particular class present).
Going this route is up to the implementer of the Python code.
What we're providing here is just building blocks.
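
As a rough illustration of the use case described above, a script could use the attribute as a cheap first filter before doing any expensive fetching. This is only a sketch: `objfiles_missing_debug_info` is a hypothetical helper, not part of the patch, and the fetch step itself is left out. Inside gdb it would presumably be called with `gdb.objfiles()`; here it accepts any iterable of objects exposing `is_valid()`, `filename` and `have_debug_info`, so it can also be tried outside gdb.

```python
def objfiles_missing_debug_info(objfiles):
    """Return the filenames of valid objfiles that report no debug info.

    `objfiles` is any iterable of objects shaped like gdb.Objfile, i.e.
    providing is_valid(), filename and have_debug_info (the attribute
    proposed in this patch).  The helper name is hypothetical.
    """
    missing = []
    for objfile in objfiles:
        # Skip objfiles that gdb has already discarded.
        if not objfile.is_valid():
            continue
        # Cheap initial test: only these are candidates for a fetch.
        if not objfile.have_debug_info:
            missing.append(objfile.filename)
    return missing
```

A caller would then decide, per filename, whether fetching a separate debug file is worth the time, for example by skipping shared libraries the user is not interested in.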
  
Eli Zaretskii Nov. 21, 2014, 7:51 p.m. UTC | #3
> Date: Fri, 21 Nov 2014 09:33:59 -0800
> From: Doug Evans <dje@google.com>
> Cc: gdb-patches <gdb-patches@sourceware.org>, Pedro Alves <palves@redhat.com>, 
> 	Sergio Durigan Junior <sergiodj@redhat.com>
> 
> > If this attribute cannot be relied upon, why is it a good idea to
> > expose it to Python?
> 
> It's a good question.
> I thought about the name for this attribute for a non-insignificant
> amount of time.

The name is not my problem.

> The problem that needs to be solved is for Python code to be able to tell
> whether to spend time fetching separate debug files, as the latter can take
> a significant amount of time.  Also, a program may use a large number of
> shared libraries and the user may wish (or not wish) debug info to be
> fetched for each one.  So we want, IMO, a simple and cheap initial
> test for whether we need to fetch debug files.

Why not make that test part of the method that fetches the debug info?

> For the use-case in question,  another way to look at the attribute is
> "Has debug info been stripped or not?".

But there's no reliable way to determine that, either, is there?
  
Doug Evans Nov. 21, 2014, 8:22 p.m. UTC | #4
On Fri, Nov 21, 2014 at 11:51 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>> Date: Fri, 21 Nov 2014 09:33:59 -0800
>> From: Doug Evans <dje@google.com>
>> Cc: gdb-patches <gdb-patches@sourceware.org>, Pedro Alves <palves@redhat.com>,
>>       Sergio Durigan Junior <sergiodj@redhat.com>
>>
>> > If this attribute cannot be relied upon, why is it a good idea to
>> > expose it to Python?
>>
>> It's a good question.
>> I thought about the name for this attribute for a non-insignificant
>> amount of time.
>
> The name is not my problem.
>
>> The problem that needs to be solved is for Python code to be able to tell
>> whether to spend time fetching separate debug files, as the latter can take
>> a significant amount of time.  Also, a program may use a large number of
>> shared libraries and the user may wish (or not wish) debug info to be
>> fetched for each one.  So we want, IMO, a simple and cheap initial
>> test for whether we need to fetch debug files.
>
> Why not make that test part of the method that fetches the debug info?

This is Python code.  What did you mean by "method" ?

Python doesn't come with an ELF reader.
Another way we *could* go, which I kinda like, is to provide a general
purpose ELF API to Python, or try to do the bfd kind of thing and
abstract away ELF vs COFF, etc, and export that through gdb.  Then one
could determine if debug info is present that way. If I were to do the
former (the ELF API) I'd like to make it separate from gdb: why write
something only some users can use.  The latter (abstract away the file
format) has its own problems of course, but one might simplify it to
something along the lines of what libiberty/simple-object* provides.
Either of these solutions allows one to watch for a special section
pointing at separate debug info (e.g., .gnu_debuglink).

[Down the road exporting a DWARF reader to Python would be useful too,
but that's later.  If it involved providing our own libelf/libdwarf so
much the better.]

>> For the use-case in question,  another way to look at the attribute is
>> "Has debug info been stripped or not?".
>
> But there's no reliable way to determine that, either, is there?

Beyond detecting the absence of the requisite debug sections (in
DWARF: .debug_info, et al.)?
Or were you thinking of something else?
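
To make the section-based alternative concrete, here is a minimal sketch of the decision it would allow, assuming some file reader has already produced the list of section names (the Python standard library does not provide one, which is the gap discussed above). The function name and the returned labels are invented for illustration; only the section names `.debug_info` and `.gnu_debuglink` come from this thread.

```python
def classify_debug_sections(section_names):
    """Classify a file from its section names alone.

    Returns one of three made-up labels:
      "has-dwarf"           - a .debug_info section is present
      "separate-debug-link" - no .debug_info, but .gnu_debuglink points
                              at a separate debug file
      "no-debug-info"       - neither section is present
    """
    names = set(section_names)
    if ".debug_info" in names:
        return "has-dwarf"
    if ".gnu_debuglink" in names:
        return "separate-debug-link"
    return "no-debug-info"
```

Like have_debug_info, this says nothing about which source files the debug info covers; it only reports whether the sections exist.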
  
Eli Zaretskii Nov. 22, 2014, 8:04 a.m. UTC | #5
> Date: Fri, 21 Nov 2014 12:22:23 -0800
> From: Doug Evans <dje@google.com>
> Cc: gdb-patches <gdb-patches@sourceware.org>, Pedro Alves <palves@redhat.com>, 
> 	Sergio Durigan Junior <sergiodj@redhat.com>
> 
> >> The problem that needs to be solved is for Python code to be able to tell
> >> whether to spend time fetching separate debug files, as the latter can take
> >> a significant amount of time.  Also, a program may use a large number of
> >> shared libraries and the user may wish (or not wish) debug info to be
> >> fetched for each one.  So we want, IMO, a simple and cheap initial
> >> test for whether we need to fetch debug files.
> >
> > Why not make that test part of the method that fetches the debug info?
> 
> This is Python code.  What did you mean by "method" ?

The method, which we expose to Python programs, which fetches debug
info.

> >> For the use-case in question,  another way to look at the attribute is
> >> "Has debug info been stripped or not?".
> >
> > But there's no reliable way to determine that, either, is there?
> 
> Beyond detecting the absence of the requisite debug sections (in
> DWARF: .debug_info, et al.)?
> Or were you thinking of something else?

I still don't understand what good it will do to have this
attribute.  It seems you would like it to allow an optimization,
whereby some clever Python extension to GDB might examine this
attribute and decide not to try to fetch the debug info.  But then why
not do that automatically all the time?  Why burden the Python
programmer with this?
  
Doug Evans Nov. 24, 2014, 9:06 p.m. UTC | #6
On Fri, Nov 21, 2014 at 12:22 PM, Doug Evans <dje@google.com> wrote:
> Another way we *could* go, which I kinda like, is to provide a general
> purpose ELF API to Python, or try to do the bfd kind of thing and
> abstract away ELF vs COFF, etc, and export that through gdb.  Then one
> could determine if debug info is present that way. If I were to do the
> former (the ELF API) I'd like to make it separate from gdb: why write
> something only some users can use.  The latter (abstract away the file
> format) has its own problems of course, but one might simplify it to
> something along the lines of what libiberty/simple-object* provides.
> Either of these solutions allows one to watch for a special section
> pointing at separate debug info (e.g., .gnu_debuglink).
>
> [Down the road exporting a DWARF reader to Python would be useful too,
> but that's later.  If it involved providing our own libelf/libdwarf so
> much the better.]

Filing this for reference's sake.
Another way to go, which I partially implemented,
was to export bfd's iovec to Python.
In the end it was more complex than I needed.
  

Patch

diff --git a/gdb/NEWS b/gdb/NEWS
index eecc7da..3a36fa8 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -13,6 +13,8 @@ 
      which is the gdb.Progspace object of the containing program space.
   ** gdb.Objfile objects have a new attribute "build_id",
      which is the build ID generated when the file was built.
+  ** gdb.Objfile objects have a new attribute "have_debug_info", which is
+     a boolean indicating if debug information for the objfile is present.
   ** A new event "gdb.clear_objfiles" has been added, triggered when
      selecting a new file to debug.
   ** You can now add attributes to gdb.Objfile and gdb.Progspace objects.
diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index a47a259..1c9dbb87 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -3457,6 +3457,15 @@  command-line option in @ref{Options, , Command Line Options, ld.info,
 The GNU Linker}.
 @end defvar
 
+@defvar Objfile.have_debug_info
+A boolean indicating if debug information for the objfile is present.
+Note that a program compiled without @samp{-g} may still have some debug
+information, e.g., from the @code{C} runtime.  Thus a value of @code{True}
+for this attribute does not mean that debug information is present for
+every source file in the program.  It only means that debug information
+is present for at least one source file.
+@end defvar
+
 @defvar Objfile.progspace
 The containing program space of the objfile as a @code{gdb.Progspace}
 object.  @xref{Progspaces In Python}.
diff --git a/gdb/python/py-objfile.c b/gdb/python/py-objfile.c
index 05a7c21..5d0933f 100644
--- a/gdb/python/py-objfile.c
+++ b/gdb/python/py-objfile.c
@@ -112,6 +112,32 @@  objfpy_get_build_id (PyObject *self, void *closure)
   Py_RETURN_NONE;
 }
 
+/* An Objfile method which returns whether debug information is
+   present for the objfile.  */
+
+static PyObject *
+objfpy_get_have_debug_info (PyObject *self, void *closure)
+{
+  objfile_object *obj = (objfile_object *) self;
+  struct objfile *objfile = obj->objfile;
+
+  if (objfile != NULL)
+    {
+      /* This uses the same test that the file readers use, e.g.,
+	 elf_symfile_read, because its main purpose is to decide whether
+	 separate debug info files should be fetched.
+	 See elf_symfile_read.  */
+      if (!objfile_has_partial_symbols (objfile)
+	  && !objfile_has_full_symbols (objfile)
+	  && objfile->separate_debug_objfile == NULL
+	  && objfile->separate_debug_objfile_backlink == NULL)
+	Py_RETURN_FALSE;
+      Py_RETURN_TRUE;
+    }
+
+  Py_RETURN_FALSE;
+}
+
 /* An Objfile method which returns the objfile's progspace, or None.  */
 
 static PyObject *
@@ -412,6 +438,8 @@  static PyGetSetDef objfile_getset[] =
     "The objfile's filename, or None.", NULL },
   { "build_id", objfpy_get_build_id, NULL,
     "The objfile's build id, or None.", NULL },
+  { "have_debug_info", objfpy_get_have_debug_info, NULL,
+    "True if there is debug information for the objfile.", NULL },
   { "progspace", objfpy_get_progspace, NULL,
     "The objfile's progspace, or None.", NULL },
   { "pretty_printers", objfpy_get_printers, objfpy_set_printers,
diff --git a/gdb/testsuite/gdb.python/py-objfile.exp b/gdb/testsuite/gdb.python/py-objfile.exp
index 74384ed..f09c64c 100644
--- a/gdb/testsuite/gdb.python/py-objfile.exp
+++ b/gdb/testsuite/gdb.python/py-objfile.exp
@@ -49,6 +49,9 @@  if [string compare $binfile_build_id ""] {
     unsupported "build-id is not supported by the compiler"
 }
 
+gdb_test "python print (objfile.have_debug_info)" "True" \
+    "Get objfile have_debug_info"
+
 gdb_test "python print (objfile.progspace)" "<gdb\.Progspace object at .*>" \
   "Get objfile program space"
 gdb_test "python print (objfile.is_valid())" "True" \
@@ -61,3 +64,20 @@  gdb_py_test_silent_cmd "python objfile.random_attribute = 42" \
     "Set random attribute in objfile" 1
 gdb_test "python print (objfile.random_attribute)" "42" \
     "Verify set of random attribute in objfile"
+
+# Now build another copy of the testcase, this time without debug info.
+
+if { [prepare_for_testing ${testfile}.exp ${testfile}2 ${srcfile} {nodebug ldflags=-Wl,--strip-debug}] } {
+    return -1
+}
+
+if ![runto_main] {
+    fail "Can't run to main"
+    return 0
+}
+
+gdb_py_test_silent_cmd "python objfile = gdb.objfiles()\[0\]" \
+    "Get no-debug objfile file" 1
+
+gdb_test "python print (objfile.have_debug_info)" "False" \
+    "Get no-debug objfile have_debug_info"
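
For readers who want the getter's decision logic without reading the C, the following Python model restates the condition in objfpy_get_have_debug_info. It is a sketch, not gdb code: `ObjfileModel` and its field names are hypothetical stand-ins for the `struct objfile` state and for the results of objfile_has_partial_symbols and objfile_has_full_symbols.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjfileModel:
    """Hypothetical stand-in for the struct objfile state the getter reads."""
    has_partial_symbols: bool = False
    has_full_symbols: bool = False
    separate_debug_objfile: Optional["ObjfileModel"] = None
    separate_debug_objfile_backlink: Optional["ObjfileModel"] = None


def have_debug_info(objfile: Optional[ObjfileModel]) -> bool:
    """Mirror the patch: False only when all four indicators are absent."""
    # An invalid Objfile (objfile == NULL in the C code) reports False.
    if objfile is None:
        return False
    return (objfile.has_partial_symbols
            or objfile.has_full_symbols
            or objfile.separate_debug_objfile is not None
            or objfile.separate_debug_objfile_backlink is not None)
```

Note that, under this model, a stripped objfile that already has a separate debug objfile attached reports True, and an invalid Objfile reports False rather than raising an exception.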