[v2,1/4] DWARF 5 support: Handle dwo_id

Message ID 20190909185803.182231-1-tamur@google.com
State New, archived
Headers

Commit Message

Terekhov, Mikhail via Gdb-patches Sept. 9, 2019, 6:58 p.m. UTC
  > Since you're touching the split-DWARF code, it might be a good idea to run the whole
> testsuite with -gsplit-dwarf, with DWARF 4 (to verify there are no regressions) and
> DWARF 5 (to check how things improve).                                         
Done. Both with and without -gsplit-dwarf the set of tests that fail is          
identical. This is expected as unless I introduced a bug, there should be no     
behavioral change at all for DWARF 4. I also ran the testsuite with -gdwarf-5    
and the number of failures went down from 32687 to 1163, but until gdb can       
handle 'hello world' (hopefully at the end my patch series) I think that is      
not a very meaningful metric.                                                    
                                                                                 
> And since DWARF 5 is relatively new stuff, results can vary greatly if using different versions of the
> same compiler to run the tests.  So if you could mention in the commit message which gcc version the
> tests were ran against, I think it would be useful.                            
Updated the commit message to include the gcc version (8.3.0).                   
                                                                                 
> I would say that the same change should be done in dwo_unit then.  But I totally
> agree that we should not mix up cleanups with adding new features.  So here's  
> what I suggest: revert the field in comp_unit_head to be just `signature` for  
> now, it will avoid having too many unrelated changes in this patch (updated all
> the users of that field).  We'll make a following cleanup patch to either rename
> the field or introduce a union.                                                
Done.                                                                            
                                                                                 
> Space before parenthesis.  Applies to the rest of the patch, for both          
> function signatures and calls.                                                 
Done.                                                                            
                                                                                 
> Let's use %s and dwarf_unit_type_name to print DW_UT_type here, as you did below.
Done.                                                                            
                                                                                 
> This comment needs to be updated (the last part).                              
Done.                                                                            
                                                                                 
> Ref: https://sourceware.org/gdb/wiki/Internals%20GDB-C-Coding-Standards#Comparison_With_NULL_And_Zero
Done.                                                                            
                                                                                 
> Is there a reason for this function to now take `reader` rather than `this_cu`?  At first
> sight, I don't see why this change is needed or why it would help.             
Done.                                                                            
                                                                                 
> Format the switch-case like:                                                   
Done.                                                                            
                                                                                 
Sorry for the style mistakes; I am accustomed to a different style and also      
have become too dependent on clang-tidy.
---


* DW_UT_skeleton and DW_UT_split_compile compilation units have dwo ids
to match the compilation unit in the skeleton and .dwo files. The dwo_id is
in the header.

Tested with CC=/usr/bin/gcc (version 8.3.0) against master branch (also with
-gsplit-dwarf and -gdwarf-4 flags) and there was no increase in the set of
tests that fails.

This is part of an effort to support DWARF 5 in gdb.


gdb/ChangeLog:                                                                   
                                                                                 
        * gdb/dwarf2read.c (comp_unit_head): Update comment.                     
        (dwarf2_dwo_name): New function declaration.                             
        (dwarf_unit_type_name): New function declaration.                        
        (read_comp_unit_head): Add support for new compilation units,            
        DW_UT_partial, DW_UT_skeleton, DW_UT_split_compile, DW_UT_split_type.    
        Particularly, DW_UT_skeleton and DW_UT_split_compile have dwo_id         
        (currently named as "signature") in their header. Also clarify error     
        messages.                                                                
        (lookup_dwo_id): New function. Returns the dwo id of the given           
        compile unit.                                                            
        (lookup_dwo_unit): Use the new lookup_dwo_id function.                   
        (init_cutu_and_read_dies): Use the new dwarf2_dwo_name and lookup_dwo_id 
        functions.                                                               
        (create_dwo_cu_reader): Use the added lookup_dwo_id function.            
        (dwarf2_dwo_name): Get the dwo name if present.                          
        (dwarf_unit_type_name): Convert DW_UT_* types to string for diagnostic   
        purposes.
---
 gdb/dwarf2read.c | 119 +++++++++++++++++++++++++++++++++++++----------
 1 file changed, 94 insertions(+), 25 deletions(-)
  

Comments

Simon Marchi Sept. 9, 2019, 9:21 p.m. UTC | #1
On 2019-09-09 2:58 p.m., Ali Tamur via gdb-patches wrote:
>> Since you're touching the split-DWARF code, it might be a good idea to run the whole
>> testsuite with -gsplit-dwarf, with DWARF 4 (to verify there are no regressions) and
>> DWARF 5 (to check how things improve).                                         
> Done. Both with and without -gsplit-dwarf the set of tests that fail is          
> identical. This is expected as unless I introduced a bug, there should be no     
> behavioral change at all for DWARF 4. I also ran the testsuite with -gdwarf-5    
> and the number of failures went down from 32687 to 1163, but until gdb can       
> handle 'hello world' (hopefully at the end my patch series) I think that is      
> not a very meaningful metric.                                                    

You're right that it's not very meaningful, but still encouraging :).

An interesting one will be to compare the results for DWARF 4 vs for DWARF 5.

>> And since DWARF 5 is relatively new stuff, results can vary greatly if using different versions of the
>> same compiler to run the tests.  So if you could mention in the commit message which gcc version the
>> tests were ran against, I think it would be useful.                            
> Updated the commit message to include the gcc version (8.3.0).                   

Thanks.

> Sorry for the style mistakes; I am accustomed to a different style and also      
> have become too dependent on clang-tidy.

No problem, and I completely understand.  I had a taste of clang-format (I
presume you meant clang-format) for C++ and black for Python, and it's really
nice not to have to think about formatting.  It would be wonderful to have
something equivalent here :).

I just found one last little thing:

> gdb/ChangeLog:                                                                   
>                                                                                  
>         * gdb/dwarf2read.c (comp_unit_head): Update comment.                     

The file name here should just be "dwarf2read.c", as it should be relative to
the ChangeLog file the entry is in.

The patch LGTM with that fixed, you can go ahead and push.  Thanks for following
up quickly and efficiently on review comments.

I'd like to take a look at the other patches of the series, but that won't be
before at least next week, as I'll be quite busy with the GNU Cauldron Conference
until then.  But if somebody else wants to review them, please go ahead.

Simon
  
Terekhov, Mikhail via Gdb-patches Sept. 10, 2019, 1:36 a.m. UTC | #2
Thank you for the review, I am submitting now.

On Mon, Sep 9, 2019 at 2:21 PM Simon Marchi <simark@simark.ca> wrote:
>
> On 2019-09-09 2:58 p.m., Ali Tamur via gdb-patches wrote:
> >> Since you're touching the split-DWARF code, it might be a good idea to run the whole
> >> testsuite with -gsplit-dwarf, with DWARF 4 (to verify there are no regressions) and
> >> DWARF 5 (to check how things improve).
> > Done. Both with and without -gsplit-dwarf the set of tests that fail is
> > identical. This is expected as unless I introduced a bug, there should be no
> > behavioral change at all for DWARF 4. I also ran the testsuite with -gdwarf-5
> > and the number of failures went down from 32687 to 1163, but until gdb can
> > handle 'hello world' (hopefully at the end my patch series) I think that is
> > not a very meaningful metric.
>
> You're right that it's not very meaningful, but still encouraging :).
>
> An interesting one will be to compare the results for DWARF 4 vs for DWARF 5.
>
> >> And since DWARF 5 is relatively new stuff, results can vary greatly if using different versions of the
> >> same compiler to run the tests.  So if you could mention in the commit message which gcc version the
> >> tests were ran against, I think it would be useful.
> > Updated the commit message to include the gcc version (8.3.0).
>
> Thanks.
>
> > Sorry for the style mistakes; I am accustomed to a different style and also
> > have become too dependent on clang-tidy.
>
> No problem, and I completely understand.  I had a taste of clang-format (I
> presume you meant clang-format) for C++ and black for Python, and it's really
> nice not to have to think about formatting.  It would be wonderful to have
> something equivalent here :).
>
> I just found one last little thing:
>
> > gdb/ChangeLog:
> >
> >         * gdb/dwarf2read.c (comp_unit_head): Update comment.
>
> The file name here should just be "dwarf2read.c", as it should be relative to
> the ChangeLog file the entry is in.
>
> The patch LGTM with that fixed, you can go ahead and push.  Thanks for following
> up quickly and efficiently on review comments.
>
> I'd like to take a look at the other patches of the series, but that won't be
> before at least next week, as I'll be quite busy with the GNU Cauldron Conference
> until then.  But if somebody else wants to review them, please go ahead.
>
> Simon
  

Patch

diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
index fb888da7b8..fc30ddccd0 100644
--- a/gdb/dwarf2read.c
+++ b/gdb/dwarf2read.c
@@ -373,8 +373,11 @@  struct comp_unit_head
      This will be the first byte following the compilation unit header.  */
   cu_offset first_die_cu_offset;
 
-  /* 64-bit signature of this type unit - it is valid only for
-     UNIT_TYPE DW_UT_type.  */
+
+  /* 64-bit signature of this unit. For type units, it denotes the signature of
+     the type (DW_UT_type in DWARF 4, additionally DW_UT_split_type in DWARF 5).
+     Also used in DWARF 5, to denote the dwo id when the unit type is
+     DW_UT_skeleton or DW_UT_split_compile.  */
   ULONGEST signature;
 
   /* For types, offset in the type's DIE of the type defined by this TU.  */
@@ -1579,6 +1582,8 @@  static struct attribute *dwarf2_attr_no_follow (struct die_info *,
 static const char *dwarf2_string_attr (struct die_info *die, unsigned int name,
                                        struct dwarf2_cu *cu);
 
+static const char *dwarf2_dwo_name (struct die_info *die, struct dwarf2_cu *cu);
+
 static int dwarf2_flag_true_p (struct die_info *die, unsigned name,
                                struct dwarf2_cu *cu);
 
@@ -1761,6 +1766,8 @@  static const char *dwarf_tag_name (unsigned int);
 
 static const char *dwarf_attr_name (unsigned int);
 
+static const char *dwarf_unit_type_name (int unit_type);
+
 static const char *dwarf_form_name (unsigned int);
 
 static const char *dwarf_bool_name (unsigned int);
@@ -6390,18 +6397,28 @@  read_comp_unit_head (struct comp_unit_head *cu_header,
       switch (cu_header->unit_type)
 	{
 	case DW_UT_compile:
+	case DW_UT_partial:
+	case DW_UT_skeleton:
+	case DW_UT_split_compile:
 	  if (section_kind != rcuh_kind::COMPILE)
 	    error (_("Dwarf Error: wrong unit_type in compilation unit header "
-		   "(is DW_UT_compile, should be DW_UT_type) [in module %s]"),
-		   filename);
+		   "(is %s, should be %s) [in module %s]"),
+		   dwarf_unit_type_name (cu_header->unit_type),
+		   dwarf_unit_type_name (DW_UT_type), filename);
 	  break;
 	case DW_UT_type:
+	case DW_UT_split_type:
 	  section_kind = rcuh_kind::TYPE;
 	  break;
 	default:
 	  error (_("Dwarf Error: wrong unit_type in compilation unit header "
-		 "(is %d, should be %d or %d) [in module %s]"),
-		 cu_header->unit_type, DW_UT_compile, DW_UT_type, filename);
+		 "(is %#04x, should be one of: %s, %s, %s, %s or %s) "
+		 "[in module %s]"), cu_header->unit_type,
+		 dwarf_unit_type_name (DW_UT_compile),
+		 dwarf_unit_type_name (DW_UT_skeleton),
+		 dwarf_unit_type_name (DW_UT_split_compile),
+		 dwarf_unit_type_name (DW_UT_type),
+		 dwarf_unit_type_name (DW_UT_split_type), filename);
 	}
 
       cu_header->addr_size = read_1_byte (abfd, info_ptr);
@@ -6422,13 +6439,19 @@  read_comp_unit_head (struct comp_unit_head *cu_header,
 		    _("read_comp_unit_head: dwarf from non elf file"));
   cu_header->signed_addr_p = signed_addr;
 
-  if (section_kind == rcuh_kind::TYPE)
-    {
-      LONGEST type_offset;
+  bool header_has_signature = section_kind == rcuh_kind::TYPE
+    || cu_header->unit_type == DW_UT_skeleton
+    || cu_header->unit_type == DW_UT_split_compile;
 
+  if (header_has_signature)
+    {
       cu_header->signature = read_8_bytes (abfd, info_ptr);
       info_ptr += 8;
+    }
 
+  if (section_kind == rcuh_kind::TYPE)
+    {
+      LONGEST type_offset;
       type_offset = read_offset (abfd, info_ptr, cu_header, &bytes_read);
       info_ptr += bytes_read;
       cu_header->type_cu_offset_in_tu = (cu_offset) type_offset;
@@ -7297,6 +7320,21 @@  read_cutu_die_from_dwo (struct dwarf2_per_cu_data *this_cu,
   return 1;
 }
 
+/* Return the signature of the compile unit, if found. In DWARF 4 and before,
+   the signature is in the DW_AT_GNU_dwo_id attribute. In DWARF 5 and later, the
+   signature is part of the header.  */
+static gdb::optional<ULONGEST>
+lookup_dwo_id (struct dwarf2_cu *cu, struct die_info* comp_unit_die)
+{
+  if (cu->header.version >= 5)
+    return cu->header.signature;
+  struct attribute *attr;
+  attr = dwarf2_attr (comp_unit_die, DW_AT_GNU_dwo_id, cu);
+  if (attr == nullptr)
+    return gdb::optional<ULONGEST> ();
+  return DW_UNSND (attr);
+}
+
 /* Subroutine of init_cutu_and_read_dies to simplify it.
    Look up the DWO unit specified by COMP_UNIT_DIE of THIS_CU.
    Returns NULL if the specified DWO unit cannot be found.  */
@@ -7306,14 +7344,13 @@  lookup_dwo_unit (struct dwarf2_per_cu_data *this_cu,
 		 struct die_info *comp_unit_die)
 {
   struct dwarf2_cu *cu = this_cu->cu;
-  ULONGEST signature;
   struct dwo_unit *dwo_unit;
   const char *comp_dir, *dwo_name;
 
   gdb_assert (cu != NULL);
 
   /* Yeah, we look dwo_name up again, but it simplifies the code.  */
-  dwo_name = dwarf2_string_attr (comp_unit_die, DW_AT_GNU_dwo_name, cu);
+  dwo_name = dwarf2_dwo_name (comp_unit_die, cu);
   comp_dir = dwarf2_string_attr (comp_unit_die, DW_AT_comp_dir, cu);
 
   if (this_cu->is_debug_types)
@@ -7323,21 +7360,17 @@  lookup_dwo_unit (struct dwarf2_per_cu_data *this_cu,
       /* Since this_cu is the first member of struct signatured_type,
 	 we can go from a pointer to one to a pointer to the other.  */
       sig_type = (struct signatured_type *) this_cu;
-      signature = sig_type->signature;
       dwo_unit = lookup_dwo_type_unit (sig_type, dwo_name, comp_dir);
     }
   else
     {
-      struct attribute *attr;
-
-      attr = dwarf2_attr (comp_unit_die, DW_AT_GNU_dwo_id, cu);
-      if (! attr)
+      gdb::optional<ULONGEST> signature = lookup_dwo_id (cu, comp_unit_die);
+      if (!signature.has_value ())
 	error (_("Dwarf Error: missing dwo_id for dwo_name %s"
 		 " [in module %s]"),
 	       dwo_name, objfile_name (this_cu->dwarf2_per_objfile->objfile));
-      signature = DW_UNSND (attr);
       dwo_unit = lookup_dwo_comp_unit (this_cu, dwo_name, comp_dir,
-				       signature);
+				       *signature);
     }
 
   return dwo_unit;
@@ -7449,7 +7482,6 @@  init_cutu_and_read_dies (struct dwarf2_per_cu_data *this_cu,
   struct die_reader_specs reader;
   struct die_info *comp_unit_die;
   int has_children;
-  struct attribute *attr;
   struct signatured_type *sig_type = NULL;
   struct dwarf2_section_info *abbrev_section;
   /* Non-zero if CU currently points to a DWO file and we need to
@@ -7586,9 +7618,9 @@  init_cutu_and_read_dies (struct dwarf2_per_cu_data *this_cu,
 
      Note that if USE_EXISTING_OK != 0, and THIS_CU->cu already contains a
      DWO CU, that this test will fail (the attribute will not be present).  */
-  attr = dwarf2_attr (comp_unit_die, DW_AT_GNU_dwo_name, cu);
+  const char *dwo_name = dwarf2_dwo_name (comp_unit_die, cu);
   abbrev_table_up dwo_abbrev_table;
-  if (attr)
+  if (dwo_name != nullptr)
     {
       struct dwo_unit *dwo_unit;
       struct die_info *dwo_comp_unit_die;
@@ -11839,10 +11871,9 @@  create_dwo_cu_reader (const struct die_reader_specs *reader,
   struct create_dwo_cu_data *data = (struct create_dwo_cu_data *) datap;
   struct dwo_file *dwo_file = data->dwo_file;
   struct dwo_unit *dwo_unit = &data->dwo_unit;
-  struct attribute *attr;
 
-  attr = dwarf2_attr (comp_unit_die, DW_AT_GNU_dwo_id, cu);
-  if (attr == NULL)
+  gdb::optional<ULONGEST> signature = lookup_dwo_id (cu, comp_unit_die);
+  if (!signature.has_value ())
     {
       complaint (_("Dwarf Error: debug entry at offset %s is missing"
 		   " its dwo_id [in module %s]"),
@@ -11851,7 +11882,7 @@  create_dwo_cu_reader (const struct die_reader_specs *reader,
     }
 
   dwo_unit->dwo_file = dwo_file;
-  dwo_unit->signature = DW_UNSND (attr);
+  dwo_unit->signature = *signature;
   dwo_unit->section = section;
   dwo_unit->sect_off = sect_off;
   dwo_unit->length = cu->per_cu->length;
@@ -20113,6 +20144,17 @@  dwarf2_string_attr (struct die_info *die, unsigned int name, struct dwarf2_cu *c
   return str;
 }
 
+/* Return the dwo name or NULL if not present. If present, it is in either
+   DW_AT_GNU_dwo_name or DW_AT_dwo_name atrribute.  */
+static const char *
+dwarf2_dwo_name (struct die_info *die, struct dwarf2_cu *cu)
+{
+  const char *dwo_name = dwarf2_string_attr (die, DW_AT_GNU_dwo_name, cu);
+  if (dwo_name == nullptr)
+    dwo_name = dwarf2_string_attr (die, DW_AT_dwo_name, cu);
+  return dwo_name;
+}
+
 /* Return non-zero iff the attribute NAME is defined for the given DIE,
    and holds a non-zero value.  This function should only be used for
    DW_FORM_flag or DW_FORM_flag_present attributes.  */
@@ -22812,6 +22854,33 @@  dwarf_attr_name (unsigned attr)
   return name;
 }
 
+/* Convert a unit type to corresponding DW_UT name.  */
+
+static const char *
+dwarf_unit_type_name (int unit_type) {
+  switch (unit_type)
+    {
+      case 0x01:
+	return "DW_UT_compile (0x01)";
+      case 0x02:
+	return "DW_UT_type (0x02)";
+      case 0x03:
+	return "DW_UT_partial (0x03)";
+      case 0x04:
+	return "DW_UT_skeleton (0x04)";
+      case 0x05:
+	return "DW_UT_split_compile (0x05)";
+      case 0x06:
+	return "DW_UT_split_type (0x06)";
+      case 0x80:
+	return "DW_UT_lo_user (0x80)";
+      case 0xff:
+	return "DW_UT_hi_user (0xff)";
+      default:
+	return nullptr;
+    }
+}
+
 /* Convert a DWARF value form code into its string name.  */
 
 static const char *