From patchwork Tue Oct  1 04:52:19 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Terekhov,
 Mikhail via Gdb-patches" <gdb-patches@sourceware.org>
X-Patchwork-Id: 34758
Received: (qmail 56770 invoked by alias); 1 Oct 2019 04:53:40 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>,
	<http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Delivered-To: mailing list gdb-patches@sourceware.org
Received: (qmail 24013 invoked by uid 89); 1 Oct 2019 04:52:28 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-24.2 required=5.0 tests=AWL, BAYES_00,
	GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3,
	RCVD_IN_DNSWL_NONE,
	SPF_PASS autolearn=ham version=3.3.1 spammy=seemingly,
	complaint
X-HELO: mail-pf1-f201.google.com
Received: from mail-pf1-f201.google.com (HELO mail-pf1-f201.google.com)
	(209.85.210.201) by sourceware.org
	(qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;
	Tue, 01 Oct 2019 04:52:24 +0000
Received: by mail-pf1-f201.google.com with SMTP id b13so9455778pfp.6 for
	<gdb-patches@sourceware.org>; Mon, 30 Sep 2019 21:52:24 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com;
	s=20161025; h=date:message-id:mime-version:subject:from:to:cc;
	bh=ic0mB1J2PNNQDlh22anMVy42lWgWP86WX+rd4ZYpa9s=;
	b=jaPox7HnqwHdQblV894ZrH55ZE1m5YXg5FJwD+1VeblW01FfVxPcX1G2Ifkfo5yrH7
	W06b6vg63Jjm/4XbFCPg2h52/v+Rv/oBI6eZ67FyIecoBd1oa54t98RcgTCqZbLa8KFP
	HtqOU10OVhmRvDWLBUb+s+RQdncLrDFEoYjkg2DEt9FapJbpLcyfVrJjPj5qWOlDuKNA
	fAb4RuoabwGYIPbqUzh5u8qeTlZuxTw0u0ApSQCUnJW1UdX8cHky4BbDy2MhVLsUe1/d
	Z8aT2/yaIVUTvsS9u4znWOlqzH3XaqVw9h3//W85WZxSdHE4vAOwnvZglzkMEhN+bxO/
	tsEw==
Date: Mon, 30 Sep 2019 21:52:19 -0700
Message-Id: <20191001045219.141577-1-tamur@google.com>
Mime-Version: 1.0
Subject: [PATCH] DWARF 5 support: Handle line table and file indexes
X-Patchwork-Original-From: "Ali Tamur via gdb-patches"
 <gdb-patches@sourceware.org>
From: "Terekhov, Mikhail via Gdb-patches" <gdb-patches@sourceware.org>
Reply-To: Ali Tamur <tamur@google.com>
To: gdb-patches@sourceware.org
Cc: Ali Tamur <tamur@google.com>
X-IsSubscribed: yes

*  Fix handling of file and directory indexes in line tables; in DWARF 5 the
indexes are zero-based. Make file_names field private to abstract this detail
from the clients. Introduce a new file_names_size method. Reflect these changes
in clients.
*  Pass whether the file index is zero or one based to a few methods.
*  Handle DW_FORM_data16 in read_formatted_entries; it is used to record MD5
of the file entries in DWARF 5.
*  Fix a bug in line header parsing that calculates the length of the header
incorrectly. (Seemingly this manifests itself only in DWARF 5).
*  Introduce a new method, is_valid_file_index that takes into account whether
it is DWARF 5. Use it in related clients.

Tested with CC=/usr/bin/gcc (version 8.3.0) against master branch (also with
-gsplit-dwarf and -gdwarf-4 flags) and there was no increase in the set of
tests that fails. (gdb still cannot debug a 'hello world' program with DWARF 5,
so for the time being, this is all we care about).

This is part of an effort to support DWARF 5 in gdb.
---
 gdb/dwarf2read.c | 161 +++++++++++++++++++++++++++--------------------
 1 file changed, 94 insertions(+), 67 deletions(-)

diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
index 53e7393a7c..c474bdcd8b 100644
--- a/gdb/dwarf2read.c
+++ b/gdb/dwarf2read.c
@@ -922,13 +922,13 @@ typedef void (die_reader_func_ftype) (const struct die_reader_specs *reader,
 				      int has_children,
 				      void *data);
 
-/* A 1-based directory index.  This is a strong typedef to prevent
-   accidentally using a directory index as a 0-based index into an
-   array/vector.  */
-enum class dir_index : unsigned int {};
+/* dir_index is 1-based in DWARF 4 and before, and is 0-based in DWARF 5 and
+   later.  */
+typedef int dir_index;
 
-/* Likewise, a 1-based file name index.  */
-enum class file_name_index : unsigned int {};
+/* file_name_index is 1-based in DWARF 4 and before, and is 0-based in DWARF 5
+   and later.  */
+typedef int file_name_index;
 
 struct file_entry
 {
@@ -980,26 +980,30 @@ struct line_header
   void add_file_name (const char *name, dir_index d_index,
 		      unsigned int mod_time, unsigned int length);
 
-  /* Return the include dir at INDEX (1-based).  Returns NULL if INDEX
-     is out of bounds.  */
+  /* Return the include dir at INDEX (0-based in DWARF 5 and 1-based before).
+     Returns NULL if INDEX is out of bounds.  */
   const char *include_dir_at (dir_index index) const
   {
-    /* Convert directory index number (1-based) to vector index
-       (0-based).  */
-    size_t vec_index = to_underlying (index) - 1;
+    size_t vec_index;
+    if (version <= 4)
+      vec_index = index - 1;
+    else
+      vec_index = index;
 
     if (vec_index >= include_dirs.size ())
       return NULL;
     return include_dirs[vec_index];
   }
 
-  /* Return the file name at INDEX (1-based).  Returns NULL if INDEX
-     is out of bounds.  */
-  file_entry *file_name_at (file_name_index index)
+  /* Return the file name at INDEX (0-based in DWARF 5 and 1-based before).
+     Returns NULL if INDEX is out of bounds.  */
+  file_entry *file_name_at (file_name_index index, bool is_zero_indexed)
   {
-    /* Convert file name index number (1-based) to vector index
-       (0-based).  */
-    size_t vec_index = to_underlying (index) - 1;
+    size_t vec_index;
+    if (is_zero_indexed || version >= 5)
+        vec_index = index;
+    else
+      vec_index = index - 1;
 
     if (vec_index >= file_names.size ())
       return NULL;
@@ -1032,12 +1036,20 @@ struct line_header
      pointers.  The memory is owned by debug_line_buffer.  */
   std::vector<const char *> include_dirs;
 
-  /* The file_names table.  */
-  std::vector<file_entry> file_names;
+  int file_names_size () {
+    return file_names.size();
+  }
 
   /* The start and end of the statement program following this
      header.  These point into dwarf2_per_objfile->line_buffer.  */
   const gdb_byte *statement_program_start {}, *statement_program_end {};
+
+ private:
+  /* The file_names table. This is private because the meaning of indexes differ
+     among DWARF versions (The first valid index is 1 in DWARF 4 and before,
+     and is 0 in DWARF 5 and later). So the client should use file_name_at
+     method for access.  */
+  std::vector<file_entry> file_names;
 };
 
 typedef std::unique_ptr<line_header> line_header_up;
@@ -1962,7 +1974,7 @@ static file_and_directory find_file_and_directory (struct die_info *die,
 						   struct dwarf2_cu *cu);
 
 static char *file_full_name (int file, struct line_header *lh,
-			     const char *comp_dir);
+			     const char *comp_dir, bool is_zero_indexed);
 
 /* Expected enum dwarf_unit_type for read_comp_unit_head.  */
 enum class rcuh_kind { COMPILE, TYPE };
@@ -3638,7 +3650,7 @@ dw2_get_file_names_reader (const struct die_reader_specs *reader,
   struct objfile *objfile = dwarf2_per_objfile->objfile;
   struct dwarf2_per_cu_data *lh_cu;
   struct attribute *attr;
-  int i;
+  int i = 0;
   void **slot;
   struct quick_file_names *qfn;
 
@@ -3697,13 +3709,13 @@ dw2_get_file_names_reader (const struct die_reader_specs *reader,
   if (strcmp (fnd.name, "<unknown>") != 0)
     ++offset;
 
-  qfn->num_file_names = offset + lh->file_names.size ();
+  qfn->num_file_names = offset + lh->file_names_size ();
   qfn->file_names =
     XOBNEWVEC (&objfile->objfile_obstack, const char *, qfn->num_file_names);
   if (offset != 0)
     qfn->file_names[0] = xstrdup (fnd.name);
-  for (i = 0; i < lh->file_names.size (); ++i)
-    qfn->file_names[i + offset] = file_full_name (i + 1, lh.get (), fnd.comp_dir);
+  for (i = 0; i < lh->file_names_size (); ++i)
+    qfn->file_names[i + offset] = file_full_name (i + 1, lh.get (), fnd.comp_dir, true);
   qfn->real_names = NULL;
 
   lh_cu->v.quick->file_names = qfn;
@@ -11696,16 +11708,16 @@ dwarf2_cu::setup_type_unit_groups (struct die_info *die)
 	 process_full_type_unit still needs to know if this is the first
 	 time.  */
 
-      tu_group->num_symtabs = line_header->file_names.size ();
+      tu_group->num_symtabs = line_header->file_names_size ();
       tu_group->symtabs = XNEWVEC (struct symtab *,
-				   line_header->file_names.size ());
+				   line_header->file_names_size ());
 
-      for (i = 0; i < line_header->file_names.size (); ++i)
+      for (i = 0; i < line_header->file_names_size (); ++i)
 	{
-	  file_entry &fe = line_header->file_names[i];
+	  file_entry *fe = line_header->file_name_at (i, true);
 
-	  dwarf2_start_subfile (this, fe.name,
-				fe.include_dir (line_header));
+	  dwarf2_start_subfile (this, fe->name,
+				fe->include_dir (line_header));
 	  buildsym_compunit *b = get_builder ();
 	  if (b->get_current_subfile ()->symtab == NULL)
 	    {
@@ -11718,8 +11730,8 @@ dwarf2_cu::setup_type_unit_groups (struct die_info *die)
 		= allocate_symtab (cust, b->get_current_subfile ()->name);
 	    }
 
-	  fe.symtab = b->get_current_subfile ()->symtab;
-	  tu_group->symtabs[i] = fe.symtab;
+	  fe->symtab = b->get_current_subfile ()->symtab;
+	  tu_group->symtabs[i] = fe->symtab;
 	}
     }
   else
@@ -11732,11 +11744,10 @@ dwarf2_cu::setup_type_unit_groups (struct die_info *die)
 			compunit_language (cust),
 			0, cust));
 
-      for (i = 0; i < line_header->file_names.size (); ++i)
+      for (i = 0; i < line_header->file_names_size (); ++i)
 	{
-	  file_entry &fe = line_header->file_names[i];
-
-	  fe.symtab = tu_group->symtabs[i];
+	  file_entry *fe = line_header->file_name_at (i, true);
+	  fe->symtab = tu_group->symtabs[i];
 	}
     }
 
@@ -16209,7 +16220,7 @@ process_structure_scope (struct die_info *die, struct dwarf2_cu *cu)
 	    {
 	      /* Any related symtab will do.  */
 	      symtab
-		= cu->line_header->file_name_at (file_name_index (1))->symtab;
+		= cu->line_header->file_name_at (0, true)->symtab;
 	    }
 	  else
 	    {
@@ -20277,7 +20288,7 @@ line_header::add_file_name (const char *name,
 {
   if (dwarf_line_debug >= 2)
     fprintf_unfiltered (gdb_stdlog, "Adding file %u: %s\n",
-			(unsigned) file_names.size () + 1, name);
+			(unsigned) file_names_size () + 1, name);
 
   file_names.emplace_back (name, d_index, mod_time, length);
 }
@@ -20393,6 +20404,11 @@ read_formatted_entries (struct dwarf2_per_objfile *dwarf2_per_objfile,
 	      buf += 8;
 	      break;
 
+	    case DW_FORM_data16:
+	      /*  This is used for MD5, but file_entry does not record MD5s. */
+	      buf += 16;
+	      break;
+
 	    case DW_FORM_udata:
 	      uint.emplace (read_unsigned_leb128 (abfd, buf, &bytes_read));
 	      buf += bytes_read;
@@ -20493,12 +20509,15 @@ dwarf_decode_line_header (sect_offset sect_off, struct dwarf2_cu *cu)
     read_checked_initial_length_and_offset (abfd, line_ptr, &cu->header,
 					    &bytes_read, &offset_size);
   line_ptr += bytes_read;
+
+  const gdb_byte *start_here = line_ptr;
+
   if (line_ptr + lh->total_length > (section->buffer + section->size))
     {
       dwarf2_statement_list_fits_in_line_number_section_complaint ();
       return 0;
     }
-  lh->statement_program_end = line_ptr + lh->total_length;
+  lh->statement_program_end = start_here + lh->total_length;
   lh->version = read_2_bytes (abfd, line_ptr);
   line_ptr += 2;
   if (lh->version > 5)
@@ -20528,6 +20547,7 @@ dwarf_decode_line_header (sect_offset sect_off, struct dwarf2_cu *cu)
     }
   lh->header_length = read_offset_1 (abfd, line_ptr, offset_size);
   line_ptr += offset_size;
+  lh->statement_program_start = line_ptr + lh->header_length;
   lh->minimum_instruction_length = read_1_byte (abfd, line_ptr);
   line_ptr += 1;
   if (lh->version >= 4)
@@ -20612,7 +20632,6 @@ dwarf_decode_line_header (sect_offset sect_off, struct dwarf2_cu *cu)
 	}
       line_ptr += bytes_read;
     }
-  lh->statement_program_start = line_ptr;
 
   if (line_ptr > (section->buffer + section->size))
     complaint (_("line number info header doesn't "
@@ -20629,18 +20648,18 @@ dwarf_decode_line_header (sect_offset sect_off, struct dwarf2_cu *cu)
    Returns NULL if FILE_INDEX should be ignored, i.e., it is pst->filename.  */
 
 static const char *
-psymtab_include_file_name (const struct line_header *lh, int file_index,
+psymtab_include_file_name (struct line_header *lh, int file_index,
 			   const struct partial_symtab *pst,
 			   const char *comp_dir,
 			   gdb::unique_xmalloc_ptr<char> *name_holder)
 {
-  const file_entry &fe = lh->file_names[file_index];
-  const char *include_name = fe.name;
+  file_entry *fe = lh->file_name_at (file_index, true);
+  const char *include_name = fe->name;
   const char *include_name_to_compare = include_name;
   const char *pst_filename;
   int file_is_pst;
 
-  const char *dir_name = fe.include_dir (lh);
+  const char *dir_name = fe->include_dir (lh);
 
   gdb::unique_xmalloc_ptr<char> hold_compare;
   if (!IS_ABSOLUTE_PATH (include_name)
@@ -20712,7 +20731,7 @@ public:
   {
     /* lh->file_names is 0-based, but the file name numbers in the
        statement program are 1-based.  */
-    return m_line_header->file_name_at (m_file);
+    return m_line_header->file_name_at (m_file, false);
   }
 
   /* Record the line in the state machine.  END_SEQUENCE is true if
@@ -20809,8 +20828,8 @@ private:
      and initialized according to the DWARF spec.  */
 
   unsigned char m_op_index = 0;
-  /* The line table index (1-based) of the current file.  */
-  file_name_index m_file = (file_name_index) 1;
+  /* The line table index of the current file.  */
+  file_name_index m_file = 1;
   unsigned int m_line = 1;
 
   /* These are initialized in the constructor.  */
@@ -21002,7 +21021,7 @@ lnp_state_machine::record_line (bool end_sequence)
       fprintf_unfiltered (gdb_stdlog,
 			  "Processing actual line %u: file %u,"
 			  " address %s, is_stmt %u, discrim %u\n",
-			  m_line, to_underlying (m_file),
+			  m_line, m_file,
 			  paddress (m_gdbarch, m_address),
 			  m_is_stmt, m_discriminator);
     }
@@ -21345,8 +21364,8 @@ dwarf_decode_lines (struct line_header *lh, const char *comp_dir,
 
       /* Now that we're done scanning the Line Header Program, we can
          create the psymtab of each included file.  */
-      for (file_index = 0; file_index < lh->file_names.size (); file_index++)
-        if (lh->file_names[file_index].included_p == 1)
+      for (file_index = 0; file_index < lh->file_names_size (); file_index++)
+        if (lh->file_name_at (file_index, true)->included_p == 1)
           {
 	    gdb::unique_xmalloc_ptr<char> name_holder;
 	    const char *include_name =
@@ -21365,11 +21384,11 @@ dwarf_decode_lines (struct line_header *lh, const char *comp_dir,
       struct compunit_symtab *cust = builder->get_compunit_symtab ();
       int i;
 
-      for (i = 0; i < lh->file_names.size (); i++)
+      for (i = 0; i < lh->file_names_size (); i++)
 	{
-	  file_entry &fe = lh->file_names[i];
+	  file_entry *fe = lh->file_name_at (i, true);
 
-	  dwarf2_start_subfile (cu, fe.name, fe.include_dir (lh));
+	  dwarf2_start_subfile (cu, fe->name, fe->include_dir (lh));
 
 	  if (builder->get_current_subfile ()->symtab == NULL)
 	    {
@@ -21377,7 +21396,7 @@ dwarf_decode_lines (struct line_header *lh, const char *comp_dir,
 		= allocate_symtab (cust,
 				   builder->get_current_subfile ()->name);
 	    }
-	  fe.symtab = builder->get_current_subfile ()->symtab;
+	  fe->symtab = builder->get_current_subfile ()->symtab;
 	}
     }
 }
@@ -21596,7 +21615,7 @@ new_symbol (struct die_info *die, struct type *type, struct dwarf2_cu *cu,
 	  struct file_entry *fe;
 
 	  if (cu->line_header != NULL)
-	    fe = cu->line_header->file_name_at (file_index);
+	    fe = cu->line_header->file_name_at (file_index, false);
 	  else
 	    fe = NULL;
 
@@ -24154,22 +24173,29 @@ dwarf_alloc_die (struct dwarf2_cu *cu, int num_attrs)
    *LH's file name table.  The result is allocated using xmalloc; the caller is
    responsible for freeing it.  */
 
+bool is_valid_file_index (int file_index, struct line_header *lh,
+			  bool is_zero_indexed) {
+  if (is_zero_indexed || lh->version >= 5)
+    return 0 <= file_index && file_index < lh->file_names_size ();
+  return 1 <= file_index && file_index <= lh->file_names_size ();
+}
+
 static char *
-file_file_name (int file, struct line_header *lh)
+file_file_name (int file, struct line_header *lh, bool is_zero_indexed)
 {
   /* Is the file number a valid index into the line header's file name
      table?  Remember that file numbers start with one, not zero.  */
-  if (1 <= file && file <= lh->file_names.size ())
+  if (is_valid_file_index (file, lh, is_zero_indexed))
     {
-      const file_entry &fe = lh->file_names[file - 1];
+      const file_entry *fe = lh->file_name_at (file, is_zero_indexed);
 
-      if (!IS_ABSOLUTE_PATH (fe.name))
+      if (!IS_ABSOLUTE_PATH (fe->name))
 	{
-	  const char *dir = fe.include_dir (lh);
+	  const char *dir = fe->include_dir (lh);
 	  if (dir != NULL)
-	    return concat (dir, SLASH_STRING, fe.name, (char *) NULL);
+	    return concat (dir, SLASH_STRING, fe->name, (char *) NULL);
 	}
-      return xstrdup (fe.name);
+      return xstrdup (fe->name);
     }
   else
     {
@@ -24193,13 +24219,14 @@ file_file_name (int file, struct line_header *lh)
    compilation.  The result is allocated using xmalloc; the caller is
    responsible for freeing it.  */
 static char *
-file_full_name (int file, struct line_header *lh, const char *comp_dir)
+file_full_name (int file, struct line_header *lh, const char *comp_dir,
+		bool is_zero_indexed)
 {
   /* Is the file number a valid index into the line header's file name
      table?  Remember that file numbers start with one, not zero.  */
-  if (1 <= file && file <= lh->file_names.size ())
+  if (is_valid_file_index (file, lh, is_zero_indexed))
     {
-      char *relative = file_file_name (file, lh);
+      char *relative = file_file_name (file, lh, is_zero_indexed);
 
       if (IS_ABSOLUTE_PATH (relative) || comp_dir == NULL)
 	return relative;
@@ -24207,7 +24234,7 @@ file_full_name (int file, struct line_header *lh, const char *comp_dir)
 		       relative, (char *) NULL);
     }
   else
-    return file_file_name (file, lh);
+    return file_file_name (file, lh, is_zero_indexed);
 }
 
 
@@ -24218,7 +24245,7 @@ macro_start_file (struct dwarf2_cu *cu,
                   struct line_header *lh)
 {
   /* File name relative to the compilation directory of this source file.  */
-  char *file_name = file_file_name (file, lh);
+  char *file_name = file_file_name (file, lh, false);
 
   if (! current_file)
     {