[review,RFC] Support $ORIGIN in DW_AT_comp_dir

Message ID gerrit.1572295041000.I5cd12d8c37c19fd2e9229b474d37f72185e41178@gnutoolchain-gerrit.osci.io
State New, archived
Headers

Commit Message

Simon Marchi (Code Review) Oct. 28, 2019, 8:37 p.m. UTC
  Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/402
......................................................................

[RFC] Support $ORIGIN in DW_AT_comp_dir

[Note that the test doesn't currently work, because build_executable_from_specs
doesn't like relative src names for some reason. However, to be a good
test it really needs a relative name due to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91888]

This implements this DWARF feature request:
http://dwarfstd.org/ShowIssue.php?issue=171130.2

With reproducible builds and with build output caching, it is
desirable to have the build output be independent of the
build environment. GCC's -fdebug-prefix-map allows setting
the comp dir to a different value. But the obvious choice,
i.e. ".", has the downside that GCC will not find DWOs/
source code when $PWD is outside of the build output directory.

So, this patch allows specifying something like
-fdebug-prefix-map=/output/dir=\$ORIGIN, and GDB will find things
correctly, even when $PWD is not the output directory.

OPEN QUESTIONS:
- where exactly can this $ORIGIN thingy appear? The proposal mentions
  DW_AT_comp_dir, but I expect you'll need that elsewhere too... Given that in
  dwarf>=5 the line tables are independent of the rest of the debug info,
  you'll need at least to include the directory 0 of the line table. But what
  about other line table directories? And other attributes? If a CU has a
  regular DW_AT_comp_dir, but then a specific DIE has DW_AT_decl_file
  $ORIGIN/foo.cc, what does that mean?
- what exactly does $ORIGIN refer to? If I e.g. split the debug info into a
  separate debug file (objcopy --only-keep-debug), which file is it relative
  to?
- should there be a way to suppress/escape this handling to cover the
  pathological case of a literal "$ORIGIN" directory? $$ORIGIN ? ./$ORIGIN ?
- dwarf2 used to recommend (not require) that the debug info paths are of the
  form "hostname:path". This didn't turn out to be useful so this wording was
  scrapped from later standard revisions. However, lldb (I don't know about
  gdb), still has some code to handle (= ignore) the hostname prefix. I myself
  am not convinced that this is a good idea, but it has occurred to me whether
  we shouldn't revive this syntax for the $ORIGIN thingy.

Change-Id: I5cd12d8c37c19fd2e9229b474d37f72185e41178
---
M gdb/dwarf2read.c
A gdb/testsuite/gdb.dwarf2/fission-origin-comp-dir.c
A gdb/testsuite/gdb.dwarf2/fission-origin-comp-dir.exp
3 files changed, 92 insertions(+), 5 deletions(-)
  

Comments

Simon Marchi (Code Review) Oct. 28, 2019, 10:03 p.m. UTC | #1
matt rice has posted comments on this change.

Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/402
......................................................................


Patch Set 1:

regarding your open questions, it seems according to 
https://www.sco.com/developers/gabi/latest/ch5.dynamic.html#substitution

valid names (like 'ORIGIN') match as:

name := [_a-zA-Z0-9]*

quoting:
If a dollar sign is not immediately followed by a name or a brace-enclosed name, the behavior of the dynamic linker is unspecified. 
---

so $$ORIGIN is unspecified,

quoting:
Moreover, the pathname will contain no symbolic links or use of . or .. components.
---

however I believe this is refering to the path _after_ the substitution sequence rather than before.


If the '/' occurs before the substitution sequence, I believe that the following may(?) apply, in which case ./$ORIGIN is treated as a directory named literally $ORIGIN in the current directory?

https://www.sco.com/developers/gabi/latest/ch5.dynamic.html#shobj_dependencies

quoting:
If a shared object name has one or more slash (/) characters anywhere in the name, such as /usr/lib/lib2 or directory/file, the dynamic linker uses that string directly as the path name. If the name has no slashes, such as lib1, three facilities specify shared object path 
---

Note the enclosed link includes another valid name not matched by the patch, in particular:
${ORIGIN}
  
Simon Marchi (Code Review) Jan. 15, 2020, 7:50 p.m. UTC | #2
Christian Biesinger has abandoned this change. ( https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/402 )

Change subject: [RFC] Support $ORIGIN in DW_AT_comp_dir
......................................................................


Abandoned

Closing this, will send a new version directly to gdb-patches
  

Patch

diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
index 4372a47..c6cff7d 100644
--- a/gdb/dwarf2read.c
+++ b/gdb/dwarf2read.c
@@ -75,6 +75,10 @@ 
 #include <algorithm>
 #include <unordered_map>
 #include "gdbsupport/selftest.h"
+#include <cmath>
+#include <set>
+#include <forward_list>
+#include <regex>
 #include "rust-lang.h"
 #include "gdbsupport/pathstuff.h"
 
@@ -1784,6 +1788,8 @@ 
 
 static const char *dwarf_type_encoding_name (unsigned int);
 
+static const char *dwarf2_comp_dir (struct die_info *die, struct dwarf2_cu *cu);
+
 static struct die_info *sibling_die (struct die_info *);
 
 static void dump_die_shallow (struct ui_file *, int indent, struct die_info *);
@@ -7234,6 +7240,7 @@ 
   high_pc = NULL;
   ranges = NULL;
   comp_dir = NULL;
+  const char *comp_dir_str = NULL;
 
   if (stub_comp_unit_die != NULL)
     {
@@ -7244,7 +7251,7 @@ 
       low_pc = dwarf2_attr (stub_comp_unit_die, DW_AT_low_pc, cu);
       high_pc = dwarf2_attr (stub_comp_unit_die, DW_AT_high_pc, cu);
       ranges = dwarf2_attr (stub_comp_unit_die, DW_AT_ranges, cu);
-      comp_dir = dwarf2_attr (stub_comp_unit_die, DW_AT_comp_dir, cu);
+      comp_dir_str = dwarf2_comp_dir (stub_comp_unit_die, cu);
 
       /* There should be a DW_AT_addr_base attribute here (if needed).
 	 We need the value before we can process DW_FORM_GNU_addr_index
@@ -7262,13 +7269,15 @@ 
 	cu->ranges_base = DW_UNSND (attr);
     }
   else if (stub_comp_dir != NULL)
+    comp_dir_str = stub_comp_dir;
+  if (comp_dir_str != nullptr)
     {
       /* Reconstruct the comp_dir attribute to simplify the code below.  */
       comp_dir = XOBNEW (&cu->comp_unit_obstack, struct attribute);
       comp_dir->name = DW_AT_comp_dir;
       comp_dir->form = DW_FORM_string;
       DW_STRING_IS_CANONICAL (comp_dir) = 0;
-      DW_STRING (comp_dir) = stub_comp_dir;
+      DW_STRING (comp_dir) = comp_dir_str;
     }
 
   /* Set up for reading the DWO CU/TU.  */
@@ -7411,7 +7420,7 @@ 
 
   /* Yeah, we look dwo_name up again, but it simplifies the code.  */
   dwo_name = dwarf2_dwo_name (comp_unit_die, cu);
-  comp_dir = dwarf2_string_attr (comp_unit_die, DW_AT_comp_dir, cu);
+  comp_dir = dwarf2_comp_dir (comp_unit_die, cu);
 
   if (this_cu->is_debug_types)
     {
@@ -8047,7 +8056,7 @@ 
   pst = create_partial_symtab (per_cu, filename);
 
   /* This must be done before calling dwarf2_build_include_psymtabs.  */
-  pst->dirname = dwarf2_string_attr (comp_unit_die, DW_AT_comp_dir, cu);
+  pst->dirname = dwarf2_comp_dir (comp_unit_die, cu);
 
   baseaddr = ANOFFSET (objfile->section_offsets, SECT_OFF_TEXT (objfile));
 
@@ -11447,7 +11456,7 @@ 
   /* Find the filename.  Do not use dwarf2_name here, since the filename
      is not a source language identifier.  */
   res.name = dwarf2_string_attr (die, DW_AT_name, cu);
-  res.comp_dir = dwarf2_string_attr (die, DW_AT_comp_dir, cu);
+  res.comp_dir = dwarf2_comp_dir (die, cu);
 
   if (res.comp_dir == NULL
       && producer_is_gcc_lt_4_3 (cu) && res.name != NULL
@@ -20219,6 +20228,31 @@ 
   return str;
 }
 
+/* Returns the DW_AT_comp_dir attribute, with $ORIGIN replaced as necessary.
+   Returns the empty string if there is no DW_AT_comp_dir attribute.  */
+static const char *
+dwarf2_comp_dir (struct die_info *die, struct dwarf2_cu *cu)
+{
+  static const char origin_str[] = "$ORIGIN";
+  static constexpr size_t origin_str_len = sizeof(origin_str) - 1;
+
+  const char* comp_dir = dwarf2_string_attr (die, DW_AT_comp_dir, cu);
+  if (comp_dir == nullptr)
+    return nullptr;
+
+  if (!startswith (comp_dir, origin_str))
+    return comp_dir;
+
+  struct objfile *objfile = cu->per_cu->dwarf2_per_objfile->objfile;
+  std::string comp_dir_str = comp_dir;
+  std::string replaced
+    = ldirname (objfile_name (objfile)) + comp_dir_str.substr (origin_str_len);
+  char *alloc = (char *) obstack_alloc (&objfile->objfile_obstack,
+					replaced.length () + 1);
+  memcpy (alloc, replaced.c_str (), replaced.length () + 1);
+  return alloc;
+}
+
 /* Return the dwo name or NULL if not present. If present, it is in either
    DW_AT_GNU_dwo_name or DW_AT_dwo_name attribute.  */
 static const char *
diff --git a/gdb/testsuite/gdb.dwarf2/fission-origin-comp-dir.c b/gdb/testsuite/gdb.dwarf2/fission-origin-comp-dir.c
new file mode 100644
index 0000000..f5f2501
--- /dev/null
+++ b/gdb/testsuite/gdb.dwarf2/fission-origin-comp-dir.c
@@ -0,0 +1,11 @@ 
+static int func2(void) {
+  return 42;
+}
+
+int
+main (void)
+{
+  int val = func2();
+  return val;
+}
+
diff --git a/gdb/testsuite/gdb.dwarf2/fission-origin-comp-dir.exp b/gdb/testsuite/gdb.dwarf2/fission-origin-comp-dir.exp
new file mode 100644
index 0000000..8a996d0
--- /dev/null
+++ b/gdb/testsuite/gdb.dwarf2/fission-origin-comp-dir.exp
@@ -0,0 +1,42 @@ 
+# Copyright 2013-2019 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+load_lib dwarf.exp
+
+# This test can only be run on targets which support DWARF-2.
+if ![dwarf2_support] {
+    return 0
+}
+
+standard_testfile
+
+set relative_srcfile [relative_filename [pwd] $srcdir/fission-origin-comp-dir.c]
+print $relative_srcfile
+
+set pwd [pwd]
+
+if { [prepare_for_testing "failed to prepare" $testfile $relative_srcfile \
+	  [list additional_flags=-gsplit-dwarf additional_flags=-fdebug-prefix-map=$pwd=\\\$ORIGIN]] } {
+    return -1
+}
+
+clean_restart $binfile
+gdb_reinitialize_dir "/NoNeXiStInG"
+
+runto_main
+
+gdb_test_no_output "next"
+gdb_test "p val" "42"
+gdb_test "break func2" "Breakpoint .*"