[PATCHv2,1/2] gdb: avoid '//' in filenames when searching for debuginfo

Message ID aea2feac676382ba38d6a84681a9211b10623175.1721120333.git.aburgess@redhat.com
State New
Headers
Series find debug by debuglink minor fixes |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm success Test passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Test passed

Commit Message

Andrew Burgess July 16, 2024, 9 a.m. UTC
  I spotted that the gdb.base/sysroot-debug-lookup.exp test that I added
recently actually had a KPASS when run with the
native-extended-gdbserver board.  This was an oversight when adding
the test.

The failures in this test, when using the 'unix' board, are logged as
bug PR gdb/31804.  The problem appears to be caused by the use of the
child_path function in find_separate_debug_file.

What happens on the 'unix' board is that the file is specified to GDB
with a target: prefix, however GDB spots that the target filesystem is
local to GDB and so opens the file without a target: prefix.  When we
call into find_separate_debug_file the DIR and CANON_DIR arguments,
which are computed from the objfile_name() no longer have a target:
prefix.

However, in this test if the file was opened with a target: prefix,
then the sysroot also has a target: prefix.  When child_path is called
it looks for a common prefix between CANON_DIR (from the objfile_name)
and the sysroot.  However, the sysroot still has the target: prefix,
which means the child_path() call fails and returns nullptr.

What happens in the native-extended-gdbserver case is that GDB doesn't
see the target filesystem as local.  Now the filename retains the
target: prefix, which means that in the child_path() call both the
sysroot and the CANON_DIR have a target: prefix, and so the
child_path() call succeeds.  This allows GDB to progress, try some
additional paths, and then find the debug information.

So, this commit changes gdb.base/sysroot-debug-lookup.exp to expect
the test to succeed when using the native-extended-gdbserver protocol.

This leaves one KFAIL when using the native-extended-gdbserver board,
we find the debug information but (apparently) find it in the wrong
file.  What's happening is that when GDB builds the filename for the
debug information we end up with a '//' string as a directory
separator, the test regexp only expects a single separator.

Instead of just fixing the test regexp, I've updated the path_join
function in gdbsupport/pathstuff.{cc,h} to allow for absolute paths to
appear in the argument list after the first argument.  This means it's
now possible to do this:

  auto result = path_join ("/a/b/c", "/d/e/f");
  gdb_assert (result == "/a/b/c/d/e/f");

Additionally I've changed path_join so that it avoids adding
unnecessary directory separators.  In the above case when the two
paths were joined GDB only added a single separator between 'c' and
'd'.  But additionally, if we did this:

  auto result = path_join ("/a/b/c/", "/d/e/f");
  gdb_assert (result == "/a/b/c/d/e/f");

We'd still only get a single separator.

With these changes to path_join I can now make use of this function in
find_separate_debug_file.  With this done I now have no KFAIL when
using the native-extended-gdbserver board.

After this commit we still have 2 KFAIL when not using the
native-gdbserver and unix boards, these will be addressed in the next
commit.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31804
---
 gdb/symfile.c                                 | 41 ++++++-------------
 .../gdb.base/sysroot-debug-lookup.exp         |  6 ++-
 gdbsupport/pathstuff.cc                       | 16 +++++---
 gdbsupport/pathstuff.h                        |  6 ++-
 4 files changed, 32 insertions(+), 37 deletions(-)
  

Comments

Keith Seitz Aug. 9, 2024, 7:45 p.m. UTC | #1
On 7/16/24 2:00 AM, Andrew Burgess wrote:
> 
> Instead of just fixing the test regexp, I've updated the path_join
> function in gdbsupport/pathstuff.{cc,h} to allow for absolute paths to
> appear in the argument list after the first argument.  This means it's
> now possible to do this:
> 
>    auto result = path_join ("/a/b/c", "/d/e/f");
>    gdb_assert (result == "/a/b/c/d/e/f");
> 
> Additionally I've changed path_join so that it avoids adding
> unnecessary directory separators.  In the above case when the two
> paths were joined GDB only added a single separator between 'c' and
> 'd'.  But additionally, if we did this:
> 
>    auto result = path_join ("/a/b/c/", "/d/e/f");
>    gdb_assert (result == "/a/b/c/d/e/f");
> 
> We'd still only get a single separator.

This is a really nice cleanup. I've never been a fan of constructing
filenames/paths using string concatenation. It has always caused
problems for me.

I've spent a lot of time setting up Msys2 on my personal laptop to test
this, and as you and Eli surmise, there are still issues with drive
specifications.

The good news, though, is that there is some forward progress here,
and given the previous discussion, I think this is good to go as-is.

So, feel free to add

Reviewed-by: Keith Seitz <keiths@redhat.com>


Now the not-so-good. I just want to follow-up on the discussion
in

   https://inbox.sourceware.org/gdb-patches/87h6du6x8q.fsf@redhat.com/

with some data.

In my build, I've configured with
--with-separate-debug-dir=/usr/lib/debug and created c:/sysroot.
Therein lies usr/bin/mytest.exe and usr/lib/debug/usr/bin/mytest.exe.debug.

Following the procedure you outlined in the above link and filtering for
the "Trying" bits that outputs, we get:

$ ./gdb.original -iex "set debug separate-debug-file on" -iex "set 
sysroot c:/sysroot" -iex "set separate-debug-file-directory 
c:/sysroot/usr/lib" c:/sysroot/usr/bin/mytest -batch -ex "file 
c:/sysroot/usr/bin/mytest"|& grep Trying|awk '{print $4}'
c:\sysroot\usr\bin\mytest.exe.debug...
c:\sysroot\usr\bin\.debug/mytest.exe.debug...
C:/usr/lib/debug/c\sysroot\usr\bin\mytest.exe.debug...
C:/usr/lib/debug/usr/bin/mytest.exe.debug...
c:/sysrootC:/usr/lib/debug/usr/bin/mytest.exe.debug...
c:\sysroot\usr\bin\mytest.exe.debug...
c:\sysroot\usr\bin\.debug/mytest.exe.debug...
C:/usr/lib/debug/c\sysroot\usr\bin\mytest.exe.debug...
C:/usr/lib/debug/usr/bin/mytest.exe.debug...
c:/sysrootC:/usr/lib/debug/usr/bin/mytest.exe.debug...


Most of this is well-formed, according to the User Manual ("set
sysroot" discusses this).

However, you've already noticed the boo-boo:
c:/sysrootC:/usr/lib/debug/usr/bin/mytest.exe.debug (repeated)

That is clearly incorrect.

With this (#1 of 2) patch applied, things are a little better. The
erroneous paths become
"c:/sysroot/C:/usr/lib/debug/usr/bin/mytest.exe.debug".

With your small follow-on patch applied, things are different
but still incorrect:
"c:/sysroot/C/C:/usr/lib/debug/usr/bin/mytest.exe.debug"

So there's a bit more tweaking to be done there for Windows.

If you need anyone to test anything related to this, I can help out.

Keith
  
Eli Zaretskii Aug. 10, 2024, 5:36 a.m. UTC | #2
> Date: Fri, 9 Aug 2024 12:45:56 -0700
> From: Keith Seitz <keiths@redhat.com>
> 
> In my build, I've configured with
> --with-separate-debug-dir=/usr/lib/debug and created c:/sysroot.
> Therein lies usr/bin/mytest.exe and usr/lib/debug/usr/bin/mytest.exe.debug.

Does such a configuration make sense on Windows?  /usr/lib/debug is
not an absolute file name there, so IMO the main use case we should
strive to support is GDB configured with

  --with-separate-debug-dir=X:/usr/lib/debug

where X is some drive letter (which should be different from C if
testing the C:/sysroot scenario).

> However, you've already noticed the boo-boo:
> c:/sysrootC:/usr/lib/debug/usr/bin/mytest.exe.debug (repeated)
> 
> That is clearly incorrect.

Yep.
  
Keith Seitz Aug. 10, 2024, 9:07 p.m. UTC | #3
On 8/9/24 10:36 PM, Eli Zaretskii wrote:
>> Date: Fri, 9 Aug 2024 12:45:56 -0700
>> From: Keith Seitz <keiths@redhat.com>
>>
>> In my build, I've configured with
>> --with-separate-debug-dir=/usr/lib/debug and created c:/sysroot.
>> Therein lies usr/bin/mytest.exe and usr/lib/debug/usr/bin/mytest.exe.debug.
> 
> Does such a configuration make sense on Windows?  /usr/lib/debug is
> not an absolute file name there, so IMO the main use case we should
> strive to support is GDB configured with
> 
>    --with-separate-debug-dir=X:/usr/lib/debug
> 
> where X is some drive letter (which should be different from C if
> testing the C:/sysroot scenario).

Hahaha. Good catch -- my fingers were on autopilot/Cygwin mode. I'll
redo this to see if it makes any difference.

Keith
  
Andrew Burgess Aug. 19, 2024, 2:08 p.m. UTC | #4
Keith Seitz <keiths@redhat.com> writes:

> On 7/16/24 2:00 AM, Andrew Burgess wrote:
>> 
>> Instead of just fixing the test regexp, I've updated the path_join
>> function in gdbsupport/pathstuff.{cc,h} to allow for absolute paths to
>> appear in the argument list after the first argument.  This means it's
>> now possible to do this:
>> 
>>    auto result = path_join ("/a/b/c", "/d/e/f");
>>    gdb_assert (result == "/a/b/c/d/e/f");
>> 
>> Additionally I've changed path_join so that it avoids adding
>> unnecessary directory separators.  In the above case when the two
>> paths were joined GDB only added a single separator between 'c' and
>> 'd'.  But additionally, if we did this:
>> 
>>    auto result = path_join ("/a/b/c/", "/d/e/f");
>>    gdb_assert (result == "/a/b/c/d/e/f");
>> 
>> We'd still only get a single separator.
>
> This is a really nice cleanup. I've never been a fan of constructing
> filenames/paths using string concatenation. It has always caused
> problems for me.
>
> I've spent a lot of time setting up Msys2 on my personal laptop to test
> this, and as you and Eli surmise, there are still issues with drive
> specifications.
>
> The good news, though, is that there is some forward progress here,
> and given the previous discussion, I think this is good to go as-is.
>
> So, feel free to add
>
> Reviewed-by: Keith Seitz <keiths@redhat.com>

Thanks Keith.

It's been a while with no further feedback on these patches.  And as the
fixes are petty localised, I've gone ahead and merged both the patches
in this series.

If anyone has any post-commit feedback then I'm happy to address it.

>
>
> Now the not-so-good. I just want to follow-up on the discussion
> in
>
>    https://inbox.sourceware.org/gdb-patches/87h6du6x8q.fsf@redhat.com/
>
> with some data.
>
> In my build, I've configured with
> --with-separate-debug-dir=/usr/lib/debug and created c:/sysroot.
> Therein lies usr/bin/mytest.exe and usr/lib/debug/usr/bin/mytest.exe.debug.
>
> Following the procedure you outlined in the above link and filtering for
> the "Trying" bits that outputs, we get:
>
> $ ./gdb.original -iex "set debug separate-debug-file on" -iex "set 
> sysroot c:/sysroot" -iex "set separate-debug-file-directory 
> c:/sysroot/usr/lib" c:/sysroot/usr/bin/mytest -batch -ex "file 
> c:/sysroot/usr/bin/mytest"|& grep Trying|awk '{print $4}'
> c:\sysroot\usr\bin\mytest.exe.debug...
> c:\sysroot\usr\bin\.debug/mytest.exe.debug...
> C:/usr/lib/debug/c\sysroot\usr\bin\mytest.exe.debug...
> C:/usr/lib/debug/usr/bin/mytest.exe.debug...
> c:/sysrootC:/usr/lib/debug/usr/bin/mytest.exe.debug...
> c:\sysroot\usr\bin\mytest.exe.debug...
> c:\sysroot\usr\bin\.debug/mytest.exe.debug...
> C:/usr/lib/debug/c\sysroot\usr\bin\mytest.exe.debug...
> C:/usr/lib/debug/usr/bin/mytest.exe.debug...
> c:/sysrootC:/usr/lib/debug/usr/bin/mytest.exe.debug...
>
>
> Most of this is well-formed, according to the User Manual ("set
> sysroot" discusses this).
>
> However, you've already noticed the boo-boo:
> c:/sysrootC:/usr/lib/debug/usr/bin/mytest.exe.debug (repeated)
>
> That is clearly incorrect.
>
> With this (#1 of 2) patch applied, things are a little better. The
> erroneous paths become
> "c:/sysroot/C:/usr/lib/debug/usr/bin/mytest.exe.debug".
>
> With your small follow-on patch applied, things are different
> but still incorrect:
> "c:/sysroot/C/C:/usr/lib/debug/usr/bin/mytest.exe.debug"

Part of my problem is that I'm not 100% clear what the "right" solution
actually is.

I suspect that we do need to keep the drive letter in the result
somewhere, so maybe:

  C:/sysroot/C/usr/lib/debug/usr/bin/mytest.exe.debug

Because we could have multiple files with the same filename, but on
different drives, e.g.:

   C:/usr/bin/foo.exe
   D:/usr/bin/foo.exe

If the debug for this is within the same sysroot then the C/D needs to
be present somewhere in order to tell these two choices apart.

Thanks,
Andrew

>
> So there's a bit more tweaking to be done there for Windows.
>
> If you need anyone to test anything related to this, I can help out.
>
> Keith
  

Patch

diff --git a/gdb/symfile.c b/gdb/symfile.c
index b0510b4b409..aeee30af50d 100644
--- a/gdb/symfile.c
+++ b/gdb/symfile.c
@@ -1370,17 +1370,13 @@  find_separate_debug_file (const char *dir,
      objfile_name (objfile));
 
   /* First try in the same directory as the original file.  */
-  std::string debugfile = dir;
-  debugfile += debuglink;
+  std::string debugfile = path_join (dir, debuglink);
 
   if (separate_debug_file_exists (debugfile, crc32, objfile, warnings))
     return debugfile;
 
   /* Then try in the subdirectory named DEBUG_SUBDIRECTORY.  */
-  debugfile = dir;
-  debugfile += DEBUG_SUBDIRECTORY;
-  debugfile += "/";
-  debugfile += debuglink;
+  debugfile = path_join (dir, DEBUG_SUBDIRECTORY, debuglink);
 
   if (separate_debug_file_exists (debugfile, crc32, objfile, warnings))
     return debugfile;
@@ -1393,6 +1389,7 @@  find_separate_debug_file (const char *dir,
   bool target_prefix = is_target_filename (dir);
   const char *dir_notarget
     = target_prefix ? dir + strlen (TARGET_SYSROOT_PREFIX) : dir;
+  const char *target_prefix_str = target_prefix ? TARGET_SYSROOT_PREFIX : "";
   std::vector<gdb::unique_xmalloc_ptr<char>> debugdir_vec
     = dirnames_to_char_ptr_vec (debug_file_directory.c_str ());
   gdb::unique_xmalloc_ptr<char> canon_sysroot
@@ -1421,12 +1418,8 @@  find_separate_debug_file (const char *dir,
 
   for (const gdb::unique_xmalloc_ptr<char> &debugdir : debugdir_vec)
     {
-      debugfile = target_prefix ? TARGET_SYSROOT_PREFIX : "";
-      debugfile += debugdir;
-      debugfile += "/";
-      debugfile += drive;
-      debugfile += dir_notarget;
-      debugfile += debuglink;
+      debugfile = path_join (target_prefix_str, debugdir.get (),
+			     drive.c_str (), dir_notarget, debuglink);
 
       if (separate_debug_file_exists (debugfile, crc32, objfile, warnings))
 	return debugfile;
@@ -1443,12 +1436,8 @@  find_separate_debug_file (const char *dir,
 	{
 	  /* If the file is in the sysroot, try using its base path in
 	     the global debugfile directory.  */
-	  debugfile = target_prefix ? TARGET_SYSROOT_PREFIX : "";
-	  debugfile += debugdir;
-	  debugfile += "/";
-	  debugfile += base_path;
-	  debugfile += "/";
-	  debugfile += debuglink;
+	  debugfile = path_join (target_prefix_str, debugdir.get (),
+				 base_path, debuglink);
 
 	  if (separate_debug_file_exists (debugfile, crc32, objfile, warnings))
 	    return debugfile;
@@ -1461,21 +1450,17 @@  find_separate_debug_file (const char *dir,
 	     same result as above.  */
 	  if (gdb_sysroot != TARGET_SYSROOT_PREFIX)
 	    {
-	      debugfile = target_prefix ? TARGET_SYSROOT_PREFIX : "";
+	      std::string root;
 	      if (is_target_filename (gdb_sysroot))
 		{
-		  std::string root
-		    = gdb_sysroot.substr (strlen (TARGET_SYSROOT_PREFIX));
+		  root = gdb_sysroot.substr (strlen (TARGET_SYSROOT_PREFIX));
 		  gdb_assert (!root.empty ());
-		  debugfile += root;
 		}
 	      else
-		debugfile += gdb_sysroot;
-	      debugfile += debugdir;
-	      debugfile += "/";
-	      debugfile += base_path;
-	      debugfile += "/";
-	      debugfile += debuglink;
+		root = gdb_sysroot;
+
+	      debugfile = path_join (target_prefix_str, root.c_str (),
+				     debugdir.get (), base_path, debuglink);
 
 	      if (separate_debug_file_exists (debugfile, crc32, objfile,
 					      warnings))
diff --git a/gdb/testsuite/gdb.base/sysroot-debug-lookup.exp b/gdb/testsuite/gdb.base/sysroot-debug-lookup.exp
index 5f17315c027..30d492e0d47 100644
--- a/gdb/testsuite/gdb.base/sysroot-debug-lookup.exp
+++ b/gdb/testsuite/gdb.base/sysroot-debug-lookup.exp
@@ -183,13 +183,15 @@  proc_with_prefix lookup_via_debuglink {} {
 	#
 	# Bug PR gdb/30866 seems to be the (or a) relevant bug for
 	# this problem.
-	if { $sysroot_prefix ne "" } {
+	if { $sysroot_prefix ne ""
+	     && [target_info gdb_protocol] ne "extended-remote" } {
 	    setup_kfail "*-*-*" 31804
 	}
 	gdb_assert { $::gdb_file_cmd_debug_info eq "debug" } \
 	    "ensure debug information was found"
 
-	if { $sysroot_prefix ne "" } {
+	if { $sysroot_prefix ne ""
+	     && [target_info gdb_protocol] ne "extended-remote"  } {
 	    setup_kfail "*-*-*" 31804
 	}
 	set re [string_to_regexp "Reading symbols from ${sysroot_prefix}$debug_symlink..."]
diff --git a/gdbsupport/pathstuff.cc b/gdbsupport/pathstuff.cc
index 9c3816ccc63..029e3c9a890 100644
--- a/gdbsupport/pathstuff.cc
+++ b/gdbsupport/pathstuff.cc
@@ -198,11 +198,17 @@  path_join (gdb::array_view<const char *> paths)
     {
       const char *path = paths[i];
 
-      if (i > 0)
-	gdb_assert (strlen (path) == 0 || !IS_ABSOLUTE_PATH (path));
-
-      if (!ret.empty () && !IS_DIR_SEPARATOR (ret.back ()))
-	  ret += '/';
+      if (!ret.empty ())
+	{
+	  /* If RET doesn't already end with a separator then add one.  */
+	  if (!IS_DIR_SEPARATOR (ret.back ()))
+	    ret += '/';
+
+	  /* Now that RET ends with a separator, ignore any at the start of
+	     PATH.  */
+	  while (IS_DIR_SEPARATOR (path[0]))
+	    ++path;
+	}
 
       ret.append (path);
     }
diff --git a/gdbsupport/pathstuff.h b/gdbsupport/pathstuff.h
index a61ce23efde..22170bbabf2 100644
--- a/gdbsupport/pathstuff.h
+++ b/gdbsupport/pathstuff.h
@@ -80,8 +80,10 @@  extern const char *child_path (const char *parent, const char *child);
 
 /* Join elements in PATHS into a single path.
 
-   The first element can be absolute or relative.  All the others must be
-   relative.  */
+   The first element can be absolute or relative.  Only a single directory
+   separator will be placed between elements of PATHS, if one element ends
+   with a directory separator, or an element starts with a directory
+   separator, then these will be collapsed into a single separator.  */
 
 extern std::string path_join (gdb::array_view<const char *> paths);