Fix lookup of separate debug file on MS-Windows

Message ID 83ef5ze84y.fsf@gnu.org
State New, archived
Headers

Commit Message

Eli Zaretskii April 18, 2019, 1:49 p.m. UTC
  If you put the separate debug file in a global debug directory, GDB on
MS-Windows will currently fail to find it.  This happens because we
obtain the directory to look up the debug file by concatenating the
debug directory name with the leading directories of the executable,
and the latter includes the drive letter on MS-Windows.  So we get an
invalid file name like

   d:/usr/lib/debug/d:/usr/bin/foo.debug

The patch below fixes that:



OK to commit to both branches (with the necessary ChangeLog entries)?

Btw, the removal of the leading slash of dir_notarget could
potentially benefit Posix systems as well.  Or maybe we should not
append the literal slash in this snippet from
find_separate_debug_file:

      debugfile = target_prefix ? "target:" : "";
      debugfile += debugdir.get ();
      debugfile += "/";  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      debugfile += dir_notarget;
      debugfile += debuglink;

since AFAICT dir_notarget will always begin with a slash (I added a
test for that because I wasn't sure that is indeed so).
  

Comments

LRN April 18, 2019, 4:19 p.m. UTC | #1
On 18.04.2019 16:49, Eli Zaretskii wrote:
> If you put the separate debug file in a global debug directory, GDB on
> MS-Windows will currently fail to find it.  This happens because we
> obtain the directory to look up the debug file by concatenating the
> debug directory name with the leading directories of the executable,
> and the latter includes the drive letter on MS-Windows.  So we get an
> invalid file name like
> 
>    d:/usr/lib/debug/d:/usr/bin/foo.debug
> 
> +  /* MS-Windows/MS-DOS don't allow colons in file names; we must strip
> +     the drive letter, so that the file name resulting from splicing
> +     below will be valid.  */
> +  if (HAS_DRIVE_SPEC (dir_notarget))
> +    {
> +      dir_notarget = STRIP_DRIVE_SPEC (dir_notarget);
> +      /* We will append a slash to debugdir, so remove the leading
> +	 slash as well.  */
> +      if (IS_DIR_SEPARATOR (dir_notarget[0]))
> +	dir_notarget++;
> +    }
> +

While this is *a* fix, the result, AFAIU is:

d:/usr/lib/debug/usr/bin/foo.debug

This is a functional filename, but is it correct? What if the path to "foo" is:

c:/some/directory/bin/foo

? In that case your patch will produce:

d:/usr/lib/debug/some/directory/bin/foo

Actually, no, that's not right. If gdb and 'foo' are in the same tree (this is
a valid case, for example, if you're dealing with a mingw installation), then
gdb will autodetect debug directory as 'c:/some/directory/lib/debug', and the
debug path will be:

c:/some/directory/lib/debug/some/directory/bin/foo

To make it more obvious:

c:/some/directory/mingw/bin/foo + c:/some/directory/mingw/bin/gdb (meaning
c:/some/directory/mingw/lib/debug debug directory)

will produce

c:/some/directory/mingw/lib/debug/some/directory/mingw/bin/foo

, if i'm not mistaken. So the problems i see with this approach are:

1) Drive letters are lost (meaning that you can't have two paths with equal
names on two different drives be mapped to different debug tree paths - they
only differ in the drive letter, and you make that letter disappear)

A more correct way to fix this is to replace 'X:/' with 'X/'.

2) This doesn't take runtime tree location into account. Your debug tree has to
reflect the absolute, Windows path to the binary (since you're not doing any
other adjustments, and since gdb gives you an asbolute, Windows path to begin
with), otherwise gdb won't find the debug files.
I.e. in the 'c:/some/directory/mingw/lib/debug/some/directory/mingw/bin/foo'
example above the second '/some/directory' is clearly incidental, as the user
just used that as a root of an installation. The package maintainer can't know
where the binaries will end up, and can only predict the '/mingw/bin/foo' part.
I.e. the path that gdb should look for should be:

c:/some/directory/mingw/lib/debug/mingw/bin/foo

or

c:/some/directory/mingw/lib/debug/bin/foo

(depending on whether we consider '/mingw' or '/' as the root directory;
probably the former, although some people might make a case for the latter),
and this is where a package manager can put the files (the package will contain
'/mingw/lib/debug/bin/foo' debug file, and the package manager will prepend the
install root to that path when installing).

Note that (1) and (2) somewhat contradict one another.

One way to satisfy everyone is to implement all lookups (with the prefix first,
then with the absolute path with a drive letter; after that it might even try
the without-drive-letter case, or something else, as long as the debug info
file isn't found). This is not unheard of - for example, gcc can look for
include files in multiple non-existing directories, to support different
installation schemes.

(the reason why i have such a detailed opinion on the matter is that i've
debugged this very code for my latest gdb patch[0], which i submitted last
month, and back then i noticed the same behaviour you did).

[0]: https://www.sourceware.org/ml/gdb-patches/2019-03/msg00082.html
  
Eli Zaretskii April 18, 2019, 6:40 p.m. UTC | #2
> From: LRN <lrn1986@gmail.com>
> Date: Thu, 18 Apr 2019 19:19:39 +0300
> 
> While this is *a* fix, the result, AFAIU is:
> 
> d:/usr/lib/debug/usr/bin/foo.debug
> 
> This is a functional filename, but is it correct?

It fits the documentation, so yes.

> 1) Drive letters are lost (meaning that you can't have two paths with equal
> names on two different drives be mapped to different debug tree paths - they
> only differ in the drive letter, and you make that letter disappear)
> 
> A more correct way to fix this is to replace 'X:/' with 'X/'.

We could do that, but it will need a documentation change.  I think
it's overkill, as the chances of the binaries and the debug directory
to be on different drives are nil, but if others prefer to convert X:/
to X/, I'm fine with that.

> 2) This doesn't take runtime tree location into account. Your debug tree has to
> reflect the absolute, Windows path to the binary (since you're not doing any
> other adjustments, and since gdb gives you an asbolute, Windows path to begin
> with), otherwise gdb won't find the debug files.
> I.e. in the 'c:/some/directory/mingw/lib/debug/some/directory/mingw/bin/foo'
> example above the second '/some/directory' is clearly incidental, as the user
> just used that as a root of an installation.

It is as "incidental" as the same situation of Posix platforms, where
we simply splice the two directories.

> The package maintainer can't know where the binaries will end up,
> and can only predict the '/mingw/bin/foo' part.

Package maintainers shouldn't use global debug directories, because
for starters such directories might not exist at all, and even if they
do, their location cannot be predicted in advance.  Package
maintainers should instead use the .debug subdirectory of where they
put the binaries, that arrangement already works, and is predictable.

> I.e. the path that gdb should look for should be:
> 
> c:/some/directory/mingw/lib/debug/mingw/bin/foo
> 
> or
> 
> c:/some/directory/mingw/lib/debug/bin/foo

This would deviate from what we do on Unix, so I'm opposed to it.
Unless we also change the Unix behavior, of course.
  
LRN April 18, 2019, 9:41 p.m. UTC | #3
On 18.04.2019 21:40, Eli Zaretskii wrote:
>> From: LRN 
>> Date: Thu, 18 Apr 2019 19:19:39 +0300
>>
>> 2) This doesn't take runtime tree location into account. Your debug tree has to
>> reflect the absolute, Windows path to the binary (since you're not doing any
>> other adjustments, and since gdb gives you an asbolute, Windows path to begin
>> with), otherwise gdb won't find the debug files.
>> I.e. in the 'c:/some/directory/mingw/lib/debug/some/directory/mingw/bin/foo'
>> example above the second '/some/directory' is clearly incidental, as the user
>> just used that as a root of an installation.
> 
> It is as "incidental" as the same situation of Posix platforms, where
> we simply splice the two directories.
> 
>> The package maintainer can't know where the binaries will end up,
>> and can only predict the '/mingw/bin/foo' part.
> 
> Package maintainers shouldn't use global debug directories, because
> for starters such directories might not exist at all, and even if they
> do, their location cannot be predicted in advance.  Package
> maintainers should instead use the .debug subdirectory of where they
> put the binaries, that arrangement already works, and is predictable.

Wait, what?
I've just built a "foo" program, it installs into `/mingw/bin/foo`, and i know
that `/mingw/bin/gdb` will look for debug files in `/mingw/lib/debug` (note
that gdb already does this, AFAIR). So it makes sense for me to install debug
file for 'foo' as `/mingw/lib/debug/mingw/bin/foo`. I don't see why debug files
can't be packaged like that.

I was also going to point out that distros do this, but Debian doesn't at the
moment - it uses build-ids instead. But --build-id is relatively new, appeared
in 2011. I distinctly remember that distros used debug directories before then
(unless my memory is starting to give, that is)

> 
>> I.e. the path that gdb should look for should be:
>>
>> c:/some/directory/mingw/lib/debug/mingw/bin/foo
>>
>> or
>>
>> c:/some/directory/mingw/lib/debug/bin/foo
> 
> This would deviate from what we do on Unix, so I'm opposed to it.
> Unless we also change the Unix behavior, of course.
> 

Looking for debug info for `/usr/bin/foo` in `/usr/lib/debug/usr/bin/foo` is
OK, and the paths i described above are exactly the same, not deviating - just
accounting for Windows specifics. The specifics in this case being the fact
that on Windows there's no pre-defined root directory, so packages that want to
have a notion of a root directory have to pick *some* directory and consider it
"root". Usually it's autodetected at runtime as the parent of the bin
subdirectory where the binary is (conveniently, all binaries go into bin
subdir; and yes, this is problematic for plugins - but they usually don't need
to know where the root is). Once you consider the "c:/some/directory" part of
the paths above as "some path to root directory that we don't take into
account", everything becomes clear. (i'm explaining this just for the sake of
the argument, since you already know all this).

Unless you meant strictly the second string (the one with a stripped '/mingw'
prefix), in which case i would agree that this is more-or-less unortodox, even
if somewhat plausible.

As i've said, all these approaches are not mutually-exclusive, gdb can be made
to look in *all* of these places until it finds a debug file.
  
Eli Zaretskii April 19, 2019, 6:48 a.m. UTC | #4
> From: LRN <lrn1986@gmail.com>
> Date: Fri, 19 Apr 2019 00:41:53 +0300
> 
> >> The package maintainer can't know where the binaries will end up,
> >> and can only predict the '/mingw/bin/foo' part.
> > 
> > Package maintainers shouldn't use global debug directories, because
> > for starters such directories might not exist at all, and even if they
> > do, their location cannot be predicted in advance.  Package
> > maintainers should instead use the .debug subdirectory of where they
> > put the binaries, that arrangement already works, and is predictable.
> 
> Wait, what?
> I've just built a "foo" program, it installs into `/mingw/bin/foo`, and i know
> that `/mingw/bin/gdb` will look for debug files in `/mingw/lib/debug` (note
> that gdb already does this, AFAIR). So it makes sense for me to install debug
> file for 'foo' as `/mingw/lib/debug/mingw/bin/foo`. I don't see why debug files
> can't be packaged like that.

You assume that you know how the end-user's GDB was configured.  But
that assumption is false in general.  E.g., I build my own GDB, and I
configure it with a global debug directory that you cannot possibly
know about in advance.

Moreover, the global debug directory is "relocatable", i.e. it is
computed based on the directory where gdb.exe lives, which makes its
exact place even less predictable in advance when an unrelated package
is being prepared.

By contrast, the .debug subdirectory of the binary's installation
directory is searched by default by every GDB build, so it is safer.
(You could also put the debug info file in the same directory as the
binary, but that is less clean, IMO.)

> I was also going to point out that distros do this, but Debian doesn't at the
> moment - it uses build-ids instead. But --build-id is relatively new, appeared
> in 2011. I distinctly remember that distros used debug directories before then
> (unless my memory is starting to give, that is)

The --build-id way is less convenient because AFAIK it requires to
tweak the link command.  It also requires creation of directories with
ugly long human-unreadable names.

> >> I.e. the path that gdb should look for should be:
> >>
> >> c:/some/directory/mingw/lib/debug/mingw/bin/foo
> >>
> >> or
> >>
> >> c:/some/directory/mingw/lib/debug/bin/foo
> > 
> > This would deviate from what we do on Unix, so I'm opposed to it.
> > Unless we also change the Unix behavior, of course.
> > 
> 
> Looking for debug info for `/usr/bin/foo` in `/usr/lib/debug/usr/bin/foo` is
> OK, and the paths i described above are exactly the same, not deviating - just
> accounting for Windows specifics.

They omit the common prefix /some/directory/, unlike on Unix, where
/usr of /usr/bin is not omitted.  That's the deviation I alluded to.
And it runs afoul of the use case that you raised as an objection to
dropping the drive letter: programs by the same name installed in
different places.  If having such programs on different drives is not
rare enough, having them in different directories on the same drive is
even less rare.

> The specifics in this case being the fact
> that on Windows there's no pre-defined root directory, so packages that want to
> have a notion of a root directory have to pick *some* directory and consider it
> "root". Usually it's autodetected at runtime as the parent of the bin
> subdirectory where the binary is (conveniently, all binaries go into bin
> subdir; and yes, this is problematic for plugins - but they usually don't need
> to know where the root is). Once you consider the "c:/some/directory" part of
> the paths above as "some path to root directory that we don't take into
> account", everything becomes clear. (i'm explaining this just for the sake of
> the argument, since you already know all this).

What you say will make sense only if people install software from a
single source, like the MSYS2 project.  But that is a dangerous
assumption, IMO.  It is better to have a system that doesn't make such
an assumption, especially since what I propose is more closely
following what happens on Unix.

> As i've said, all these approaches are not mutually-exclusive, gdb can be made
> to look in *all* of these places until it finds a debug file.

That would need to be a general feature, not limited to Windows.
  
LRN April 19, 2019, 10:06 a.m. UTC | #5
On 19.04.2019 9:48, Eli Zaretskii wrote:
>> From: LRN
>> Date: Fri, 19 Apr 2019 00:41:53 +0300
>>
>> I've just built a "foo" program, it installs into `/mingw/bin/foo`, and i know
>> that `/mingw/bin/gdb` will look for debug files in `/mingw/lib/debug` (note
>> that gdb already does this, AFAIR). So it makes sense for me to install debug
>> file for 'foo' as `/mingw/lib/debug/mingw/bin/foo`. I don't see why debug files
>> can't be packaged like that.
> 
> You assume that you know how the end-user's GDB was configured.  But
> that assumption is false in general.  E.g., I build my own GDB, and I
> configure it with a global debug directory that you cannot possibly
> know about in advance.

If you built your gdb to look for debug files in a completely unpredictable
global directory (i.e. hardcoded 'f:/debugfiles' instead of letting gdb
autodetect the debug directory relative to its own runtime prefix), then your
only choice is to tell gdb at runtime that it should look for debug files in
'f:/debugfiles' (there's probably a command or an envvar that achieves this).
And you will have to place debug files in there by hand (or using some kind of
custom tool), because no one else knows in advance that debug files should go
into 'f:/debugfiles'.
This covers there "where the debug directory root is" part of the lookup. The
second part (path to the binary -> path to the debug file mapping) doesn't
depend on how you configure gdb.

> 
> Moreover, the global debug directory is "relocatable", i.e. it is
> computed based on the directory where gdb.exe lives, which makes its
> exact place even less predictable in advance when an unrelated package
> is being prepared.

If you didn't hardcode "f:/debugfiles", but hardcoded something along the lines
of "/debug/files" (instead of the default "/lib/debug"), then that is indeed
the case. Either way, the conventional "/lib/debug" path is no longer correct,
and the assumption made when debug files were built (the assumption that debug
files should be installed into "/lib/debug") and installed is false. So you are
right - in this case no one but you can know where the debug files should go. I
don't see how your desire to use unconventional debug directories is a valid
justification for gdb to not look in conventional debug directories.

If gdb and debugee are installed completely separately (in different root
directories), then you're right - prefix-aware lookup might not find the debug
files. Well, it *might* find them, if both packages follow FHS (i.e.
"c:/foo/bin/gdb" will look in "c:/foo/debugfiles/bin/bar" for a debug file for
"d:/zool/bin/bar", having concluded (how? i'm starting to doubt that this
scenario is going to work after all) that the adjusted path to 'bar' is
'/bin/bar'), but you certainly won't be able to have different debug files for
two different installs of a debugee in two different root directories.

Though i wonder who would want to do that - install gdb in a:/foo, install
debugee_v1 in b:/bar and install debugee_v2 in c:/zool...i mean, Windows
doesn't make it easy to install multiple things into multiple places, as you
have to add each 'bin' subdir to PATH, otherwise DLLs won't be found. This is
not scalable, and i wouldn't want to encourage anyone to do this.

> 
> By contrast, the .debug subdirectory of the binary's installation
> directory is searched by default by every GDB build, so it is safer.
> (You could also put the debug info file in the same directory as the
> binary, but that is less clean, IMO.)

This is why i didn't raise the issue originally - i actually do place dbg files
in the same directory as their binaries. It's not clean, indeed, but it worked,
works, and will likely always work. So gdb's inability to correctly find debug
files in a separate debug directory is not critical to me.

>>>> I.e. the path that gdb should look for should be:
>>>>
>>>> c:/some/directory/mingw/lib/debug/mingw/bin/foo
>>>>
>>>> or
>>>>
>>>> c:/some/directory/mingw/lib/debug/bin/foo
>>>
>>> This would deviate from what we do on Unix, so I'm opposed to it.
>>> Unless we also change the Unix behavior, of course.
>>>
>>
>> Looking for debug info for `/usr/bin/foo` in `/usr/lib/debug/usr/bin/foo` is
>> OK, and the paths i described above are exactly the same, not deviating - just
>> accounting for Windows specifics.
>> The specifics in this case being the fact
>> that on Windows there's no pre-defined root directory, so packages that want to
>> have a notion of a root directory have to pick *some* directory and consider it
>> "root". Usually it's autodetected at runtime as the parent of the bin
>> subdirectory where the binary is (conveniently, all binaries go into bin
>> subdir; and yes, this is problematic for plugins - but they usually don't need
>> to know where the root is). Once you consider the "c:/some/directory" part of
>> the paths above as "some path to root directory that we don't take into
>> account", everything becomes clear. (i'm explaining this just for the sake of
>> the argument, since you already know all this).
> 
> What you say will make sense only if people install software from a
> single source, like the MSYS2 project.

It also makes sense if everyone follows the convention. It's like FHS - no one
forces you to follow it, but everyone benefits if you do, because everything is
where you'd expect it to be.

>  But that is a dangerous
> assumption, IMO.

Dangerous how?

>  It is better to have a system that doesn't make such
> an assumption, especially since what I propose is more closely
> following what happens on Unix.

See below.

> 
>> As i've said, all these approaches are not mutually-exclusive, gdb can be made
>> to look in *all* of these places until it finds a debug file.
> 
> That would need to be a general feature, not limited to Windows.
> 

And yet that feature is a way to *both* follow an established convention for
Windows-installed Unix-originated programs *and* be able to install debugees in
multiple separate places with separate debug files that are correctly found by
a single instance gdb (which is also installed separately).

If your debugdir is 'c:/path/to/debugdir' and you're debugging
'f:/some/dir/usr/bin/foo' with gdb located in 'd:/programs/mingw/bin/gdb',
start by looking for

c:/path/to/debugdir/f/some/dir/usr/bin/foo

(no ambiguity), then try

c:/path/to/debugdir/some/dir/usr/bin/foo

(ambiguous drive), and then...that's it. If gdb and debugee are in different
root directories, prefix-aware lookup won't work (gdb doesn't know which part
of 'f:/some/dir/usr/bin/' to strip away). If you had gdb in
'x:/zool/usr/bin/gdb' and debugee in 'x:/zool/mingw/bin/foo', you'd also be
able to look for

c:/path/to/debugdir/mingw/bin/foo

This is both a feature and a bug.
A bug - because programs installed in different root directories have no way to
locate each other (as long as they don't all get put into PATH, in which case
you'll spend hours chasing after some delicious heisenbugs).
A feature - because you can have multiple root directories with multiple
independent installs, and they won't interfere with each other in any way (i
have, like, 4 of them at the moment, and the number will increase to 5 soon).

You can't just blindly do what Unix does and expect a good result - this isn't
Unix. There's an extra layer of complexity - arbitrary root directories.
  
Eli Zaretskii April 21, 2019, 12:05 p.m. UTC | #6
Ping!

Would people please voice their opinions regarding preservation of the
drive letter (and removing the colon) vs just dropping the drive
letter altogether?  I'd like to fix this issue for the upcoming
release of GDB 8.3.

> Date: Thu, 18 Apr 2019 21:40:56 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> CC: gdb-patches@sourceware.org
> 
> > From: LRN <lrn1986@gmail.com>
> > Date: Thu, 18 Apr 2019 19:19:39 +0300
> > 
> > While this is *a* fix, the result, AFAIU is:
> > 
> > d:/usr/lib/debug/usr/bin/foo.debug
> > 
> > This is a functional filename, but is it correct?
> 
> It fits the documentation, so yes.
> 
> > 1) Drive letters are lost (meaning that you can't have two paths with equal
> > names on two different drives be mapped to different debug tree paths - they
> > only differ in the drive letter, and you make that letter disappear)
> > 
> > A more correct way to fix this is to replace 'X:/' with 'X/'.
> 
> We could do that, but it will need a documentation change.  I think
> it's overkill, as the chances of the binaries and the debug directory
> to be on different drives are nil, but if others prefer to convert X:/
> to X/, I'm fine with that.
> 
> > 2) This doesn't take runtime tree location into account. Your debug tree has to
> > reflect the absolute, Windows path to the binary (since you're not doing any
> > other adjustments, and since gdb gives you an asbolute, Windows path to begin
> > with), otherwise gdb won't find the debug files.
> > I.e. in the 'c:/some/directory/mingw/lib/debug/some/directory/mingw/bin/foo'
> > example above the second '/some/directory' is clearly incidental, as the user
> > just used that as a root of an installation.
> 
> It is as "incidental" as the same situation of Posix platforms, where
> we simply splice the two directories.
> 
> > The package maintainer can't know where the binaries will end up,
> > and can only predict the '/mingw/bin/foo' part.
> 
> Package maintainers shouldn't use global debug directories, because
> for starters such directories might not exist at all, and even if they
> do, their location cannot be predicted in advance.  Package
> maintainers should instead use the .debug subdirectory of where they
> put the binaries, that arrangement already works, and is predictable.
> 
> > I.e. the path that gdb should look for should be:
> > 
> > c:/some/directory/mingw/lib/debug/mingw/bin/foo
> > 
> > or
> > 
> > c:/some/directory/mingw/lib/debug/bin/foo
> 
> This would deviate from what we do on Unix, so I'm opposed to it.
> Unless we also change the Unix behavior, of course.
>
  
Simon Marchi April 21, 2019, 12:55 p.m. UTC | #7
On 2019-04-21 8:05 a.m., Eli Zaretskii wrote:
> Ping!
> 
> Would people please voice their opinions regarding preservation of the
> drive letter (and removing the colon) vs just dropping the drive
> letter altogether?  I'd like to fix this issue for the upcoming
> release of GDB 8.3.

I first read your patch (which initiated this thread), and my first reaction was
also to be surprised that we would lose the drive letter.  The risk of clash
is low, but why take that risk at all when it's easy enough to keep the drive letter
and just strip the colon?

I don't know the complete history of this, so if you know about distributions that
place debug files in a path without the drive letter (what your patch implements),
then I would be fine if GDB looked in both places, starting with the path including
the drive letter.

Simon
  

Patch

--- gdb/symfile.c~0	2019-03-27 00:52:05.000000000 +0200
+++ gdb/symfile.c	2019-04-18 13:19:05.231697600 +0300
@@ -1443,6 +1443,18 @@  find_separate_debug_file (const char *di
     = dirnames_to_char_ptr_vec (debug_file_directory);
   gdb::unique_xmalloc_ptr<char> canon_sysroot = gdb_realpath (gdb_sysroot);
 
+  /* MS-Windows/MS-DOS don't allow colons in file names; we must strip
+     the drive letter, so that the file name resulting from splicing
+     below will be valid.  */
+  if (HAS_DRIVE_SPEC (dir_notarget))
+    {
+      dir_notarget = STRIP_DRIVE_SPEC (dir_notarget);
+      /* We will append a slash to debugdir, so remove the leading
+	 slash as well.  */
+      if (IS_DIR_SEPARATOR (dir_notarget[0]))
+	dir_notarget++;
+    }
+
   for (const gdb::unique_xmalloc_ptr<char> &debugdir : debugdir_vec)
     {
       debugfile = target_prefix ? "target:" : "";