gdb: obtain a lock before reading a separate debug info file

Message ID CACKH++YupoU28SmdkQsxs4gRS-m4LrN8Oq4ryLGFTZx-Ltu68w@mail.gmail.com
State New
Headers
Series gdb: obtain a lock before reading a separate debug info file |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-arm fail Patch failed to apply
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 fail Patch failed to apply

Commit Message

Rui Ueyama July 16, 2024, 6:55 a.m. UTC
  The mold linker has recently gained the `--separate-debug-file` flag to
create a separate debug info file in the background. We don't want to read
a debug file until it's complete. Since mold locks a debug file while
writing to it, we can avoid this issue by obtaining a shared file lock.
---
 gdb/configure.ac |  2 +-
 gdb/symfile.c    | 13 +++++++++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

   if (!file_crc_p)
--
2.43.0
  

Comments

Tom Tromey July 17, 2024, 5:57 p.m. UTC | #1
>>>>> "Rui" == Rui Ueyama <rui314@gmail.com> writes:

Thanks for the patch.

Rui> The mold linker has recently gained the `--separate-debug-file` flag to
Rui> create a separate debug info file in the background. We don't want to read
Rui> a debug file until it's complete. Since mold locks a debug file while
Rui> writing to it, we can avoid this issue by obtaining a shared file lock.

Sounds reasonable.

Rui> +#if HAVE_FLOCK
Rui> +  /* The mold linker has a feature to create a separate debug info file in
Rui> +     the background. The linker obtains an exclusive file lock until the
Rui> +     file is ready. We'll obtain a shared lock to avoid reading a
Rui> +     half-written file. This is effectively a no-op for other linkers who
Rui> +     don't lock a file. */
Rui> +  flock(fileno((FILE *)abfd->iostream), LOCK_SH);

Unfortunately I don't think this will work properly, because BFD has a
virtual I/O layer, and gdb uses it.  So, for example, a BFD coming from
a remote target won't have an ordinary FILE* as an iostream.  (I'm not
sure offhand if this field will be NULL in this case or if it will be a
pointer to some other kind of object.)

I also suspect it isn't ok for code outside of BFD to access these bits
in this way.

I'm not totally sure of the best way to approach this.  Adding a new
iovec entry was my first thought, but adding this to the remote protocol
seems like a fair amount of work.  Though perhaps that scenario doesn't
matter and some more local-only approach would be fine (i.e., have the
various gdb implementations of this method just be no-ops).

A change to the iovec would have to be discussed on the binutils list
though, as that's where the BFD maintainers are.

Tom
  
Rui Ueyama July 18, 2024, 7:25 a.m. UTC | #2
Thank you for reviewing!

On Thu, Jul 18, 2024 at 2:57 AM Tom Tromey <tom@tromey.com> wrote:
>
> >>>>> "Rui" == Rui Ueyama <rui314@gmail.com> writes:
>
> Thanks for the patch.
>
> Rui> The mold linker has recently gained the `--separate-debug-file` flag to
> Rui> create a separate debug info file in the background. We don't want to read
> Rui> a debug file until it's complete. Since mold locks a debug file while
> Rui> writing to it, we can avoid this issue by obtaining a shared file lock.
>
> Sounds reasonable.
>
> Rui> +#if HAVE_FLOCK
> Rui> +  /* The mold linker has a feature to create a separate debug info file in
> Rui> +     the background. The linker obtains an exclusive file lock until the
> Rui> +     file is ready. We'll obtain a shared lock to avoid reading a
> Rui> +     half-written file. This is effectively a no-op for other linkers who
> Rui> +     don't lock a file. */
> Rui> +  flock(fileno((FILE *)abfd->iostream), LOCK_SH);
>
> Unfortunately I don't think this will work properly, because BFD has a
> virtual I/O layer, and gdb uses it.  So, for example, a BFD coming from
> a remote target won't have an ordinary FILE* as an iostream.  (I'm not
> sure offhand if this field will be NULL in this case or if it will be a
> pointer to some other kind of object.)
>
> I also suspect it isn't ok for code outside of BFD to access these bits
> in this way.
>
> I'm not totally sure of the best way to approach this.  Adding a new
> iovec entry was my first thought, but adding this to the remote protocol
> seems like a fair amount of work.  Though perhaps that scenario doesn't
> matter and some more local-only approach would be fine (i.e., have the
> various gdb implementations of this method just be no-ops).
>
> A change to the iovec would have to be discussed on the binutils list
> though, as that's where the BFD maintainers are.

Looks like gdb_bfd_open internally handles the raw file descriptor.
So, how about adding a new boolean parameter, which is false by
default, to gdb_bfd_open to let it lock a file with flock() if it's
true? For our purpose, that change should be sufficient. In
particular, I don't think we need to handle remote protocols because
the mold linker can only create separate debug info on a local
filesystem.
  
Andrew Burgess July 19, 2024, 11:20 a.m. UTC | #3
Rui Ueyama <rui314@gmail.com> writes:

> Thank you for reviewing!
>
> On Thu, Jul 18, 2024 at 2:57 AM Tom Tromey <tom@tromey.com> wrote:
>>
>> >>>>> "Rui" == Rui Ueyama <rui314@gmail.com> writes:
>>
>> Thanks for the patch.
>>
>> Rui> The mold linker has recently gained the `--separate-debug-file` flag to
>> Rui> create a separate debug info file in the background. We don't want to read
>> Rui> a debug file until it's complete. Since mold locks a debug file while
>> Rui> writing to it, we can avoid this issue by obtaining a shared file lock.

When you say "in the background", this is only while the link is being
performed, right?  It's not running the debug generation at some point
after the link has completed?

I ask because I'm trying to understand when the problem occurs.  A user
starts debugging while the link is still being performed and ends up
reading an incomplete debug info file.  But doesn't the same problem
exist for the executable itself?

>>
>> Sounds reasonable.
>>
>> Rui> +#if HAVE_FLOCK
>> Rui> +  /* The mold linker has a feature to create a separate debug info file in
>> Rui> +     the background. The linker obtains an exclusive file lock until the
>> Rui> +     file is ready. We'll obtain a shared lock to avoid reading a
>> Rui> +     half-written file. This is effectively a no-op for other linkers who
>> Rui> +     don't lock a file. */
>> Rui> +  flock(fileno((FILE *)abfd->iostream), LOCK_SH);
>>
>> Unfortunately I don't think this will work properly, because BFD has a
>> virtual I/O layer, and gdb uses it.  So, for example, a BFD coming from
>> a remote target won't have an ordinary FILE* as an iostream.  (I'm not
>> sure offhand if this field will be NULL in this case or if it will be a
>> pointer to some other kind of object.)
>>
>> I also suspect it isn't ok for code outside of BFD to access these bits
>> in this way.
>>
>> I'm not totally sure of the best way to approach this.  Adding a new
>> iovec entry was my first thought, but adding this to the remote protocol
>> seems like a fair amount of work.  Though perhaps that scenario doesn't
>> matter and some more local-only approach would be fine (i.e., have the
>> various gdb implementations of this method just be no-ops).
>>
>> A change to the iovec would have to be discussed on the binutils list
>> though, as that's where the BFD maintainers are.
>
> Looks like gdb_bfd_open internally handles the raw file descriptor.
> So, how about adding a new boolean parameter, which is false by
> default, to gdb_bfd_open to let it lock a file with flock() if it's
> true? For our purpose, that change should be sufficient. In
> particular, I don't think we need to handle remote protocols because
> the mold linker can only create separate debug info on a local
> filesystem.

mold might be running on the remote system alongside gdbserver in which
case a remote GDB can still try to fetch the debug file from the remote.

Thanks,
Andrew
  
Rui Ueyama July 20, 2024, 3:23 a.m. UTC | #4
On Fri, Jul 19, 2024 at 8:20 PM Andrew Burgess <aburgess@redhat.com> wrote:
>
> Rui Ueyama <rui314@gmail.com> writes:
>
> > Thank you for reviewing!
> >
> > On Thu, Jul 18, 2024 at 2:57 AM Tom Tromey <tom@tromey.com> wrote:
> >>
> >> >>>>> "Rui" == Rui Ueyama <rui314@gmail.com> writes:
> >>
> >> Thanks for the patch.
> >>
> >> Rui> The mold linker has recently gained the `--separate-debug-file` flag to
> >> Rui> create a separate debug info file in the background. We don't want to read
> >> Rui> a debug file until it's complete. Since mold locks a debug file while
> >> Rui> writing to it, we can avoid this issue by obtaining a shared file lock.
>
> When you say "in the background", this is only while the link is being
> performed, right?  It's not running the debug generation at some point
> after the link has completed?
>
> I ask because I'm trying to understand when the problem occurs.  A user
> starts debugging while the link is still being performed and ends up
> reading an incomplete debug info file.  But doesn't the same problem
> exist for the executable itself?

mold consists of a foreground and a background process. The background
process notifies the foreground process as soon as the main output
file (which is specified by `-o`) is complete, allowing the foreground
process to exit. If `--separate-debug-file` is given, the background
process then continues running to create a separate debug file in the
background. So mold really works in the background after the "build"
is complete.

The main purpose of the feature is to improve developer productivity
in the rapid debug/edit/rebuild cycles, in which you want to run the
executable as soon as possible to see if your fix works as expected.
For production builds, you should pass the `--no-detach` option to the
linker to suppress the background file creation.

> >>
> >> Sounds reasonable.
> >>
> >> Rui> +#if HAVE_FLOCK
> >> Rui> +  /* The mold linker has a feature to create a separate debug info file in
> >> Rui> +     the background. The linker obtains an exclusive file lock until the
> >> Rui> +     file is ready. We'll obtain a shared lock to avoid reading a
> >> Rui> +     half-written file. This is effectively a no-op for other linkers who
> >> Rui> +     don't lock a file. */
> >> Rui> +  flock(fileno((FILE *)abfd->iostream), LOCK_SH);
> >>
> >> Unfortunately I don't think this will work properly, because BFD has a
> >> virtual I/O layer, and gdb uses it.  So, for example, a BFD coming from
> >> a remote target won't have an ordinary FILE* as an iostream.  (I'm not
> >> sure offhand if this field will be NULL in this case or if it will be a
> >> pointer to some other kind of object.)
> >>
> >> I also suspect it isn't ok for code outside of BFD to access these bits
> >> in this way.
> >>
> >> I'm not totally sure of the best way to approach this.  Adding a new
> >> iovec entry was my first thought, but adding this to the remote protocol
> >> seems like a fair amount of work.  Though perhaps that scenario doesn't
> >> matter and some more local-only approach would be fine (i.e., have the
> >> various gdb implementations of this method just be no-ops).
> >>
> >> A change to the iovec would have to be discussed on the binutils list
> >> though, as that's where the BFD maintainers are.
> >
> > Looks like gdb_bfd_open internally handles the raw file descriptor.
> > So, how about adding a new boolean parameter, which is false by
> > default, to gdb_bfd_open to let it lock a file with flock() if it's
> > true? For our purpose, that change should be sufficient. In
> > particular, I don't think we need to handle remote protocols because
> > the mold linker can only create separate debug info on a local
> > filesystem.
>
> mold might be running on the remote system alongside gdbserver in which
> case a remote GDB can still try to fetch the debug file from the remote.

I think we can say in the manual page that that usage is not
supported. `--separate-debug-info` shaves off a few seconds for each
build. The overhead of remote debugging can easily overwhelm the
time-saving with that option. In that case, users should pass the
`--no-detach` option to the linker or not pass
`--separate-debug-info`.
  

Patch

diff --git a/gdb/configure.ac b/gdb/configure.ac
index e70edb74578..d08fefdfa70 100644
--- a/gdb/configure.ac
+++ b/gdb/configure.ac
@@ -1352,7 +1352,7 @@  AC_CHECK_FUNCS([getuid getgid \
                getpgid setsid \
                sigsetmask \
                ttrace wresize setlocale iconvlist libiconvlist btowc \
-               setrlimit getrlimit posix_madvise waitpid \
+               setrlimit getrlimit posix_madvise waitpid flock \
                use_default_colors])
 AM_LANGINFO_CODESET

diff --git a/gdb/symfile.c b/gdb/symfile.c
index b0510b4b409..f2f1eb833b9 100644
--- a/gdb/symfile.c
+++ b/gdb/symfile.c
@@ -61,6 +61,10 @@ 
 #include <chrono>
 #include <algorithm>

+#ifdef HAVE_SYS_FILE_H
+#include <sys/file.h>
+#endif
+
 int (*deprecated_ui_load_progress_hook) (const char *section,
                                         unsigned long num);
 void (*deprecated_show_load_progress) (const char *section,
@@ -1281,6 +1285,15 @@  separate_debug_file_exists (const std::string
&name, unsigned long crc,
   else
     verified_as_different = 0;

+#if HAVE_FLOCK
+  /* The mold linker has a feature to create a separate debug info file in
+     the background. The linker obtains an exclusive file lock until the
+     file is ready. We'll obtain a shared lock to avoid reading a
+     half-written file. This is effectively a no-op for other linkers who
+     don't lock a file. */
+  flock(fileno((FILE *)abfd->iostream), LOCK_SH);
+#endif
+
   file_crc_p = gdb_bfd_crc (abfd.get (), &file_crc);