[9/9] gdb: use reopen_exec_file from reread_symbols

Message ID beb69bea3ed292317afc7822feaf98e69c5022a6.1694858967.git.aburgess@redhat.com
State New
Headers
Series Add executable_changed event to Python API |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_check--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Testing passed

Commit Message

Andrew Burgess Sept. 16, 2023, 10:18 a.m. UTC
  This commit fixes an issue that was discovered while writing the tests
for the previous commit.

I noticed that, when GDB restarts an inferior, the executable_changed
event would trigger twice.  The first notification would originate
from:

  #0  exec_file_attach (filename=0x4046680 "/tmp/hello.x", from_tty=0) at ../../src/gdb/exec.c:513
  #1  0x00000000006f3adb in reopen_exec_file () at ../../src/gdb/corefile.c:122
  #2  0x0000000000e6a3f2 in generic_mourn_inferior () at ../../src/gdb/target.c:3682
  #3  0x0000000000995121 in inf_child_target::mourn_inferior (this=0x2fe95c0 <the_amd64_linux_nat_target>) at ../../src/gdb/inf-child.c:192
  #4  0x0000000000995cff in inf_ptrace_target::mourn_inferior (this=0x2fe95c0 <the_amd64_linux_nat_target>) at ../../src/gdb/inf-ptrace.c:125
  #5  0x0000000000a32472 in linux_nat_target::mourn_inferior (this=0x2fe95c0 <the_amd64_linux_nat_target>) at ../../src/gdb/linux-nat.c:3609
  #6  0x0000000000e68a40 in target_mourn_inferior (ptid=...) at ../../src/gdb/target.c:2761
  #7  0x0000000000a323ec in linux_nat_target::kill (this=0x2fe95c0 <the_amd64_linux_nat_target>) at ../../src/gdb/linux-nat.c:3593
  #8  0x0000000000e64d1c in target_kill () at ../../src/gdb/target.c:924
  #9  0x00000000009a19bc in kill_if_already_running (from_tty=1) at ../../src/gdb/infcmd.c:328
  #10 0x00000000009a1a6f in run_command_1 (args=0x0, from_tty=1, run_how=RUN_STOP_AT_MAIN) at ../../src/gdb/infcmd.c:381
  #11 0x00000000009a20a5 in start_command (args=0x0, from_tty=1) at ../../src/gdb/infcmd.c:527
  #12 0x000000000068dc5d in do_simple_func (args=0x0, from_tty=1, c=0x35c7200) at ../../src/gdb/cli/cli-decode.c:95

While the second originates from:

  #0  exec_file_attach (filename=0x3d7a1d0 "/tmp/hello.x", from_tty=0) at ../../src/gdb/exec.c:513
  #1  0x0000000000dfe525 in reread_symbols (from_tty=1) at ../../src/gdb/symfile.c:2517
  #2  0x00000000009a1a98 in run_command_1 (args=0x0, from_tty=1, run_how=RUN_STOP_AT_MAIN) at ../../src/gdb/infcmd.c:398
  #3  0x00000000009a20a5 in start_command (args=0x0, from_tty=1) at ../../src/gdb/infcmd.c:527
  #4  0x000000000068dc5d in do_simple_func (args=0x0, from_tty=1, c=0x35c7200) at ../../src/gdb/cli/cli-decode.c:95

In the first case the call to exec_file_attach first passes through
reopen_exec_file.  The reopen_exec_file performs a modification time
check on the executable file, and only calls exec_file_attach if the
executable has changed on disk since it was last loaded.

However, in the second case things work a little differently.  In this
case GDB is really trying to reread the debug symbol.  As such, we
iterate over the objfiles list, and for each of those we check the
modification time, if the file on disk has changed then we reload the
debug symbols from that file.

However, there is an additional check, if the objfile has the same
name as the executable then we will call exec_file_attach, but we do
so without checking the cached modification time that indicates when
the executable was last reloaded, as a result, we reload the
executable twice.

In this commit I propose that reread_symbols be changed to
unconditionally call reopen_exec_file before performing the objfile
iteration.  This will ensure that, if the executable has changed, then
the executable will be reloaded, however, if the executable has
already been recently reloaded, we will not reload it for a second
time.

After handling the executable, GDB can then iterate over the objfiles
list and reload them in the normal way.

With this done I now see the executable reloaded only once when GDB
restarts an inferior, which means I can remove the kfail that I added
to the gdb.python/py-exec-file.exp test in the previous commit.
---
 gdb/symfile.c                             | 18 +++++-------------
 gdb/testsuite/gdb.python/py-exec-file.exp |  5 -----
 2 files changed, 5 insertions(+), 18 deletions(-)
  

Comments

Tom Tromey Sept. 19, 2023, 2:08 p.m. UTC | #1
>>>>> "Andrew" == Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:
 
Andrew> +  reopen_exec_file ();

This addition could probably use a comment explaining why it is needed.

Andrew> -      new_modtime = new_statbuf.st_mtime;
Andrew> +      long new_modtime = new_statbuf.st_mtime;

Might as well change this to time_t now.

I don't really understand how reread_symbols interacts with target:
files.  It seems like this is just broken.

At AdaCore we carry a patch that changes this code to use bfd_stat.
According to internal notes, this was pushed:

https://sourceware.org/legacy-ml/gdb-patches/2020-01/msg00386.html

... but I don't see it in gdb.  Maybe we never resolved the 'stat' issue
in gnulib?

Anyway, the reason I bring this up is that reopen_exec_file calls
bfd_cache_close_all -- but then the loop in reread_symbols, in the
bfd_stat case, will reopen the files (BFD uses fstat).  This seems
unfortunate.

I don't think gdb has a very clear idea of when bfd_cache_close_all
ought to be called.  It would be good to clear this up.  I'm not very
clear on it myself.  Maybe it is to avoid ETXTBUSY errors -- in which
case it seems like it should be called just before starting the
inferior.  But, ISTR some error on Windows as well, though I don't know
exactly what... so maybe the files need to be guaranteed-closed in other
situations as well.

Tom
  
Andrew Burgess Sept. 28, 2023, 3:23 p.m. UTC | #2
Tom Tromey <tromey@adacore.com> writes:

>>>>>> "Andrew" == Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:
>  
> Andrew> +  reopen_exec_file ();
>
> This addition could probably use a comment explaining why it is
> needed.

I added a comment.

>
> Andrew> -      new_modtime = new_statbuf.st_mtime;
> Andrew> +      long new_modtime = new_statbuf.st_mtime;
>
> Might as well change this to time_t now.

I fixed this.

>
> I don't really understand how reread_symbols interacts with target:
> files.  It seems like this is just broken.

Yup.

>
> At AdaCore we carry a patch that changes this code to use bfd_stat.
> According to internal notes, this was pushed:
>
> https://sourceware.org/legacy-ml/gdb-patches/2020-01/msg00386.html
>
> ... but I don't see it in gdb.  Maybe we never resolved the 'stat' issue
> in gnulib?

I guess not.  I already posted a series to look at this:

  https://inbox.sourceware.org/gdb-patches/cover.1695824275.git.aburgess@redhat.com/

But after reviewing your series I've updated this:

  https://inbox.sourceware.org/gdb-patches/cover.1695909469.git.aburgess@redhat.com/

I didn't fully understand all the discussion w.r.t. gnulib stat vs
system stat, but I'm hoping the change above, which is less that your
originally proposed full change, might be mergable.

Though even after that series reopen_exec_file is still broken for
target: files, I guess I'll see how the above goes before fixing that
too.

> Anyway, the reason I bring this up is that reopen_exec_file calls
> bfd_cache_close_all -- but then the loop in reread_symbols, in the
> bfd_stat case, will reopen the files (BFD uses fstat).  This seems
> unfortunate.
>
> I don't think gdb has a very clear idea of when bfd_cache_close_all
> ought to be called.  It would be good to clear this up.  I'm not very
> clear on it myself.  Maybe it is to avoid ETXTBUSY errors -- in which
> case it seems like it should be called just before starting the
> inferior.  But, ISTR some error on Windows as well, though I don't know
> exactly what... so maybe the files need to be guaranteed-closed in other
> situations as well.

I agree this stuff doesn't seem well defined at all, but I don't think I
made anything particularly worse in this respect, so I'm leaving that as
a problem for the future.

I did take a quick look at when bfd_cache_close_all is currently called,
and we do have a call in symbol_file_add_with_addrs, which is used when
a solib is loaded -- which means in many cases, when an inferior starts
we will end up calling bfd_cache_close_all anyway.  Note: I'm not
arguing that this is an actual well designed solution, just that in many
cases we're not going to hit any problems because of this.

Thanks,
Andrew
  
Tom Tromey Sept. 28, 2023, 6:15 p.m. UTC | #3
>>>>> "Andrew" == Andrew Burgess <aburgess@redhat.com> writes:

Andrew> I didn't fully understand all the discussion w.r.t. gnulib stat vs
Andrew> system stat, but I'm hoping the change above, which is less that your
Andrew> originally proposed full change, might be mergable.

FTR (and IIUC) the issue is that gnulib's stat yields different times on
Windows hosts, because it tries to handle the timezone -- on Unix, stat
reports in UTC, but on Windows it may report in local time.

The result is that BFD will report times that are offset from the times
reported by gnulib.

gnulib doesn't provide a way to change this behavior, so one proposal in
the thread was to hack gnulib.

>> I don't think gdb has a very clear idea of when bfd_cache_close_all
>> ought to be called.

Andrew> I agree this stuff doesn't seem well defined at all, but I don't think I
Andrew> made anything particularly worse in this respect, so I'm leaving that as
Andrew> a problem for the future.

One thing I wonder is whether bfd_cache_close_all is even needed.
gdbserver doesn't do this and AFAIK, gdb just keeps all those remote fds
open.

Maybe we're missing some testing scenario, though, like gdbserver
extended-remote with "gdb --write".

Tom
  
Andrew Burgess Sept. 29, 2023, 10:17 a.m. UTC | #4
Tom Tromey <tromey@adacore.com> writes:

>>>>>> "Andrew" == Andrew Burgess <aburgess@redhat.com> writes:
>
> Andrew> I didn't fully understand all the discussion w.r.t. gnulib stat vs
> Andrew> system stat, but I'm hoping the change above, which is less that your
> Andrew> originally proposed full change, might be mergable.
>
> FTR (and IIUC) the issue is that gnulib's stat yields different times on
> Windows hosts, because it tries to handle the timezone -- on Unix, stat
> reports in UTC, but on Windows it may report in local time.
>
> The result is that BFD will report times that are offset from the times
> reported by gnulib.
>
> gnulib doesn't provide a way to change this behavior, so one proposal in
> the thread was to hack gnulib.
>
>>> I don't think gdb has a very clear idea of when bfd_cache_close_all
>>> ought to be called.
>
> Andrew> I agree this stuff doesn't seem well defined at all, but I don't think I
> Andrew> made anything particularly worse in this respect, so I'm leaving that as
> Andrew> a problem for the future.
>
> One thing I wonder is whether bfd_cache_close_all is even needed.
> gdbserver doesn't do this and AFAIK, gdb just keeps all those remote fds
> open.
>
> Maybe we're missing some testing scenario, though, like gdbserver
> extended-remote with "gdb --write".

So, I touched a bfd_cache_close_all recently[1], and dug into the history
of these calls a little at the time.  They were first added to GDB in
this commit:

  commit ce7d45220e4ed342d4a77fcd2f312e85e1100971
  Date:   Fri Jul 30 12:05:45 2004 +0000

Before this there were no bfd_cache_close_all calls in the GDB tree.

The mailing list thread associated with this commit is:

  https://sourceware.org/pipermail/gdb-patches/2004-July/036407.html

Though this thread seems to reference some older thread that I've not
been able to track down.  The above thread is titled:

  [RFA] win32: bfd_cache_close after kill

Here is the text from the older (unfound) thread that is quoted:

  > When the inferior is killed, it is safe the release the different file
  > handles that BFD keeps open. It is particularly useful on Win32 (and
  > presumably on HP UX) to be able to recompile and restart a new debugging
  > session without quitting GDB...

Now I've never done any development work on Win32, I very occasionally
boot a Windows machine to do a test build of GDB, but beyond that my
knowledge of Windows is like 20+ years old :-/  However, I do have a
vague memory that Windows didn't like you writing to a file that was
opened by some other application?  Maybe this memory is wrong, but that
would seem to align with the above text.

If that is the case, then it would seem (to me) that we want to ensure
BFD is not holding open any files at a time when GDB itself is not
actively doing anything that might read from a file, this would be:

  1. When GDB is sat idle at a prompt, and

  2. When GDB is blocked waiting for an inferior event.

Having a bfd_cache_close_all call in e.g. reopen_exec_file, seems pretty
pointless really, as it's not impossible that GDB will then immediately
re-open the BFD from some other function.

I wonder if we should remove all bfd_cache_close_all calls, and just
have two calls associated with the two states above?

[1] https://inbox.sourceware.org/gdb-patches/6219a1ae148c0b9796f7153b9a6c3b7173fb8f5a.1694706545.git.aburgess@redhat.com/

Thanks,
Andrew
  
Eli Zaretskii Sept. 29, 2023, 2:42 p.m. UTC | #5
> Cc: Tom Tromey <tromey@adacore.com>,  Andrew Burgess via Gdb-patches
>  <gdb-patches@sourceware.org>
> Date: Thu, 28 Sep 2023 12:15:18 -0600
> From: Tom Tromey via Gdb-patches <gdb-patches@sourceware.org>
> 
> >>>>> "Andrew" == Andrew Burgess <aburgess@redhat.com> writes:
> 
> Andrew> I didn't fully understand all the discussion w.r.t. gnulib stat vs
> Andrew> system stat, but I'm hoping the change above, which is less that your
> Andrew> originally proposed full change, might be mergable.
> 
> FTR (and IIUC) the issue is that gnulib's stat yields different times on
> Windows hosts, because it tries to handle the timezone -- on Unix, stat
> reports in UTC, but on Windows it may report in local time.

Are you sure?  That's not what I know: AFAIK on Windows 'stat' reports
times in UTC, not in local time.  At least for some (relatively old)
implementation of MSVCRT.DLL, the Windows libc, for which I have
sources, 'stat' clearly converts the file's times to UTC.  This page:

  https://learn.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2013/14h5k7ff(v=vs.120)

also seems to imply that file times are indeed in UTC.

If we are unsure, we should talk to the Gnulib folks about this, they
probably know more about that.
  
Eli Zaretskii Sept. 29, 2023, 3:18 p.m. UTC | #6
> Cc: Tom Tromey <tromey@adacore.com>, Andrew Burgess via Gdb-patches
>  <gdb-patches@sourceware.org>
> Date: Fri, 29 Sep 2023 11:17:12 +0100
> From: Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org>
> 
> Here is the text from the older (unfound) thread that is quoted:
> 
>   > When the inferior is killed, it is safe the release the different file
>   > handles that BFD keeps open. It is particularly useful on Win32 (and
>   > presumably on HP UX) to be able to recompile and restart a new debugging
>   > session without quitting GDB...
> 
> Now I've never done any development work on Win32, I very occasionally
> boot a Windows machine to do a test build of GDB, but beyond that my
> knowledge of Windows is like 20+ years old :-/  However, I do have a
> vague memory that Windows didn't like you writing to a file that was
> opened by some other application?  Maybe this memory is wrong, but that
> would seem to align with the above text.

Yes, Windows will not allow you to write to a file opened by another
process.  (It is possible to open a file in a "shared" mode which will
then allow that, but few applications do that, and the default is not
"shared".)

> If that is the case, then it would seem (to me) that we want to ensure
> BFD is not holding open any files at a time when GDB itself is not
> actively doing anything that might read from a file, this would be:
> 
>   1. When GDB is sat idle at a prompt, and
> 
>   2. When GDB is blocked waiting for an inferior event.

Right.  The best would be for GDB to open the file, read from or write
to it, then close it immediately when it's done with it.
  
Tom Tromey Oct. 2, 2023, 2:02 p.m. UTC | #7
>> FTR (and IIUC) the issue is that gnulib's stat yields different times on
>> Windows hosts, because it tries to handle the timezone -- on Unix, stat
>> reports in UTC, but on Windows it may report in local time.

Eli> Are you sure?  That's not what I know: AFAIK on Windows 'stat' reports
Eli> times in UTC, not in local time.

I think so, see https://www.gnu.org/software/gnulib/manual/html_node/stat.html

Tom
  
Tom Tromey Oct. 2, 2023, 2:16 p.m. UTC | #8
>>>>> "Andrew" == Andrew Burgess <aburgess@redhat.com> writes:

Andrew>   1. When GDB is sat idle at a prompt, and

Andrew>   2. When GDB is blocked waiting for an inferior event.

Andrew> Having a bfd_cache_close_all call in e.g. reopen_exec_file, seems pretty
Andrew> pointless really, as it's not impossible that GDB will then immediately
Andrew> re-open the BFD from some other function.

Andrew> I wonder if we should remove all bfd_cache_close_all calls, and just
Andrew> have two calls associated with the two states above?

I'm a little wary of #1 because I'm working on a patch series to move
DWARF reading into the background.  That is, reading would be happening
while the prompt is visible.

Maybe the DWARF reader should be calling bfd_cache_close when it knows
it is done with a particular BFD.  I think really all that's needed is
to have it open while mapping the debug sections.

I suppose the lack of this code in gdbserver is just another
local/remote divergence that nobody has ever noticed.

Tom
  
Eli Zaretskii Oct. 2, 2023, 4:19 p.m. UTC | #9
> From: Tom Tromey <tromey@adacore.com>
> Cc: Tom Tromey <tromey@adacore.com>,  aburgess@redhat.com,
>   gdb-patches@sourceware.org
> Date: Mon, 02 Oct 2023 08:02:23 -0600
> 
> >> FTR (and IIUC) the issue is that gnulib's stat yields different times on
> >> Windows hosts, because it tries to handle the timezone -- on Unix, stat
> >> reports in UTC, but on Windows it may report in local time.
> 
> Eli> Are you sure?  That's not what I know: AFAIK on Windows 'stat' reports
> Eli> times in UTC, not in local time.
> 
> I think so, see https://www.gnu.org/software/gnulib/manual/html_node/stat.html

If you mean this part:

  The st_atime, st_ctime, st_mtime fields are affected by the current
  time zone and by the DST flag of the current time zone on some
  platforms: mingw, MSVC 14 (when the environment variable TZ is set).

then I'm not sure they mean the times are reported as local times.
They just say "are affected".  My interpretation of that is that on
MinGW and MSVC times might be off by, like, 1 hour, if the DST flag at
the file's time and at current system time is different.

But I've CC'ed Bruno, who knows this stuff better, to ask him to
clarify this for us.
  

Patch

diff --git a/gdb/symfile.c b/gdb/symfile.c
index 43fd45c4050..fac8ec19a22 100644
--- a/gdb/symfile.c
+++ b/gdb/symfile.c
@@ -2457,11 +2457,10 @@  remove_symbol_file_command (const char *args, int from_tty)
 void
 reread_symbols (int from_tty)
 {
-  long new_modtime;
-  struct stat new_statbuf;
-  int res;
   std::vector<struct objfile *> new_objfiles;
 
+  reopen_exec_file ();
+
   for (objfile *objfile : current_program_space->objfiles ())
     {
       if (objfile->obfd.get () == NULL)
@@ -2475,6 +2474,8 @@  reread_symbols (int from_tty)
 	 `ar', often called a `static library' on most systems, though
 	 a `shared library' on AIX is also an archive), then you should
 	 stat on the archive name, not member name.  */
+      int res;
+      struct stat new_statbuf;
       if (objfile->obfd->my_archive)
 	res = stat (bfd_get_filename (objfile->obfd->my_archive), &new_statbuf);
       else
@@ -2486,7 +2487,7 @@  reread_symbols (int from_tty)
 		      objfile_name (objfile));
 	  continue;
 	}
-      new_modtime = new_statbuf.st_mtime;
+      long new_modtime = new_statbuf.st_mtime;
       if (new_modtime != objfile->mtime)
 	{
 	  gdb_printf (_("`%ps' has changed; re-reading symbols.\n"),
@@ -2508,15 +2509,6 @@  reread_symbols (int from_tty)
 	  /* We need to do this whenever any symbols go away.  */
 	  clear_symtab_users_cleanup defer_clear_users (0);
 
-	  if (current_program_space->exec_bfd () != NULL
-	      && filename_cmp (bfd_get_filename (objfile->obfd.get ()),
-			       bfd_get_filename (current_program_space->exec_bfd ())) == 0)
-	    {
-	      /* Reload EXEC_BFD without asking anything.  */
-
-	      exec_file_attach (bfd_get_filename (objfile->obfd.get ()), 0);
-	    }
-
 	  /* Keep the calls order approx. the same as in free_objfile.  */
 
 	  /* Free the separate debug objfiles.  It will be
diff --git a/gdb/testsuite/gdb.python/py-exec-file.exp b/gdb/testsuite/gdb.python/py-exec-file.exp
index 5ad3cd7e50f..7aa19a867d7 100644
--- a/gdb/testsuite/gdb.python/py-exec-file.exp
+++ b/gdb/testsuite/gdb.python/py-exec-file.exp
@@ -190,11 +190,6 @@  with_test_prefix "exec changes on disk" {
 
     runto_main
 
-    # There is currently an issue where the executable_changed event
-    # will trigger twice during an inferior restart.  This should be
-    # fixed in the next commit, at which point this kfail can be
-    # removed.
-    setup_kfail "????" *-*-*
     check_exec_change [string_to_regexp $binfile1] True \
 	"check executable_changed state after exec changed on disk"
 }