[RFC,2/3] gdb: wait for inferior after attaching instead of adding a continuation

Message ID dacc536cd44bf7556e86db013ae665b38e7b5d01.1713423838.git.tankut.baris.aktemur@intel.com
State New
Headers
Series Wait for inferior after attaching |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Testing passed

Commit Message

Aktemur, Tankut Baris April 18, 2024, 7:36 a.m. UTC
  When we execute the "attach" command, GDB first makes the target
attach to the given PID.  For all-stop targets, this operation is
expected to generate a stop event.  For non-stop targets, GDB
explicitly sends a stop request.  So, in either case, GDB expects to
receive a stop event, and leaves handling of the stop event to the
general event loop.  But to be able to finish the attach flow, GDB
also adds a continuation to the inferior, which would be executed
after the stop event is handled:

    inferior->add_continuation ([=] ()
      {
        attach_post_wait (from_tty, mode);
      });

Handling the expected stop event after the command completes creates a
risk that events from other targets may interfere.

** Example 1: Multi-target

Let us start a debug session with a native process.

  $ gdb -q -ex "start" /tmp/simple
  Reading symbols from /tmp/simple...
  Temporary breakpoint 1 at 0x116e: file simple.c, line 12.
  Starting program: /tmp/simple
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

  Temporary breakpoint 1, main () at simple.c:12
  12        int z = foo (5);
  (gdb) info threads
    Id   Target Id                                    Frame
  * 1    Thread 0x7ffff7d86740 (LWP 3154968) "simple" main () at simple.c:12

We now add another inferior with a remote target connection.

  (gdb) add-inferior -no-connection
  [New inferior 2]
  Added inferior 2
  (gdb) inferior 2
  [Switching to inferior 2 [<null>] (<noexec>)]
  (gdb) target extended-remote | gdbserver --multi --once -
  Remote debugging using | gdbserver --multi --once -
  Remote debugging using stdio
  (gdb) set sysroot
  (gdb)

And then attach to another process that was started earlier:

  (gdb) attach 981423
  Attaching to process 981423
  Attached; pid = 981423
  [New Thread 981423.981423]
  No unwaited-for children left.
  (gdb)

Note the "No unwaited-for children left." message.  The new inferior's
thread incorrectly appears running from GDB's point of view:
GDB at this point has as incorrect internal state for inferior 2.

  (gdb) info threads
    Id   Target Id                                    Frame
    1.1  Thread 0x7ffff7d86740 (LWP 3154968) "simple" main () at simple.c:12
  * 2.1  Thread 981423.981423 "infinite"              (running)
  (gdb)

Below is the same session with infrun debug logs (this time the native
process' PID is 3159197).

  (gdb) attach 981423
  [infrun] scoped_disable_commit_resumed: reason=attaching
  Attaching to process 981423
  Attached; pid = 981423
  [New Thread 981423.981423]
  [infrun] infrun_debug_show_threads: enter
    [infrun] infrun_debug_show_threads: immediately after attach:
    [infrun] infrun_debug_show_threads:   thread 981423.981423.0, executing = 1, resumed = 0, state = RUNNING
  [infrun] infrun_debug_show_threads: exit
  [infrun] infrun_async: enable=1
  [infrun] clear_proceed_status_thread: 981423.981423.0
  [infrun] reset: reason=attaching
  [infrun] maybe_set_commit_resumed_all_targets: not requesting commit-resumed for target native, no resumed threads
  [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target extended-remote
  [infrun] fetch_inferior_event: enter

After `attach_command` completed, GDB went to the general event loop.

    [infrun] scoped_disable_commit_resumed: reason=handling event
    [infrun] do_target_wait: Found 2 inferiors, starting at #0
    [infrun] random_pending_event_thread: None found.
    [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
    [infrun] print_target_wait_results:   -1.0.0 [process -1],
    [infrun] print_target_wait_results:   status->kind = NO_RESUMED
    [infrun] print_target_wait_results:   from target 1 (native)

A NO_RESUMED wait event was received from the native target.  The
expectation, however, was to receive an event from the remote target.
A stop event is in fact waiting in the event queue of the remote
target.

    [infrun] handle_inferior_event: status->kind = NO_RESUMED
    [infrun] stop_waiting: stop_waiting
    [infrun] stop_all_threads: start: reason=presenting stop to user in all-stop, inf=-1
      [infrun] infrun_debug_show_threads: enter
        [infrun] infrun_debug_show_threads: non-exited threads:
        [infrun] infrun_debug_show_threads:   thread 3159197.3159197.0, executing = 0, resumed = 0, state = STOPPED
        [infrun] infrun_debug_show_threads:   thread 981423.981423.0, executing = 1, resumed = 0, state = RUNNING
      [infrun] infrun_debug_show_threads: exit
      [infrun] stop_all_threads: pass=0, iterations=0
      [infrun] stop_all_threads:   3159197.3159197.0 not executing
      [infrun] stop_all_threads:   981423.981423.0 executing, need stop
      [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
      [infrun] print_target_wait_results:   981423.981423.0 [Thread 981423.981423],
      [infrun] print_target_wait_results:   status->kind = STOPPED, sig = GDB_SIGNAL_0
      [infrun] print_target_wait_results:   from target 2 (extended-remote)
      [infrun] handle_one: status->kind = STOPPED, sig = GDB_SIGNAL_0 981423.981423.0

While handling the NO_RESUMED event, we enter stop_all_threads, and
consume the stop event that was caused by `attach_command`.

      [infrun] infrun_debug_show_threads: enter
        [infrun] infrun_debug_show_threads: threads in the newly created inferior:
        [infrun] infrun_debug_show_threads:   thread 981423.981423.0, executing = 0, resumed = 0, state = RUNNING
      [infrun] infrun_debug_show_threads: exit
      [infrun] stop_all_threads:   3159197.3159197.0 not executing
      [infrun] stop_all_threads:   981423.981423.0 not executing
      [infrun] stop_all_threads: pass=1, iterations=1
      [infrun] stop_all_threads:   3159197.3159197.0 not executing
      [infrun] stop_all_threads:   981423.981423.0 not executing
      [infrun] stop_all_threads: done
    [infrun] stop_all_threads: end: reason=presenting stop to user in all-stop, inf=-1
  No unwaited-for children left.
    [infrun] infrun_async: enable=0
    [infrun] reset: reason=handling event
    [infrun] maybe_set_commit_resumed_all_targets: not requesting commit-resumed for target native, no resumed threads
    [infrun] maybe_set_commit_resumed_all_targets: not requesting commit-resumed for target extended-remote, no resumed threads
  (gdb) [infrun] fetch_inferior_event: exit
  [infrun] fetch_inferior_event: enter
    [infrun] scoped_disable_commit_resumed: reason=handling event
    [infrun] do_target_wait: Found 2 inferiors, starting at #1
    [infrun] random_pending_event_thread: None found.
    [infrun] random_pending_event_thread: None found.
    [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
    [infrun] print_target_wait_results:   -1.0.0 [process -1],
    [infrun] print_target_wait_results:   status->kind = NO_RESUMED
    [infrun] print_target_wait_results:   from target 1 (native)
    [infrun] handle_inferior_event: status->kind = NO_RESUMED
    [infrun] handle_no_resumed: TARGET_WAITKIND_NO_RESUMED (ignoring: bg)
    [infrun] prepare_to_wait: prepare_to_wait
    [infrun] reset: reason=handling event
    [infrun] maybe_set_commit_resumed_all_targets: not requesting commit-resumed for target native, no resumed threads
    [infrun] maybe_set_commit_resumed_all_targets: not requesting commit-resumed for target extended-remote, no resumed threads
  [infrun] fetch_inferior_event: exit

Because NO_RESUMED does not result in a non-null inferior_ptid, the
continuation that was added to inferior 2 is not invoked.

We argue that, because the attach command expects to receive a stop
event, we shall wait for the event before completing the command.
With this patch, we wait on the target of the current inferior.
Hence, we always fetch the event we aim for.  The observed behavior is

  ...
  (gdb) attach 981423
  Attaching to process 981423
  Attached; pid = 981423
  [New Thread 981423.981423]
  Reading symbols from /tmp/infinite...
  Reading symbols from /lib/x86_64-linux-gnu/libstdc++.so.6...
  ...
  0x00007fd42b35282e in sqrtf64 () from /lib/x86_64-linux-gnu/libm.so.6
  (gdb) info threads
    Id   Target Id                                    Frame
    1.1  Thread 0x7ffff7d86740 (LWP 3247871) "simple" main () at simple.c:12
  * 2.1  Thread 981423.981423 "infinite"              0x00007fd42b35282e in sqrtf64 () from /lib/x86_64-linux-gnu/libm.so.6
  (gdb)

** Example 2: Single target

In a less harmful case, the current implementation causes an unnatural
positioning of the GDB prompt in a single target setting when
executing the attach command in async mode using the '&' operator.
E.g.:

  $ gdb -ex "attach 3107608 &"
  Attaching to process 3107608
  [New LWP 3107609]
  [New LWP 3107610]
  [New LWP 3107611]
  [New LWP 3107612]
  (gdb) [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Similarly, if we use the non-stop mode, we see

  $ gdb -ex "set non-stop on" -ex "attach 3107608 &"
  Attaching to process 3107608
  [New LWP 3107609]
  [New LWP 3107610]
  [New LWP 3107611]
  [New LWP 3107612]
  (gdb) Reading symbols from /tmp/multi-thread/infinite...
  Reading symbols from /lib/x86_64-linux-gnu/libstdc++.so.6...
  (No debugging symbols found in /lib/x86_64-linux-gnu/libstdc++.so.6)
  Reading symbols from /lib/x86_64-linux-gnu/libgcc_s.so.1...
  (No debugging symbols found in /lib/x86_64-linux-gnu/libgcc_s.so.1)
  Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...
  (No debugging symbols found in /lib/x86_64-linux-gnu/libc.so.6)
  Reading symbols from /lib/x86_64-linux-gnu/libm.so.6...
  (No debugging symbols found in /lib/x86_64-linux-gnu/libm.so.6)
  Reading symbols from /lib64/ld-linux-x86-64.so.2...
  (No debugging symbols found in /lib64/ld-linux-x86-64.so.2)
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

With this patch, we receive the prompt *after* the libthread_db
messages, giving more natural output:

  $ gdb -ex "attach 3107608 &"
  Attaching to process 3107608
  [New LWP 3107609]
  [New LWP 3107610]
  [New LWP 3107611]
  [New LWP 3107612]
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
  (gdb)

In non-stop mode, similarly, the prompt is placed before the solib list:

  $ gdb -ex "set non-stop on" -ex "attach 3107608 &"
  Attaching to process 3107608
  [New LWP 3107609]
  [New LWP 3107610]
  [New LWP 3107611]
  [New LWP 3107612]
  Reading symbols from /tmp/multi-thread/infinite...
  Reading symbols from /lib/x86_64-linux-gnu/libstdc++.so.6...
  (No debugging symbols found in /lib/x86_64-linux-gnu/libstdc++.so.6)
  Reading symbols from /lib/x86_64-linux-gnu/libgcc_s.so.1...
  (No debugging symbols found in /lib/x86_64-linux-gnu/libgcc_s.so.1)
  Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...
  (No debugging symbols found in /lib/x86_64-linux-gnu/libc.so.6)
  Reading symbols from /lib/x86_64-linux-gnu/libm.so.6...
  (No debugging symbols found in /lib/x86_64-linux-gnu/libm.so.6)
  Reading symbols from /lib64/ld-linux-x86-64.so.2...
  (No debugging symbols found in /lib64/ld-linux-x86-64.so.2)
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
  (gdb)

When using the attach command normally, without the '&' operator, the
observed output of GDB is not changed.

The examples above are for when we use the native target.  When using
the extended-remote target to attach to a process, again no change is
expected in the output if using the normal attach command.  If using
'&', the prompt is placed again after the list of solibs.

** Testing

Regression-tested on X86-64 Linux using the default (unix),
native-gdbserver and native-extended-gdbserver board files.
---
 gdb/infcmd.c | 24 +++++++++++-------------
 gdb/infrun.c | 11 ++---------
 gdb/infrun.h |  9 +++++++++
 3 files changed, 22 insertions(+), 22 deletions(-)
  

Comments

Tom Tromey April 19, 2024, 8:33 p.m. UTC | #1
>>>>> "Tankut" == Tankut Baris Aktemur <tankut.baris.aktemur@intel.com> writes:

Tankut> In a less harmful case, the current implementation causes an unnatural
Tankut> positioning of the GDB prompt in a single target setting when
Tankut> executing the attach command in async mode using the '&' operator.

I'd prefer Pedro review this, since he's the infrun expert, but I was
curious about this point -- it seems to me that "attach &" should be
expected not to wait.

So maybe the issue is that somehow a continuation is being run for the
wrong inferior?  Like in this part of the final patch:

-      /* Do all continuations associated with the whole inferior (not
-	 a particular thread).  */
-      if (inferior_ptid != null_ptid)
-	current_inferior ()->do_all_continuations ();

... do inferior_ptid and current_inferior disagree here?
Or else why does this code get invoked?

Tom
  
Aktemur, Tankut Baris April 22, 2024, 6:48 a.m. UTC | #2
On Friday, April 19, 2024 10:33 PM, Tom Tromey wrote:
> >>>>> "Tankut" == Tankut Baris Aktemur <tankut.baris.aktemur@intel.com> writes:
> 
> Tankut> In a less harmful case, the current implementation causes an unnatural
> Tankut> positioning of the GDB prompt in a single target setting when
> Tankut> executing the attach command in async mode using the '&' operator.
> 
> I'd prefer Pedro review this, since he's the infrun expert, but I was
> curious about this point -- it seems to me that "attach &" should be
> expected not to wait.
> 
> So maybe the issue is that somehow a continuation is being run for the
> wrong inferior?  Like in this part of the final patch:
> 
> -      /* Do all continuations associated with the whole inferior (not
> -	 a particular thread).  */
> -      if (inferior_ptid != null_ptid)
> -	current_inferior ()->do_all_continuations ();
> 
> ... do inferior_ptid and current_inferior disagree here?
> Or else why does this code get invoked?
> 
> Tom

In the "attach &" example, there is only one inferior, actually.  So, there
is no mismatch between inferior_ptid and current_inferior.  Here is a timeline
of important events:

 - the "attach &" command starts executing.
 - target's "attach" op is invoked.
 - GDB sends a stop request to the target.
 - the inferior is added a continuation.
 - the attach command is done, we'll go back to the event loop.
   -> because we're in async mode (&), the GDB prompt is displayed.
 - the event corresponding to the stop request is received and handled.
 - when the stop event's handling is complete, the continuation is invoked.

I'd think that a reasonable question is "why does GDB send a stop request
to the target in async (&) mode?"  I believe that's the behavior in both
sync and async cases so that GDB can finish setting up the inferior properly
(in attach_post_wait).

-Baris


Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
  

Patch

diff --git a/gdb/infcmd.c b/gdb/infcmd.c
index 10a964a90d7..3b3241f7026 100644
--- a/gdb/infcmd.c
+++ b/gdb/infcmd.c
@@ -2699,6 +2699,11 @@  attach_command (const char *args, int from_tty)
 
   mode = async_exec ? ATTACH_POST_WAIT_RESUME : ATTACH_POST_WAIT_STOP;
 
+  /* Attaching may switch to the first/main thread.  Retain this
+     selection after 'wait_for_inferior' and 'attach_post_wait'
+     below.  */
+  scoped_restore_current_thread restore_thread;
+
   /* Some system don't generate traps when attaching to inferior.
      E.g. Mach 3 or GNU hurd.  */
   if (!target_attach_no_wait ())
@@ -2710,23 +2715,18 @@  attach_command (const char *args, int from_tty)
 	 STOP_QUIETLY_NO_SIGSTOP is for.  */
       inferior->control.stop_soon = STOP_QUIETLY_NO_SIGSTOP;
 
-      /* Wait for stop.  */
-      inferior->add_continuation ([=] ()
-	{
-	  attach_post_wait (from_tty, mode);
-	});
-
       /* Let infrun consider waiting for events out of this
 	 target.  */
       inferior->process_target ()->threads_executing = true;
 
       if (!target_is_async_p ())
 	mark_infrun_async_event_handler ();
-      return;
+
+      /* Wait for stop.  */
+      wait_for_inferior (inferior);
     }
-  else
-    attach_post_wait (from_tty, mode);
 
+  attach_post_wait (from_tty, mode);
   disable_commit_resumed.reset_and_commit ();
 }
 
@@ -2768,10 +2768,8 @@  notice_new_inferior (thread_info *thr, bool leave_running, int from_tty)
       inferior->control.stop_soon = STOP_QUIETLY_REMOTE;
 
       /* Wait for stop before proceeding.  */
-      inferior->add_continuation ([=] ()
-	{
-	  attach_post_wait (from_tty, mode);
-	});
+      wait_for_inferior (inferior);
+      attach_post_wait (from_tty, mode);
 
       return;
     }
diff --git a/gdb/infrun.c b/gdb/infrun.c
index 3dbd42f81cb..7a8c78c973e 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -97,8 +97,6 @@  static bool maybe_software_singlestep (struct gdbarch *gdbarch);
 
 static void resume (gdb_signal sig);
 
-static void wait_for_inferior (inferior *inf);
-
 static void restart_threads (struct thread_info *event_thread,
 			     inferior *inf = nullptr);
 
@@ -4375,14 +4373,9 @@  stop_all_threads_if_all_stop_mode ()
     stop_all_threads ("presenting stop to user in all-stop");
 }
 
-/* Wait for control to return from inferior to debugger.
-
-   If inferior gets a signal, we may decide to start it up again
-   instead of returning.  That is why there is a loop in this function.
-   When this function actually returns it means the inferior
-   should be left stopped and GDB should read more commands.  */
+/* See infrun.h.  */
 
-static void
+void
 wait_for_inferior (inferior *inf)
 {
   infrun_debug_printf ("wait_for_inferior ()");
diff --git a/gdb/infrun.h b/gdb/infrun.h
index 0dc26ae2c33..9dea3d1a46f 100644
--- a/gdb/infrun.h
+++ b/gdb/infrun.h
@@ -320,6 +320,15 @@  extern void all_uis_on_sync_execution_starting (void);
    detach.  */
 extern void restart_after_all_stop_detach (process_stratum_target *proc_target);
 
+/* Wait for control to return from inferior to debugger.
+
+   If inferior gets a signal, we may decide to start it up again
+   instead of returning.  That is why there is a loop in this function.
+   When this function actually returns it means the inferior
+   should be left stopped and GDB should read more commands.  */
+extern void wait_for_inferior (inferior *inf);
+
+
 /* RAII object to temporarily disable the requirement for target
    stacks to commit their resumed threads.