[8/8] gdb: disable commit resumed in target_kill

Message ID 20221117194241.1776125-9-simon.marchi@efficios.com
State New
Headers
Series Fix some commit_resumed_state assertion failures (PR 28275) |

Commit Message

Simon Marchi Nov. 17, 2022, 7:42 p.m. UTC
  PR 28275 shows that doing a sequence of:

 - Run inferior in background (run &)
 - kill that inferior
 - Run again

We get into this assertion:

    /home/smarchi/src/binutils-gdb/gdb/target.c:2590: internal-error: target_wait: Assertion `!proc_target->commit_resumed_state' failed.

    #0  internal_error_loc (file=0x5606b344e740 "/home/smarchi/src/binutils-gdb/gdb/target.c", line=2590, fmt=0x5606b344d6a0 "%s: Assertion `%s' failed.") at /home/smarchi/src/binutils-gdb/gdbsupport/errors.cc:54
    #1  0x00005606b6296475 in target_wait (ptid=..., status=0x7fffb9390630, options=...) at /home/smarchi/src/binutils-gdb/gdb/target.c:2590
    #2  0x00005606b5767a98 in startup_inferior (proc_target=0x5606bfccb2a0 <the_amd64_linux_nat_target>, pid=3884857, ntraps=1, last_waitstatus=0x0, last_ptid=0x0) at /home/smarchi/src/binutils-gdb/gdb/nat/fork-inferior.c:482
    #3  0x00005606b4e6c9c5 in gdb_startup_inferior (pid=3884857, num_traps=1) at /home/smarchi/src/binutils-gdb/gdb/fork-child.c:132
    #4  0x00005606b50f14a5 in inf_ptrace_target::create_inferior (this=0x5606bfccb2a0 <the_amd64_linux_nat_target>, exec_file=0x604000039f50 "/home/smarchi/build/binutils-gdb/gdb/test", allargs="", env=0x61500000a580, from_tty=1)
        at /home/smarchi/src/binutils-gdb/gdb/inf-ptrace.c:105
    #5  0x00005606b53b6d23 in linux_nat_target::create_inferior (this=0x5606bfccb2a0 <the_amd64_linux_nat_target>, exec_file=0x604000039f50 "/home/smarchi/build/binutils-gdb/gdb/test", allargs="", env=0x61500000a580, from_tty=1)
        at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:978
    #6  0x00005606b512b79b in run_command_1 (args=0x0, from_tty=1, run_how=RUN_NORMAL) at /home/smarchi/src/binutils-gdb/gdb/infcmd.c:468
    #7  0x00005606b512c236 in run_command (args=0x0, from_tty=1) at /home/smarchi/src/binutils-gdb/gdb/infcmd.c:526

When running the kill command, commit_resumed_state for the
process_stratum_target (linux-nat, here) is true.  After the kill, when
there are no more threads, commit_resumed_state is still true, as
nothing touches this flag during the kill operation.  During the
subsequent run command, run_command_1 does:

    scoped_disable_commit_resumed disable_commit_resumed ("running");

We would think that this would clear the commit_resumed_state flag of
our native target, but that's not the case, because
scoped_disable_commit_resumed iterates on non-exited inferiors in order
to find active process targets.  And after the kill, the inferior is
exited, and the native target was unpushed from it anyway.  So
scoped_disable_commit_resumed doesn't touch the commit_resumed_state
flag of the native target, it stays true.  When reaching target_wait, in
startup_inferior (to consume the initial expect stop events while the
inferior is starting up and working its way through the shell),
commit_resumed_state is true, breaking the contract saying that
commit_resumed_state is always false when calling the targets' wait
method.

(note: to be correct, I think that startup_inferior should toggle
commit_resumed between the target_wait and target_resume calls, but I'll
ignore that for now)

I can see multiple ways to fix this.  In the end, we need
commit_resumed_state to be cleared by the time we get to that
target_wait.  It could be done at the end of the kill command, or at the
beginning of the run command.

To keep things in a coherent state, I'd like to make it so that after
the kill command, when the target is left with no threads, its
commit_resumed_state flag is left to false.  This way, we can keep
working with the assumption that a target with no threads (and therefore
no running threads) has commit_resumed_state == false.

Do this by adding a scoped_disable_commit_resumed in target_kill.  It
clears the target's commit_resumed_state on entry, and leaves it false
if the target does not have any resumed thread on exit.  That means,
even if the target has another inferior with stopped threads,
commit_resumed_state will be left to false, which makes sense.

Add a test that tries to cover various combinations of actions done
while an inferior is running (and therefore while commit_resumed_state
is true on the process target).

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28275
Change-Id: I8e6fe6dc1f475055921520e58cab68024039a1e9
---
 gdb/target.c                                  |   8 +-
 .../gdb.base/run-control-while-bg-execution.c |  33 +++++
 .../run-control-while-bg-execution.exp        | 118 ++++++++++++++++++
 3 files changed, 157 insertions(+), 2 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/run-control-while-bg-execution.c
 create mode 100644 gdb/testsuite/gdb.base/run-control-while-bg-execution.exp
  

Comments

Andrew Burgess Nov. 18, 2022, 1:33 p.m. UTC | #1
Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:

> PR 28275 shows that doing a sequence of:
>
>  - Run inferior in background (run &)
>  - kill that inferior
>  - Run again
>
> We get into this assertion:
>
>     /home/smarchi/src/binutils-gdb/gdb/target.c:2590: internal-error: target_wait: Assertion `!proc_target->commit_resumed_state' failed.
>
>     #0  internal_error_loc (file=0x5606b344e740 "/home/smarchi/src/binutils-gdb/gdb/target.c", line=2590, fmt=0x5606b344d6a0 "%s: Assertion `%s' failed.") at /home/smarchi/src/binutils-gdb/gdbsupport/errors.cc:54
>     #1  0x00005606b6296475 in target_wait (ptid=..., status=0x7fffb9390630, options=...) at /home/smarchi/src/binutils-gdb/gdb/target.c:2590
>     #2  0x00005606b5767a98 in startup_inferior (proc_target=0x5606bfccb2a0 <the_amd64_linux_nat_target>, pid=3884857, ntraps=1, last_waitstatus=0x0, last_ptid=0x0) at /home/smarchi/src/binutils-gdb/gdb/nat/fork-inferior.c:482
>     #3  0x00005606b4e6c9c5 in gdb_startup_inferior (pid=3884857, num_traps=1) at /home/smarchi/src/binutils-gdb/gdb/fork-child.c:132
>     #4  0x00005606b50f14a5 in inf_ptrace_target::create_inferior (this=0x5606bfccb2a0 <the_amd64_linux_nat_target>, exec_file=0x604000039f50 "/home/smarchi/build/binutils-gdb/gdb/test", allargs="", env=0x61500000a580, from_tty=1)
>         at /home/smarchi/src/binutils-gdb/gdb/inf-ptrace.c:105
>     #5  0x00005606b53b6d23 in linux_nat_target::create_inferior (this=0x5606bfccb2a0 <the_amd64_linux_nat_target>, exec_file=0x604000039f50 "/home/smarchi/build/binutils-gdb/gdb/test", allargs="", env=0x61500000a580, from_tty=1)
>         at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:978
>     #6  0x00005606b512b79b in run_command_1 (args=0x0, from_tty=1, run_how=RUN_NORMAL) at /home/smarchi/src/binutils-gdb/gdb/infcmd.c:468
>     #7  0x00005606b512c236 in run_command (args=0x0, from_tty=1) at /home/smarchi/src/binutils-gdb/gdb/infcmd.c:526
>
> When running the kill command, commit_resumed_state for the
> process_stratum_target (linux-nat, here) is true.  After the kill, when
> there are no more threads, commit_resumed_state is still true, as
> nothing touches this flag during the kill operation.  During the
> subsequent run command, run_command_1 does:
>
>     scoped_disable_commit_resumed disable_commit_resumed ("running");
>
> We would think that this would clear the commit_resumed_state flag of
> our native target, but that's not the case, because
> scoped_disable_commit_resumed iterates on non-exited inferiors in order
> to find active process targets.  And after the kill, the inferior is
> exited, and the native target was unpushed from it anyway.  So
> scoped_disable_commit_resumed doesn't touch the commit_resumed_state
> flag of the native target, it stays true.  When reaching target_wait, in
> startup_inferior (to consume the initial expect stop events while the
> inferior is starting up and working its way through the shell),
> commit_resumed_state is true, breaking the contract saying that
> commit_resumed_state is always false when calling the targets' wait
> method.
>
> (note: to be correct, I think that startup_inferior should toggle
> commit_resumed between the target_wait and target_resume calls, but I'll
> ignore that for now)
>
> I can see multiple ways to fix this.  In the end, we need
> commit_resumed_state to be cleared by the time we get to that
> target_wait.  It could be done at the end of the kill command, or at the
> beginning of the run command.
>
> To keep things in a coherent state, I'd like to make it so that after
> the kill command, when the target is left with no threads, its
> commit_resumed_state flag is left to false.  This way, we can keep
> working with the assumption that a target with no threads (and therefore
> no running threads) has commit_resumed_state == false.
>
> Do this by adding a scoped_disable_commit_resumed in target_kill.  It
> clears the target's commit_resumed_state on entry, and leaves it false
> if the target does not have any resumed thread on exit.  That means,
> even if the target has another inferior with stopped threads,
> commit_resumed_state will be left to false, which makes sense.
>
> Add a test that tries to cover various combinations of actions done
> while an inferior is running (and therefore while commit_resumed_state
> is true on the process target).
>
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28275
> Change-Id: I8e6fe6dc1f475055921520e58cab68024039a1e9
> ---
>  gdb/target.c                                  |   8 +-
>  .../gdb.base/run-control-while-bg-execution.c |  33 +++++
>  .../run-control-while-bg-execution.exp        | 118 ++++++++++++++++++
>  3 files changed, 157 insertions(+), 2 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.base/run-control-while-bg-execution.c
>  create mode 100644 gdb/testsuite/gdb.base/run-control-while-bg-execution.exp
>
> diff --git a/gdb/target.c b/gdb/target.c
> index 4a22885b82f..f5c6212310a 100644
> --- a/gdb/target.c
> +++ b/gdb/target.c
> @@ -907,8 +907,12 @@ add_deprecated_target_alias (const target_info &tinfo, const char *alias)
>  void
>  target_kill (void)
>  {
> -  current_inferior ()->top_target ()->kill ();
> -}
> +
> +  /* Ensure that, if commit resumed for the target is currently true (threads
> +     are running), if we kill the last running inferior, commit resumed ends up
> +     false.  */
> +   scoped_disable_commit_resumed disable ("killing"); current_inferior
> +   ()->top_target ()->kill (); }

I think something went wrong with the formatting here!

I first read this comment before reading the commit message, and,
initially, the comment didn't make any sense to me.

My thinking was, surely the scoped_disable_commit_resumed would set the
commit resumed state to false, but only for the duration of this scope.
When we leave the scope, the state will be set back to true...

...the subtlety of course, is that when we create the
scoped_disable_commit_resumed, the current inferior is non-exited, and
so has its state set to false, but, when we exit the scope, the current
inferior will be exited, so its commit resume state will be left false.

Personally, I think this is a non-obvious detail.  I think it would be
helpful if the comment explained this aspect a little more.

I'm also seeing something weird with the test when run with the
native-extended-gdbserver board, this warning:

  WARNING: Timed out waiting for EOF in server after monitor exit

I've not done a full investigation, but what seems to happen is GDB
sends the 'monitor exit', and gdbserver does indeed try to exit.  It
then looks like pthreads(?) is waiting on some threads?  Or some child
processes?  At the point gdbserver is not exiting, the stack looks like:

  #0  0x00007f7009660374 in wait () from /lib64/libpthread.so.0
  #1  0x000056146a26a858 in reap_children ()
  #2  0x000056146a26b21c in new_job ()
  #3  0x000056146a27740c in update_file ()
  #4  0x000056146a277d64 in update_goal_chain ()
  #5  0x000056146a25b893 in main ()

Do you also see this timeout?

Additionally, I have not (yet) been able to reproduce this when running
the test from the command line - in that case, gdbserver always exits
immediately after getting the 'monitor exit'.

Thanks,
Andrew

>  
>  void
>  target_load (const char *arg, int from_tty)
> diff --git a/gdb/testsuite/gdb.base/run-control-while-bg-execution.c b/gdb/testsuite/gdb.base/run-control-while-bg-execution.c
> new file mode 100644
> index 00000000000..8092fadc8b9
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/run-control-while-bg-execution.c
> @@ -0,0 +1,33 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2020-2022 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +					       //
> +#include <unistd.h>
> +
> +static pid_t mypid = -1;
> +
> +static void
> +after_getpid (void)
> +{
> +}
> +
> +int
> +main (void)
> +{
> +  mypid = getpid ();
> +  after_getpid ();
> +  sleep (30);
> +}
> diff --git a/gdb/testsuite/gdb.base/run-control-while-bg-execution.exp b/gdb/testsuite/gdb.base/run-control-while-bg-execution.exp
> new file mode 100644
> index 00000000000..e637e2bfc98
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/run-control-while-bg-execution.exp
> @@ -0,0 +1,118 @@
> +# Copyright 2022 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# This test aims at testing various operations after getting rid of an inferior
> +# that was running in background, or while we have an inferior running in
> +# background.  The original intent was to expose cases where the commit-resumed
> +# state of the process stratum target was not reset properly after killing an
> +# inferior running in background, which would be a problem when trying to run
> +# again.  The test was expanded to test various combinations of
> +# run-control-related actions done with an inferior running in background.
> +
> +if {[use_gdb_stub]} {
> +    unsupported "test requires running"
> +    return
> +}
> +
> +standard_testfile
> +
> +if {[build_executable "failed to prepare" $testfile $srcfile]} {
> +    return
> +}
> +
> +# Run one variation of the test:
> +#
> +# 1. Start an inferior in the background with "run &"
> +# 2. Do action 1
> +# 3. Do action 2
> +#
> +# Action 1 indicates what to do with the inferior running in background:
> +#
> +#  - kill: kill it
> +#  - detach: detach it
> +#  - add: add a new inferior and switch to it, leave the inferior running in
> +#    background alone
> +#  - none: do nothing, leave the inferior running in background alone
> +#
> +# Action 2 indicates what to do after that:
> +#
> +#  - start: use the start command
> +#  - run: use the run command
> +#  - attach: start a process outside of GDB and attach it
> +proc do_test { action1 action2 } {
> +    save_vars { ::GDBFLAGS } {
> +	append ::GDBFLAGS " -ex \"maintenance set target-non-stop on\""
> +	clean_restart $::binfile
> +    }
> +
> +    # Ensure we are at least after the getpid call, shall we need it.
> +    if { ![runto "after_getpid"] } {
> +	return
> +    }
> +
> +    # Some commands below ask for confirmation.  Turn that off for simplicity.
> +    gdb_test "set confirm off"
> +    gdb_test -no-prompt-anchor "continue &"
> +
> +    if { $action1 == "kill" } {
> +	gdb_test "kill" "Inferior 1 .* killed.*"
> +    } elseif { $action1 == "detach" } {
> +	set child_pid [get_integer_valueof "mypid" -1]
> +	if { $child_pid == -1 } {
> +	    fail "failed to extract child pid"
> +	    return
> +	}
> +
> +	gdb_test "detach" "Inferior 1 .* detached.*" "detach from first instance"
> +
> +	# Kill the detached process, otherwise that makes "monitor exit" hang
> +	# until the process disappears.
> +	#remote_exec target "kill $child_pid"
> +    } elseif { $action1 == "add" } {
> +	gdb_test "add-inferior -exec $::binfile" \
> +	    "Added inferior 2 on connection 1.*" "add-inferior"
> +	gdb_test "inferior 2" "Switching to inferior 2 .*"
> +    } elseif { $action1 == "none" } {
> +
> +    } else {
> +	error "invalid action 1"
> +    }
> +
> +    if { $action2 == "start" } {
> +	gdb_test "start" "Temporary breakpoint $::decimal, main .*"
> +    } elseif { $action2 == "run" } {
> +	gdb_test "break main" "Breakpoint $::decimal at $::hex.*"
> +	gdb_test "run" "Breakpoint $::decimal, main .*"
> +    } elseif { $action2 == "attach" } {
> +	set test_spawn_id [spawn_wait_for_attach $::binfile]
> +	set test_pid [spawn_id_get_pid $test_spawn_id]
> +
> +	if { [gdb_attach $test_pid] } {
> +	    gdb_test "detach" "Inferior $::decimal .* detached.*" \
> +		"detach from second instance"
> +	}
> +
> +	# Detach and kill this inferior so we don't leave it around.
> +	kill_wait_spawned_process $test_spawn_id
> +    } else {
> +	error "invalid action 2"
> +    }
> +}
> +
> +foreach_with_prefix action1 { kill detach add none } {
> +    foreach_with_prefix action2 { start run attach } {
> +	do_test $action1 $action2
> +    }
> +}
> -- 
> 2.37.3
  
Simon Marchi Nov. 19, 2022, 1:16 a.m. UTC | #2
On 11/18/22 08:33, Andrew Burgess via Gdb-patches wrote:
> Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
>> PR 28275 shows that doing a sequence of:
>>
>>  - Run inferior in background (run &)
>>  - kill that inferior
>>  - Run again
>>
>> We get into this assertion:
>>
>>     /home/smarchi/src/binutils-gdb/gdb/target.c:2590: internal-error: target_wait: Assertion `!proc_target->commit_resumed_state' failed.
>>
>>     #0  internal_error_loc (file=0x5606b344e740 "/home/smarchi/src/binutils-gdb/gdb/target.c", line=2590, fmt=0x5606b344d6a0 "%s: Assertion `%s' failed.") at /home/smarchi/src/binutils-gdb/gdbsupport/errors.cc:54
>>     #1  0x00005606b6296475 in target_wait (ptid=..., status=0x7fffb9390630, options=...) at /home/smarchi/src/binutils-gdb/gdb/target.c:2590
>>     #2  0x00005606b5767a98 in startup_inferior (proc_target=0x5606bfccb2a0 <the_amd64_linux_nat_target>, pid=3884857, ntraps=1, last_waitstatus=0x0, last_ptid=0x0) at /home/smarchi/src/binutils-gdb/gdb/nat/fork-inferior.c:482
>>     #3  0x00005606b4e6c9c5 in gdb_startup_inferior (pid=3884857, num_traps=1) at /home/smarchi/src/binutils-gdb/gdb/fork-child.c:132
>>     #4  0x00005606b50f14a5 in inf_ptrace_target::create_inferior (this=0x5606bfccb2a0 <the_amd64_linux_nat_target>, exec_file=0x604000039f50 "/home/smarchi/build/binutils-gdb/gdb/test", allargs="", env=0x61500000a580, from_tty=1)
>>         at /home/smarchi/src/binutils-gdb/gdb/inf-ptrace.c:105
>>     #5  0x00005606b53b6d23 in linux_nat_target::create_inferior (this=0x5606bfccb2a0 <the_amd64_linux_nat_target>, exec_file=0x604000039f50 "/home/smarchi/build/binutils-gdb/gdb/test", allargs="", env=0x61500000a580, from_tty=1)
>>         at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:978
>>     #6  0x00005606b512b79b in run_command_1 (args=0x0, from_tty=1, run_how=RUN_NORMAL) at /home/smarchi/src/binutils-gdb/gdb/infcmd.c:468
>>     #7  0x00005606b512c236 in run_command (args=0x0, from_tty=1) at /home/smarchi/src/binutils-gdb/gdb/infcmd.c:526
>>
>> When running the kill command, commit_resumed_state for the
>> process_stratum_target (linux-nat, here) is true.  After the kill, when
>> there are no more threads, commit_resumed_state is still true, as
>> nothing touches this flag during the kill operation.  During the
>> subsequent run command, run_command_1 does:
>>
>>     scoped_disable_commit_resumed disable_commit_resumed ("running");
>>
>> We would think that this would clear the commit_resumed_state flag of
>> our native target, but that's not the case, because
>> scoped_disable_commit_resumed iterates on non-exited inferiors in order
>> to find active process targets.  And after the kill, the inferior is
>> exited, and the native target was unpushed from it anyway.  So
>> scoped_disable_commit_resumed doesn't touch the commit_resumed_state
>> flag of the native target, it stays true.  When reaching target_wait, in
>> startup_inferior (to consume the initial expect stop events while the
>> inferior is starting up and working its way through the shell),
>> commit_resumed_state is true, breaking the contract saying that
>> commit_resumed_state is always false when calling the targets' wait
>> method.
>>
>> (note: to be correct, I think that startup_inferior should toggle
>> commit_resumed between the target_wait and target_resume calls, but I'll
>> ignore that for now)
>>
>> I can see multiple ways to fix this.  In the end, we need
>> commit_resumed_state to be cleared by the time we get to that
>> target_wait.  It could be done at the end of the kill command, or at the
>> beginning of the run command.
>>
>> To keep things in a coherent state, I'd like to make it so that after
>> the kill command, when the target is left with no threads, its
>> commit_resumed_state flag is left to false.  This way, we can keep
>> working with the assumption that a target with no threads (and therefore
>> no running threads) has commit_resumed_state == false.
>>
>> Do this by adding a scoped_disable_commit_resumed in target_kill.  It
>> clears the target's commit_resumed_state on entry, and leaves it false
>> if the target does not have any resumed thread on exit.  That means,
>> even if the target has another inferior with stopped threads,
>> commit_resumed_state will be left to false, which makes sense.
>>
>> Add a test that tries to cover various combinations of actions done
>> while an inferior is running (and therefore while commit_resumed_state
>> is true on the process target).
>>
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28275
>> Change-Id: I8e6fe6dc1f475055921520e58cab68024039a1e9
>> ---
>>  gdb/target.c                                  |   8 +-
>>  .../gdb.base/run-control-while-bg-execution.c |  33 +++++
>>  .../run-control-while-bg-execution.exp        | 118 ++++++++++++++++++
>>  3 files changed, 157 insertions(+), 2 deletions(-)
>>  create mode 100644 gdb/testsuite/gdb.base/run-control-while-bg-execution.c
>>  create mode 100644 gdb/testsuite/gdb.base/run-control-while-bg-execution.exp
>>
>> diff --git a/gdb/target.c b/gdb/target.c
>> index 4a22885b82f..f5c6212310a 100644
>> --- a/gdb/target.c
>> +++ b/gdb/target.c
>> @@ -907,8 +907,12 @@ add_deprecated_target_alias (const target_info &tinfo, const char *alias)
>>  void
>>  target_kill (void)
>>  {
>> -  current_inferior ()->top_target ()->kill ();
>> -}
>> +
>> +  /* Ensure that, if commit resumed for the target is currently true (threads
>> +     are running), if we kill the last running inferior, commit resumed ends up
>> +     false.  */
>> +   scoped_disable_commit_resumed disable ("killing"); current_inferior
>> +   ()->top_target ()->kill (); }
> 
> I think something went wrong with the formatting here!

Woops, fixed.

> 
> I first read this comment before reading the commit message, and,
> initially, the comment didn't make any sense to me.
> 
> My thinking was, surely the scoped_disable_commit_resumed would set the
> commit resumed state to false, but only for the duration of this scope.
> When we leave the scope, the state will be set back to true...
> 
> ...the subtlety of course, is that when we create the
> scoped_disable_commit_resumed, the current inferior is non-exited, and
> so has its state set to false, but, when we exit the scope, the current
> inferior will be exited, so its commit resume state will be left false.
> 
> Personally, I think this is a non-obvious detail.  I think it would be
> helpful if the comment explained this aspect a little more.

I agree, trying to fit it all in a single sentence didn't help.  What
about:

  /* If the commit_resume_state of the to-be-killed-inferior's process stratum
     is true, and this inferior is the last live inferior with resumed threads
     of that target, then we want to leave commit_resume_state to false, as the
     target won't have any resumed threads anymore.  We achieve this with
     this scoped_disable_commit_resumed.  On construction, it will set the flag
     to false.  On destruction, it will only set it to true if there are resumed
     threads left.  */

> 
> I'm also seeing something weird with the test when run with the
> native-extended-gdbserver board, this warning:
> 
>   WARNING: Timed out waiting for EOF in server after monitor exit
> 
> I've not done a full investigation, but what seems to happen is GDB
> sends the 'monitor exit', and gdbserver does indeed try to exit.  It
> then looks like pthreads(?) is waiting on some threads?  Or some child
> processes?  At the point gdbserver is not exiting, the stack looks like:
> 
>   #0  0x00007f7009660374 in wait () from /lib64/libpthread.so.0
>   #1  0x000056146a26a858 in reap_children ()
>   #2  0x000056146a26b21c in new_job ()
>   #3  0x000056146a27740c in update_file ()
>   #4  0x000056146a277d64 in update_goal_chain ()
>   #5  0x000056146a25b893 in main ()

I believe this is a stack trace of the make process, probably the one
invoking dejagnu / expect.

> 
> Do you also see this timeout?
> 
> Additionally, I have not (yet) been able to reproduce this when running
> the test from the command line - in that case, gdbserver always exits
> immediately after getting the 'monitor exit'.

Ah, funny that you dug into that too :).  I noticed it and had a
discussion about that with Pedro.  Here's what we found.  Doing "monitor
exit" does indeed make gdbserver exit, after which (during the delay) it
is zombie (it shows up as <defunct> in ps).

We expect this "eof" expect clause to trigger:

  https://gitlab.com/gnutools/binutils-gdb/-/blob/5e219e0f46055281cfbc9351a3d27a05841be34d/gdb/testsuite/lib/gdbserver-support.exp#L476

Our theory is that the "eof" clause triggers when the pty controlling
GDBserver returns EOF, that is when nobody is connected anymore on the
slave side.  However, there is a child process that is detached during
the test that is still alive, that is still connected to the pty.  So
eof never triggers, we hang until expect gives up.

Actually, I have this workaround in the patch:

+	# Kill the detached process, otherwise that makes "monitor exit" hang
+	# until the process disappears.
+	#remote_exec target "kill $child_pid"

I commented it out to investigate the issue, and forgot to uncomment it.
I'll uncomment it.

I think it would work fine to do the "wait" right after sending the
"monitor exit", just like this:

    if { $is_mi } {
	set monitor_exit "-interpreter-exec console \"monitor exit\""
    } else {
	set monitor_exit "monitor exit"
    }
    gdb_test -nopass "$monitor_exit" "" "monitor exit"
    wait -i $server_spawn_id

We reap gdbserver even if the detach process is still alive.  Worst
case, the pty and detached process stay there until the detached process
exits by itself (which generally happens with our testsuite programs, we
try not to write infinite loops).  The problem is that there is no
timeout mechanism, so it could hang forever.  Imagine the "monitor exit"
fails for some reason, or GDBserver is stuck in a loop while trying to
exit.  That's the point of the current code.

Note that this isn't unique to my test.  I tried putting a fail in the
eof clause and ran the testsuite, I had about 20 such failures when
running the native-extended-gdbserver board.

Maybe the solution is just to make sure the tests don't leave detached
process hanging around.  We can put a fail in there and fix the
offending tests.  Any new test that leaves child processes running will
be easy to spot.

Simon
  

Patch

diff --git a/gdb/target.c b/gdb/target.c
index 4a22885b82f..f5c6212310a 100644
--- a/gdb/target.c
+++ b/gdb/target.c
@@ -907,8 +907,12 @@  add_deprecated_target_alias (const target_info &tinfo, const char *alias)
 void
 target_kill (void)
 {
-  current_inferior ()->top_target ()->kill ();
-}
+
+  /* Ensure that, if commit resumed for the target is currently true (threads
+     are running), if we kill the last running inferior, commit resumed ends up
+     false.  */
+   scoped_disable_commit_resumed disable ("killing"); current_inferior
+   ()->top_target ()->kill (); }
 
 void
 target_load (const char *arg, int from_tty)
diff --git a/gdb/testsuite/gdb.base/run-control-while-bg-execution.c b/gdb/testsuite/gdb.base/run-control-while-bg-execution.c
new file mode 100644
index 00000000000..8092fadc8b9
--- /dev/null
+++ b/gdb/testsuite/gdb.base/run-control-while-bg-execution.c
@@ -0,0 +1,33 @@ 
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2020-2022 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+					       //
+#include <unistd.h>
+
+static pid_t mypid = -1;
+
+static void
+after_getpid (void)
+{
+}
+
+int
+main (void)
+{
+  mypid = getpid ();
+  after_getpid ();
+  sleep (30);
+}
diff --git a/gdb/testsuite/gdb.base/run-control-while-bg-execution.exp b/gdb/testsuite/gdb.base/run-control-while-bg-execution.exp
new file mode 100644
index 00000000000..e637e2bfc98
--- /dev/null
+++ b/gdb/testsuite/gdb.base/run-control-while-bg-execution.exp
@@ -0,0 +1,118 @@ 
+# Copyright 2022 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This test aims at testing various operations after getting rid of an inferior
+# that was running in background, or while we have an inferior running in
+# background.  The original intent was to expose cases where the commit-resumed
+# state of the process stratum target was not reset properly after killing an
+# inferior running in background, which would be a problem when trying to run
+# again.  The test was expanded to test various combinations of
+# run-control-related actions done with an inferior running in background.
+
+if {[use_gdb_stub]} {
+    unsupported "test requires running"
+    return
+}
+
+standard_testfile
+
+if {[build_executable "failed to prepare" $testfile $srcfile]} {
+    return
+}
+
+# Run one variation of the test:
+#
+# 1. Start an inferior in the background with "run &"
+# 2. Do action 1
+# 3. Do action 2
+#
+# Action 1 indicates what to do with the inferior running in background:
+#
+#  - kill: kill it
+#  - detach: detach it
+#  - add: add a new inferior and switch to it, leave the inferior running in
+#    background alone
+#  - none: do nothing, leave the inferior running in background alone
+#
+# Action 2 indicates what to do after that:
+#
+#  - start: use the start command
+#  - run: use the run command
+#  - attach: start a process outside of GDB and attach it
+proc do_test { action1 action2 } {
+    save_vars { ::GDBFLAGS } {
+	append ::GDBFLAGS " -ex \"maintenance set target-non-stop on\""
+	clean_restart $::binfile
+    }
+
+    # Ensure we are at least after the getpid call, shall we need it.
+    if { ![runto "after_getpid"] } {
+	return
+    }
+
+    # Some commands below ask for confirmation.  Turn that off for simplicity.
+    gdb_test "set confirm off"
+    gdb_test -no-prompt-anchor "continue &"
+
+    if { $action1 == "kill" } {
+	gdb_test "kill" "Inferior 1 .* killed.*"
+    } elseif { $action1 == "detach" } {
+	set child_pid [get_integer_valueof "mypid" -1]
+	if { $child_pid == -1 } {
+	    fail "failed to extract child pid"
+	    return
+	}
+
+	gdb_test "detach" "Inferior 1 .* detached.*" "detach from first instance"
+
+	# Kill the detached process, otherwise that makes "monitor exit" hang
+	# until the process disappears.
+	#remote_exec target "kill $child_pid"
+    } elseif { $action1 == "add" } {
+	gdb_test "add-inferior -exec $::binfile" \
+	    "Added inferior 2 on connection 1.*" "add-inferior"
+	gdb_test "inferior 2" "Switching to inferior 2 .*"
+    } elseif { $action1 == "none" } {
+
+    } else {
+	error "invalid action 1"
+    }
+
+    if { $action2 == "start" } {
+	gdb_test "start" "Temporary breakpoint $::decimal, main .*"
+    } elseif { $action2 == "run" } {
+	gdb_test "break main" "Breakpoint $::decimal at $::hex.*"
+	gdb_test "run" "Breakpoint $::decimal, main .*"
+    } elseif { $action2 == "attach" } {
+	set test_spawn_id [spawn_wait_for_attach $::binfile]
+	set test_pid [spawn_id_get_pid $test_spawn_id]
+
+	if { [gdb_attach $test_pid] } {
+	    gdb_test "detach" "Inferior $::decimal .* detached.*" \
+		"detach from second instance"
+	}
+
+	# Detach and kill this inferior so we don't leave it around.
+	kill_wait_spawned_process $test_spawn_id
+    } else {
+	error "invalid action 2"
+    }
+}
+
+foreach_with_prefix action1 { kill detach add none } {
+    foreach_with_prefix action2 { start run attach } {
+	do_test $action1 $action2
+    }
+}