Message ID | gerrit.1571405222000.I7cec98f40283773b79255d998511da434e9cd408@gnutoolchain-gerrit.osci.io |
---|---|
State | New, archived |
Headers |
Received: (qmail 53772 invoked by alias); 18 Oct 2019 13:27:08 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <gdb-patches.sourceware.org> List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/gdb-patches/> List-Post: <mailto:gdb-patches@sourceware.org> List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 53761 invoked by uid 89); 18 Oct 2019 13:27:08 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-19.3 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3 autolearn=ham version=3.3.1 spammy=23703 X-HELO: mx1.osci.io Received: from polly.osci.io (HELO mx1.osci.io) (8.43.85.229) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 18 Oct 2019 13:27:06 +0000 Received: by mx1.osci.io (Postfix, from userid 994) id 2ADBD20940; Fri, 18 Oct 2019 09:27:05 -0400 (EDT) Received: from gnutoolchain-gerrit.osci.io (gnutoolchain-gerrit.osci.io [8.43.85.239]) by mx1.osci.io (Postfix) with ESMTP id 6C3E2206FC for <gdb-patches@sourceware.org>; Fri, 18 Oct 2019 09:27:03 -0400 (EDT) Received: from localhost (localhost [127.0.0.1]) by gnutoolchain-gerrit.osci.io (Postfix) with ESMTP id 161B020AF6 for <gdb-patches@sourceware.org>; Fri, 18 Oct 2019 09:27:03 -0400 (EDT) X-Gerrit-PatchSet: 1 Date: Fri, 18 Oct 2019 09:27:03 -0400 From: "Tankut Baris Aktemur (Code Review)" <gerrit@gnutoolchain-gerrit.osci.io> To: gdb-patches@sourceware.org Message-ID: <gerrit.1571405222000.I7cec98f40283773b79255d998511da434e9cd408@gnutoolchain-gerrit.osci.io> Auto-Submitted: auto-generated X-Gerrit-MessageType: newchange Subject: [review] infrun: mark an exited thread non-executing when attempting to stop X-Gerrit-Change-Id: I7cec98f40283773b79255d998511da434e9cd408 X-Gerrit-Change-Number: 133 X-Gerrit-ChangeURL: <https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133> X-Gerrit-Commit: 411a56782f5b12f8bc852b6a163d8fed3df12ca3 References: <gerrit.1571405222000.I7cec98f40283773b79255d998511da434e9cd408@gnutoolchain-gerrit.osci.io> Reply-To: tankut.baris.aktemur@intel.com, gdb-patches@sourceware.org MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Disposition: inline User-Agent: Gerrit/3.0.3 Content-Type: text/plain; charset=UTF-8 |
Commit Message
Simon Marchi (Code Review)
Oct. 18, 2019, 1:27 p.m. UTC
Tankut Baris Aktemur has uploaded a new change for review. Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 ...................................................................... infrun: mark an exited thread non-executing when attempting to stop In stop_all_threads, GDB sends signals to other threads in an attempt to stop them. While in a typical scenario the expected wait status is TARGET_WAITKIND_STOPPED, it is possible that the thread GDB attempted to stop has already terminated. If so, a waitstatus other than TARGET_WAITKIND_STOPPED would be received. In that case, mark the thread as not-executing and set its state to THREAD_EXITED. If a wait status that denotes thread termination is ignored, GDB goes into an infinite loop in stop_all_threads. E.g.: ~~~ $ gdb ./a.out (gdb) start ... (gdb) add-inferior -exec ./a.out ... (gdb) inferior 2 ... (gdb) start ... (gdb) set schedule-multiple on (gdb) set debug infrun 2 (gdb) continue Continuing. infrun: clear_proceed_status_thread (process 23419) infrun: clear_proceed_status_thread (process 23703) infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT) infrun: proceed: resuming process 23419 infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 23419] at 0x55555555514e infrun: infrun_async(1) infrun: prepare_to_wait infrun: proceed: resuming process 23703 infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 23703] at 0x55555555514e infrun: prepare_to_wait infrun: target_wait (-1.0.0, status) = infrun: 23703.23703.0 [process 23703], infrun: status->kind = exited, status = 42 infrun: handle_inferior_event status->kind = exited, status = 42 [Inferior 2 (process 23703) exited with code 052] infrun: stop_waiting infrun: stop_all_threads infrun: stop_all_threads, pass=0, iterations=0 infrun: process 23419 executing, need stop infrun: target_wait (-1.0.0, status) = infrun: 23419.23419.0 [process 23419], infrun: status->kind = exited, status = 42 infrun: stop_all_threads status->kind = exited, status = 42 process 23419 infrun: process 23419 executing, already stopping infrun: target_wait (-1.0.0, status) = infrun: -1.0.0 [process -1], infrun: status->kind = no-resumed infrun: stop_all_threads status->kind = no-resumed process -1 infrun: process 23419 executing, already stopping infrun: target_wait (-1.0.0, status) = infrun: -1.0.0 [process -1], infrun: status->kind = no-resumed infrun: stop_all_threads status->kind = no-resumed process -1 infrun: process 23419 executing, already stopping infrun: target_wait (-1.0.0, status) = infrun: -1.0.0 [process -1], infrun: status->kind = no-resumed infrun: stop_all_threads status->kind = no-resumed process -1 infrun: process 23419 executing, already stopping infrun: target_wait (-1.0.0, status) = infrun: -1.0.0 [process -1], infrun: status->kind = no-resumed infrun: stop_all_threads status->kind = no-resumed process -1 infrun: process 23419 executing, already stopping ~~~ And this polling goes on forever. This patch prevents the infinite looping behavior. gdb/ChangeLog: 2019-10-18 Tankut Baris Aktemur <tankut.baris.aktemur@intel.com> * infrun.c (stop_all_threads): Do NOT ignore TARGET_WAITKIND_NO_RESUMED, TARGET_WAITKIND_THREAD_EXITED, TARGET_WAITKIND_EXITED, TARGET_WAITKIND_SIGNALLED wait statuses received from threads we attempt to stop; mark the corresponding thread as THREAD_EXITED and not-executing. Change-Id: I7cec98f40283773b79255d998511da434e9cd408 --- M gdb/infrun.c 1 file changed, 9 insertions(+), 0 deletions(-)
Comments
Tankut Baris Aktemur has posted comments on this change. Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 ...................................................................... Patch Set 1: > Patch Set 1: Code-Review+1 > > This looks like an old oversight when handling the case of exited threads when we're attempting to stop all of them. Looking at older code, we used to have a message saying a thread had exited while we were stopping it, but it was removed by a cleanup. > > The change looks good for me. Thank you. I noticed that the patch has a problem. It leaves the exited inferior in an alive state with no threads. It becomes not possible to re-run the program. I will send a revision, together with updated tests.
Tankut Baris Aktemur has posted comments on this change. Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 ...................................................................... Patch Set 2: Kindly pinging. Thanks. -Baris
Pedro Alves has posted comments on this change. Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 ...................................................................... Patch Set 2: > Kindly pinging. > Thanks. Thanks. I'm taking a look. (FWIW, I'm skimmed this before and put it off because it seemingly didn't come with a testcase. But today, while digging deeper, I see that this is actually part of a patch series, which _does_ include tests. Thanks for that. This is an instance of my annoyance with gerrit's lack of cover letter & threading.)
Pedro Alves has posted comments on this change. Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 ...................................................................... Patch Set 2: (1 comment) | --- gdb/infrun.c | +++ gdb/infrun.c | @@ -4494,7 +4509,19 @@ stop_all_threads (void) | + if (ws.kind == TARGET_WAITKIND_THREAD_EXITED) | + delete_thread (t); | + else | + { | + /* TARGET_WAITKIND_EXITED or | + TARGET_WAITKIND_SIGNALLED. */ | + /* Need to restore the context because | + handle_inferior_exit switches it. */ | + scoped_restore_current_pspace_and_thread restore; | + handle_inferior_exit (event_ptid, ws); PS2, Line 4518: Sorry, but this doesn't look right. We're inside stop_all_threads, processing some other event, and a process exit event for one of the processes we're trying to stop comes along, and this processes it right away. Once the stop_all_threads dance is done, we go back to handling the original event, and possibly reporting a stop to the user. Meanwhile, whatever state that was set by handle_inferior_exit, like e.g. $_exitcode, is lost, or now incorrect for the reported stop. We also never present a stop on the CLI for that "spurious" process exit. Here: (gdb) continue Continuing. Executing on build: kill -9 29412 29417 (timeout = 300) spawn -ignore SIGHUP kill -9 29412 29417 Program terminated with signal SIGKILL, Killed. The program no longer exists. <<<<<<<<<<< no user-visible stop / prompt here. <<<<<<<<<<<<<<< Program terminated with signal SIGKILL, Killed. The program no longer exists. (gdb) PASS: gdb.multi/multi-kill.exp: iteration 1: back to gdb prompt The fix for this I think must be around leaving the TARGET_WAITKIND_EXITED/TARGET_WAITKIND_SIGNALLED event pending, so that it is processed later when we're out of the stop_all_threads loop and back to dequeuing the next event. gdb/linux-nat.c also has its own "stop all threads temporarily" logic, and that does that -- leaves process exits pending. See wait_lwp: /* If this is the leader exiting, it means the whole process is gone. Store the status to report to the core. Store it in lp->waitstatus, because lp->status would be ambiguous (W_EXITCODE(0,0) == 0). */ store_waitstatus (&lp->waitstatus, status); return 0; | + } | + } | } | else | { | thread_info *t = find_thread_ptid (event_ptid); | if (t == NULL) | t = add_thread (event_ptid); |
Tankut Baris Aktemur has posted comments on this change. Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 ...................................................................... Patch Set 2: (1 comment) | --- gdb/infrun.c | +++ gdb/infrun.c | @@ -4494,7 +4509,19 @@ stop_all_threads (void) | + if (ws.kind == TARGET_WAITKIND_THREAD_EXITED) | + delete_thread (t); | + else | + { | + /* TARGET_WAITKIND_EXITED or | + TARGET_WAITKIND_SIGNALLED. */ | + /* Need to restore the context because | + handle_inferior_exit switches it. */ | + scoped_restore_current_pspace_and_thread restore; | + handle_inferior_exit (event_ptid, ws); PS2, Line 4518: Thanks for your comment. I'll work on making the event pending and then submit a revision. | + } | + } | } | else | { | thread_info *t = find_thread_ptid (event_ptid); | if (t == NULL) | t = add_thread (event_ptid); |
Tankut Baris Aktemur has posted comments on this change. Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 ...................................................................... Patch Set 2: (1 comment) > Patch Set 2: > > (1 comment) | --- gdb/infrun.c | +++ gdb/infrun.c | @@ -4494,7 +4509,19 @@ stop_all_threads (void) | + if (ws.kind == TARGET_WAITKIND_THREAD_EXITED) | + delete_thread (t); | + else | + { | + /* TARGET_WAITKIND_EXITED or | + TARGET_WAITKIND_SIGNALLED. */ | + /* Need to restore the context because | + handle_inferior_exit switches it. */ | + scoped_restore_current_pspace_and_thread restore; | + handle_inferior_exit (event_ptid, ws); PS2, Line 4518: > The fix for this I think must be around leaving the TARGET_WAITKIND_EXITED/TARGET_WAITKIND_SIGNALLED event pending, so that it is processed later when we're out of the stop_all_threads loop and back to dequeuing the next event. I'd like to ask for your opinion on making the second exit event pending. One problem is, because the event has not been reported to the user yet, the user still thinks that the inferior is alive. So, after getting the prompt because of the first exit event, they may be tempted to do "info threads" or switch to the not-yet-reported- inferior and inspect its state. This triggers a query (e.g. of registers) on the process that is already gone. I tried the following scenario with the current master branch (the patch that I proposed was not applied): ~~~ $ gdb ./a.out (gdb) maint set target-non-stop off (gdb) start ... (gdb) add-inferior -exec ./a.out [New inferior 2] Added inferior 2 ... (gdb) inferior 2 [Switching to inferior 2 [<null>] (/tmp/a.out)] (gdb) start ... (gdb) set schedule-multiple on (gdb) c Continuing. [Inferior 2 (process 16331) exited normally] (gdb) i inferiors Num Description Executable 1 process 16137 /tmp/a.out * 2 <null> /tmp/a.out (gdb) inferior 1 [Switching to inferior 1 [process 16137] (/tmp/a.out)] [Switching to thread 1.1 (process 16137)] Couldn't get registers: No such process. (gdb) i threads Id Target Id Frame Couldn't get registers: No such process. (gdb) c Continuing. Couldn't get registers: No such process. ~~~ If I save the exit event in my patch as a pending event (and omit 'maint set target-non-stop off'), I get essentially the same problem. What is the expected GDB behavior here? Would it be alright to actually print both exit events, followed by the gdb-prompt, where the user can now query $_exitcode or $_exitsignal by switching between inferiors, assuming those special variables are set correctly per inferior? | + } | + } | } | else | { | thread_info *t = find_thread_ptid (event_ptid); | if (t == NULL) | t = add_thread (event_ptid); |
Tankut Baris Aktemur has posted comments on this change. Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 ...................................................................... Patch Set 2: (1 comment) | --- gdb/infrun.c | +++ gdb/infrun.c | @@ -4494,7 +4509,19 @@ stop_all_threads (void) | + if (ws.kind == TARGET_WAITKIND_THREAD_EXITED) | + delete_thread (t); | + else | + { | + /* TARGET_WAITKIND_EXITED or | + TARGET_WAITKIND_SIGNALLED. */ | + /* Need to restore the context because | + handle_inferior_exit switches it. */ | + scoped_restore_current_pspace_and_thread restore; | + handle_inferior_exit (event_ptid, ws); PS2, Line 4518: > > The fix for this I think must be around leaving the TARGET_WAITKIND_EXITED/TARGET_WAITKIND_SIGNALLED > event pending, so that it is processed later when we're out of the stop_all_threads loop and back to > dequeuing the next event. > > > I'd like to ask for your opinion on making the second exit event pending. > One problem is, because the event has not been reported to the user yet, > the user still thinks that the inferior is alive. So, after getting the > prompt because of the first exit event, they may be tempted to do "info > threads" or switch to the not-yet-reported-inferior and inspect its state. > This triggers a query (e.g. of registers) on the process that is already > gone. I tried the following scenario with the current master branch > (the patch that I proposed was not applied): Kindly pinging. Thanks, -Baris | + } | + } | } | else | { | thread_info *t = find_thread_ptid (event_ptid); | if (t == NULL) | t = add_thread (event_ptid); |
Hi, (I tried to reply via gerrit, but I couldn't find how to hit "Quote" to quote/reply to this message of yours -- seems like via the web ui you can only reply to the last comment in a thread, which in this case is a ping, not the message I intended to quote...) Sorry for the delay, and thanks for the ping. I've been thinking on and off about this. Below you'll find my current thoughts. On 12/9/19 3:09 PM, Tankut Baris Aktemur (Code Review) wrote: > I'd like to ask for your opinion on making the second exit event > pending. One problem is, because the event has not been reported to > the user yet, the user still thinks that the inferior is alive. So, > after getting the prompt because of the first exit event, they may be > tempted to do "info threads" or switch to the not-yet-reported- > inferior and inspect its state. This triggers a query (e.g. of > registers) on the process that is already gone. I tried the following > scenario with the current master branch (the patch that I proposed was > not applied): > > ~~~ > $ gdb ./a.out > (gdb) maint set target-non-stop off > (gdb) start > ... > (gdb) add-inferior -exec ./a.out > [New inferior 2] > Added inferior 2 > ... > (gdb) inferior 2 > [Switching to inferior 2 [<null>] (/tmp/a.out)] > (gdb) start > ... > (gdb) set schedule-multiple on > (gdb) c > Continuing. > [Inferior 2 (process 16331) exited normally] > (gdb) i inferiors > Num Description Executable > 1 process 16137 /tmp/a.out > * 2 <null> /tmp/a.out > (gdb) inferior 1 > [Switching to inferior 1 [process 16137] (/tmp/a.out)] > [Switching to thread 1.1 (process 16137)] > Couldn't get registers: No such process. > (gdb) i threads > Id Target Id Frame > Couldn't get registers: No such process. > (gdb) c > Continuing. > Couldn't get registers: No such process. > ~~~ > > If I save the exit event in my patch as a pending event (and omit > 'maint set target-non-stop off'), I get essentially the same problem. > What is the expected GDB behavior here? Would it be alright to > actually print both exit events, followed by the gdb-prompt, where the > user can now query $_exitcode or $_exitsignal by switching between > inferiors, assuming those special variables are set correctly per > inferior? I'm really not sure about that. As you've seen, this happens in true all-stop too, which can't report multiple events at the same time, so I think from that angle alone, GDB should cope better with it. Plus, this can happen even if an inferior stopped for some other event while at the same time some other inferior exits. Say, inferior 1 hits a breakpoint, and while stopping everything, inferior 2 exits. And GDB happens to report the breakpoint hit first. And now the user does "info threads" and sees the "No such process" errors. You could maybe think, that then maybe we should prioritize inferior exits over breakpoint hits. But then, what if inferior 1 stops for a breakpoint, gdb manages to stop all threads without inferior 2 exiting, and then a SIGKILL is sent to the supposedly-stopped inferior, from outside GDB? Or to make it even simpler, that SIGKILL use case can even happen in single-inferior debugging. Or, in "set non-stop on" mode, the inferior is running and you so "info threads" just between the process dying, and GDB getting the SIGCHLD and collecting the ptrace event. So I think that the state gets into where the inferior dies before the inferior exit event is reported to the user is just something that GDB needs to cope with well. I.e., report the failures to read registers, in "info threads", "print" etc., and importantly -- _be sure to not get into a state where the user is stuck_. The "not get stuck" part is where I think we should improve things. Your example already shows where we need improvement, in the the last "c". A simplified version, using an external SIGKILL is: $ gdb --args /usr/bin/tail -f /dev/null GNU gdb (GDB) 10.0.50.20200106-git ... Program received signal SIGINT, Interrupt. 0x00007ffff7b08881 in __GI___libc_read (fd=4, buf=0x555555766410, nbytes=26) at ../sysdeps/unix/sysv/linux/read.c:26 26 return SYSCALL_CANCEL (read, fd, buf, nbytes); (gdb) info inferiors Num Description Executable * 1 process 9425 /usr/bin/tail (gdb) shell kill -9 9425 (gdb) flushregs Register cache flushed. Couldn't get registers: No such process. (gdb) info threads Id Target Id Frame Couldn't get registers: No such process. (gdb) c Continuing. Couldn't get registers: No such process. (gdb) This error comes from the regcache_read_pc call from within infrun.c:proceed: ... #4 0x000000000097e04e in perror_with_name (string=0xb213e9 "Couldn't get registers") at /home/pedro/gdb/binutils-gdb/src/gdb/utils.c:612 #5 0x0000000000452156 in amd64_linux_nat_target::fetch_registers (this=0x11a64b0 <the_amd64_linux_nat_target>, regcache=0x1eac160, regnum=16) at /home/pedro/gdb/binutils-gdb/src/gdb/amd64-linux-nat.c:225 #6 0x0000000000901cc4 in target_fetch_registers (regcache=0x1eac160, regno=16) at /home/pedro/gdb/binutils-gdb/src/gdb/target.c:3427 #7 0x0000000000829f94 in regcache::raw_update (this=0x1eac160, regnum=16) at /home/pedro/gdb/binutils-gdb/src/gdb/regcache.c:471 #8 0x000000000082a039 in readable_regcache::raw_read (this=0x1eac160, regnum=16, buf=0x7fffffffcc00 "\302%") at /home/pedro/gdb/binutils-gdb/src/gdb/regcache.c:485 #9 0x000000000082a371 in readable_regcache::cooked_read (this=0x1eac160, regnum=16, buf=0x7fffffffcc00 "\302%") at /home/pedro/gdb/binutils-gdb/src/gdb/regcache.c:577 #10 0x000000000082eefd in readable_regcache::cooked_read<unsigned long, void> (this=0x1eac160, regnum=16, val=0x7fffffffcca8) at /home/pedro/gdb/binutils-gdb/src/gdb/regcache.c:664 #11 0x000000000082a7d8 in regcache_cooked_read_unsigned (regcache=0x1eac160, regnum=16, val=0x7fffffffcca8) at /home/pedro/gdb/binutils-gdb/src/gdb/regcache.c:678 #12 0x000000000082bcf5 in regcache_read_pc (regcache=0x1eac160) at /home/pedro/gdb/binutils-gdb/src/gdb/regcache.c:1182 #13 0x00000000006e6f62 in proceed (addr=0xffffffffffffffff, siggnal=GDB_SIGNAL_DEFAULT) at /home/pedro/gdb/binutils-gdb/src/gdb/infrun.c:2855 #14 0x00000000006d7af6 in continue_1 (all_threads=0) at /home/pedro/gdb/binutils-gdb/src/gdb/infcmd.c:804 ... I think PTRACE_EVENT_EXIT on Linux could help with at least some of the use cases on Linux, but still, GDB should cope better on systems that do not have that feature. Thanks, Pedro Alves
Tom de Vries has posted comments on this change.
Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133
......................................................................
Patch Set 2:
> I'd like to ask for your opinion on making the second exit event pending
Discussion continued here: https://sourceware.org/ml/gdb-patches/2020-01/msg00212.html .
Tom de Vries has posted comments on this change. Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 ...................................................................... Patch Set 2: FTR, I build patchset 2 and tested the example from PR25478, and hit an assert: ... Thread 1 "gdb" hit Breakpoint 1, stop_all_threads () at /data/gdb_versions/devel/src/gdb/infrun.c:4395 4395 int iterations = 0; (master-gdb) c Continuing. infrun: stop_all_threads infrun: stop_all_threads, pass=0, iterations=0 infrun: Thread 0x7ffff7fa6740 (LWP 32692) not executing infrun: Thread 0x7ffff77fe700 (LWP 32696) executing, need stop infrun: target_wait (-1.0.0, status) = infrun: 32692.32696.0 [Thread 0x7ffff77fe700 (LWP 32696)], infrun: status->kind = signalled, signal = GDB_SIGNAL_KILL infrun: stop_all_threads status->kind = signalled, signal = GDB_SIGNAL_KILL Thread 0x7ffff77fe700 (LWP 32696) Program terminated with signal SIGKILL, Killed. The program no longer exists. infrun: stop_all_threads, pass=1, iterations=1 infrun: stop_all_threads done thread.c:95: internal-error: thread_info* inferior_thread(): Assertion `tp' failed. ...
Tom de Vries has posted comments on this change. Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 ...................................................................... Patch Set 2: (1 comment) | --- gdb/infrun.c | +++ gdb/infrun.c | @@ -4494,0 +4500,19 @@ stop_all_threads (void) | + | + /* Set the threads as non-executing to avoid infinitely | + waiting for them to stop. */ | + mark_non_executing_threads (event_ptid, ws); | + | + if (ws.kind == TARGET_WAITKIND_NO_RESUMED) | + { | + /* Do nothing. Already marked the threads. */ | + } | + if (ws.kind == TARGET_WAITKIND_THREAD_EXITED) PS2, Line 4509: This doesn't look right. Did you mean to write: else if (ws.kind == TARGET_WAITKIND_THREAD_EXITED) ? | + delete_thread (t); | + else | + { | + /* TARGET_WAITKIND_EXITED or | + TARGET_WAITKIND_SIGNALLED. */ | + /* Need to restore the context because | + handle_inferior_exit switches it. */ | + scoped_restore_current_pspace_and_thread restore; | + handle_inferior_exit (event_ptid, ws);
Tankut Baris Aktemur has posted comments on this change. Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 ...................................................................... Patch Set 2: (1 comment) > Patch Set 2: > > FTR, I build patchset 2 and tested the example from PR25478, and hit an assert: > ... I was not able to repeat the problematic scenario with the current master. The multi-target feature seems to have changed the behavior. I'll need to first look into whether I can reproduce the infinite loop. | --- gdb/infrun.c | +++ gdb/infrun.c | @@ -4494,0 +4500,19 @@ stop_all_threads (void) | + | + /* Set the threads as non-executing to avoid infinitely | + waiting for them to stop. */ | + mark_non_executing_threads (event_ptid, ws); | + | + if (ws.kind == TARGET_WAITKIND_NO_RESUMED) | + { | + /* Do nothing. Already marked the threads. */ | + } | + if (ws.kind == TARGET_WAITKIND_THREAD_EXITED) PS2, Line 4509: Oops, yes. Thanks for spotting that. | + delete_thread (t); | + else | + { | + /* TARGET_WAITKIND_EXITED or | + TARGET_WAITKIND_SIGNALLED. */ | + /* Need to restore the context because | + handle_inferior_exit switches it. */ | + scoped_restore_current_pspace_and_thread restore; | + handle_inferior_exit (event_ptid, ws);
Tom de Vries has posted comments on this change. Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 ...................................................................... Patch Set 2: > Patch Set 2: > > (1 comment) > > > Patch Set 2: > > > > FTR, I build patchset 2 and tested the example from PR25478, and hit an assert: > > ... > > I was not able to repeat the problematic scenario with the current master. Do you mean with 'problematic scenario' PR25478 as such, or the assert?
Tom de Vries has posted comments on this change. Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 ...................................................................... Patch Set 2: > Patch Set 2: > > FTR, I build patchset 2 and tested the example from PR25478, and hit an assert: > ... > Thread 1 "gdb" hit Breakpoint 1, stop_all_threads () at /data/gdb_versions/devel/src/gdb/infrun.c:4395 > 4395 int iterations = 0; > (master-gdb) c > Continuing. > infrun: stop_all_threads > infrun: stop_all_threads, pass=0, iterations=0 > infrun: Thread 0x7ffff7fa6740 (LWP 32692) not executing > infrun: Thread 0x7ffff77fe700 (LWP 32696) executing, need stop > infrun: target_wait (-1.0.0, status) = > infrun: 32692.32696.0 [Thread 0x7ffff77fe700 (LWP 32696)], > infrun: status->kind = signalled, signal = GDB_SIGNAL_KILL > infrun: stop_all_threads status->kind = signalled, signal = GDB_SIGNAL_KILL Thread 0x7ffff77fe700 (LWP 32696) > > Program terminated with signal SIGKILL, Killed. > The program no longer exists. > infrun: stop_all_threads, pass=1, iterations=1 > infrun: stop_all_threads done > thread.c:95: internal-error: thread_info* inferior_thread(): Assertion `tp' failed. > ... I've: - ported patch set 2 to master - integrated the fix in patch set 3 of https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/759 such that the assert above is fixed. - build and reg-tested on x86_64-linux Available at https://github.com/vries/gdb/tree/handle-already-exited-threads-when-attempting-to-stop-try-master-2 . Note: this does not address all the comments on patch set 2.
Tankut Baris Aktemur has posted comments on this change. Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 ...................................................................... Patch Set 2: > > > > > > FTR, I build patchset 2 and tested the example from PR25478, and hit an assert: > > > ... > > > > I was not able to repeat the problematic scenario with the current master. > > Do you mean with 'problematic scenario' PR25478 as such, or the assert? Sorry for the vagueness. I was referring to the test scenario I gave in the patch description. Like, scenarios that can be used for testing as in https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/134 without having to stop GDB from outside at 'stop_all_threads'. I just saw that you posted a rebased version of this patch. I'll look at that, thank you. Btw, I think this patch is kind of obsolete after Pedro's comments in https://sourceware.org/ml/gdb-patches/2020-01/msg00212.html because the patch causes two stop event reports at once. -Baris
Tom de Vries has posted comments on this change. Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 ...................................................................... Patch Set 2: > Patch Set 2: > I just saw that you posted a rebased version of this patch. I'll look at that, > thank you. That would be great, thanks. BTW, do you happen to have something more up-to-date than patch set 2? If so, could you share this?
Tankut Baris Aktemur has posted comments on this change.
Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133
......................................................................
Patch Set 2:
> BTW, do you happen to have something more up-to-date than patch set 2? If so, could you share this?
Yes. I need to revise the associated tests, and will post it as soon as I can -- hopefully today.
Tankut Baris Aktemur has posted comments on this change.
Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133
......................................................................
Patch Set 3:
> BTW, do you happen to have something more up-to-date than patch set 2? If so, could you share this?
I've uploaded patchset 3. This version uses pending waitstatuses and also includes fixes from
https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/759
I've not fully updated the testcases in
https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/143.
There are still unaddressed comments there.
This patchset assumes that an existing problem that is explained in
https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/766
is fixed. Otherwise the scenario described in the commit message is not reproduced.
Thanks
-Baris
Tom de Vries has posted comments on this change. Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 ...................................................................... Patch Set 3: > Patch Set 3: > > > BTW, do you happen to have something more up-to-date than patch set 2? If so, could you share this? > > I've uploaded patchset 3. This version uses pending waitstatuses and also includes fixes from > https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/759 > FTR, this also fixes the gdb/testsuite/gdb.threads/kill-in-stop-all-threads.exp test-case in patch set 4 of https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/759 .
Hi, GDB goes into an infinite loop when it attempts to stop a thread in 'stop_all_threads', if the thread has already died. This patch aims to fix this by handling waitstatuses that denote exiting. This is a port of https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 from Gerrit to email, because Gerrit is planned to be retired. This version of the patch is a revision of patchset 3 in Gerrit, hence I enumerated it as v4. The new revision simplifies the waitstatus handling in PATCH 2/2 and also merges the tests into that same patch (because it is more desirable that the tests accompany the fix rather than coming as a separate commit). PATCH 1/2 was already approved on Gerrit. For a previous discussion, see https://sourceware.org/ml/gdb-patches/2020-01/msg00212.html And also see https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/759 The problematic scenario explained and tested in PATCH 2/2 is not reproducible if the bug mentioned in https://sourceware.org/ml/gdb-patches/2020-02/msg00122.html is not addressed. Therefore the patch proposed here has to be applied on top of a solution for that. Such a branch is available at https://github.com/barisaktemur/gdb/tree/thread-exit-in-stop-all-threads-v4 Thanks. Baris Tankut Baris Aktemur (2): gdb/infrun: extract out a code piece into 'mark_non_executing_threads' function gdb/infrun: handle already-exited threads when attempting to stop gdb/infrun.c | 115 ++++++++++++++++--------- gdb/testsuite/gdb.multi/multi-exit.c | 22 +++++ gdb/testsuite/gdb.multi/multi-exit.exp | 81 +++++++++++++++++ gdb/testsuite/gdb.multi/multi-kill.c | 34 ++++++++ gdb/testsuite/gdb.multi/multi-kill.exp | 84 ++++++++++++++++++ 5 files changed, 296 insertions(+), 40 deletions(-) create mode 100644 gdb/testsuite/gdb.multi/multi-exit.c create mode 100644 gdb/testsuite/gdb.multi/multi-exit.exp create mode 100644 gdb/testsuite/gdb.multi/multi-kill.c create mode 100644 gdb/testsuite/gdb.multi/multi-kill.exp
Tankut Baris Aktemur has abandoned this change. ( https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/133 ) Change subject: infrun: handle already-exited threads when attempting to stop ...................................................................... Abandoned Migrated to https://sourceware.org/ml/gdb-patches/2020-02/msg00153.html
diff --git a/gdb/infrun.c b/gdb/infrun.c index 66a066f..01fcbf6 100644 --- a/gdb/infrun.c +++ b/gdb/infrun.c @@ -4383,6 +4383,15 @@ { /* All resumed threads exited or one thread/process exited/signalled. */ + thread_info *t = find_thread_ptid (event_ptid); + if (t != nullptr) + { + t->stop_requested = 0; + t->executing = 0; + t->resumed = 0; + t->control.may_range_step = 0; + t->state = THREAD_EXITED; + } } else {