From patchwork Mon Dec 12 20:30:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pedro Alves X-Patchwork-Id: 61852 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1ABEB3875B4B for ; Mon, 12 Dec 2022 20:34:47 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from mail-wr1-f44.google.com (mail-wr1-f44.google.com [209.85.221.44]) by sourceware.org (Postfix) with ESMTPS id 93B90384C369 for ; Mon, 12 Dec 2022 20:31:31 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 93B90384C369 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=palves.net Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-wr1-f44.google.com with SMTP id bx10so13478726wrb.0 for ; Mon, 12 Dec 2022 12:31:31 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=lvCk84JnjAQW+HEkdgIzNJgIjpf34BVPiJ+Yyzv5nFc=; b=n6rl5YMCqJ7yz5tjXlEsokT5yXotO9NEEZgc2T3W2v3y3qmGe/HHBTJZDRASeK1ljo ynWzLtbgx2ZrH+mS6fjZsbL1BL1lRp+aOJ8QPKOwDo0iT1tY5Iajp7FH+JPNvXJTx2rl 4AJuF/Nquf0NE8F1GUAlhoXDhKkgMeWJwTN0XBqtMsC2pQ/GtCm8N8Bxwx8uRgt+/q7T gl7E4/59q5N1Wu1Tuz/buuNiN7FN+Kkx29ljbPb8EDaE3xjJcQlrNhsBIHTZXhO5GFeq aYA+oQW0gjqK5roMuDkmtu8cVMVPgiIgF+zUrVQB4oozK5k31zWsNB+/oFEtPKA6LbRt 0mTA== X-Gm-Message-State: ANoB5pmn+8Et4ryMW8o+JB7hkldAWZRxDBBA78bfMiaeCUMjfeogCnh5 e8nZbFBj1h+VZcmI/Wdi4GxX7DTvbOKvdA== X-Google-Smtp-Source: AA0mqf729HmIBjiMMxEqMX1A73DamWyLICGOeBLv3Qjt7fUEDLCLJC5XNIqq5SmKB6TUIEOVrIEyfg== X-Received: by 2002:adf:f74d:0:b0:242:55e6:dc9 with SMTP id z13-20020adff74d000000b0024255e60dc9mr10409558wrp.8.1670877090864; Mon, 12 Dec 2022 12:31:30 -0800 (PST) Received: from localhost ([2001:8a0:f912:6700:afd9:8b6d:223f:6170]) by smtp.gmail.com with ESMTPSA id l5-20020a5d6685000000b0024194bba380sm9768414wru.22.2022.12.12.12.31.30 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 12 Dec 2022 12:31:30 -0800 (PST) From: Pedro Alves To: gdb-patches@sourceware.org Subject: [PATCH 24/31] Report thread exit event for leader if reporting thread exit events Date: Mon, 12 Dec 2022 20:30:54 +0000 Message-Id: <20221212203101.1034916-25-pedro@palves.net> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20221212203101.1034916-1-pedro@palves.net> References: <20221212203101.1034916-1-pedro@palves.net> MIME-Version: 1.0 X-Spam-Status: No, score=-10.5 required=5.0 tests=BAYES_00, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org Sender: "Gdb-patches" If GDB sets the GDB_THREAD_OPTION_EXIT option on a thread, then if the thread disappears from the thread list, GDB expects to shortly see a thread exit event for it. See e.g., here, in remote_target::update_thread_list(): /* Do not remove the thread if we've requested to be notified of its exit. For example, the thread may be displaced stepping, infrun will need to handle the exit event, and displaced stepping info is recorded in the thread object. If we deleted the thread now, we'd lose that info. */ if ((tp->thread_options () & GDB_THREAD_OPTION_EXIT) != 0) continue; There's one scenario that is deleting a thread from the remote/gdbserver thread list without ever reporting a corresponding thread exit event though -- check_zombie_leaders. This can lead to GDB getting confused. For example, with a following patch that enables GDB_THREAD_OPTION_EXIT whenever schedlock is enabled, we'd see this regression: $ make check RUNTESTFLAGS="--target_board=native-extended-gdbserver" TESTS="gdb.threads/no-unwaited-for-left.exp" ... Running src/gdb/testsuite/gdb.threads/no-unwaited-for-left.exp ... FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout) ... some more cascading FAILs ... gdb.log shows: (gdb) continue Continuing. FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout) A passing run would have resulted in: (gdb) continue Continuing. No unwaited-for children left. (gdb) PASS: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits note how the leader thread is not listed in the remote-reported XML thread list below: (gdb) set debug remote 1 (gdb) set debug infrun 1 (gdb) info threads Id Target Id Frame * 1 Thread 1163850.1163850 "no-unwaited-for" main () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/no-unwaited-for-left.c:65 3 Thread 1163850.1164130 "no-unwaited-for" [remote] Sending packet: $Hgp11c24a.11c362#39 (gdb) c Continuing. [infrun] clear_proceed_status_thread: 1163850.1163850.0 ... [infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=1, current thread [1163850.1163850.0] at 0x55555555534f [remote] Sending packet: $QPassSignals:#f3 [remote] Packet received: OK [remote] Sending packet: $QThreadOptions;3:p11c24a.11c24a#f3 [remote] Packet received: OK ... [infrun] target_set_thread_options: [options for Thread 1163850.1163850 are now 0x3] ... [infrun] do_target_resume: resume_ptid=1163850.1163850.0, step=0, sig=GDB_SIGNAL_0 [remote] Sending packet: $vCont;c:p11c24a.11c24a#98 [infrun] prepare_to_wait: prepare_to_wait [infrun] reset: reason=handling event [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target extended-remote [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target extended-remote [infrun] fetch_inferior_event: exit [infrun] fetch_inferior_event: enter [infrun] scoped_disable_commit_resumed: reason=handling event [infrun] random_pending_event_thread: None found. [remote] wait: enter [remote] Packet received: N [remote] wait: exit [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) = [infrun] print_target_wait_results: -1.0.0 [process -1], [infrun] print_target_wait_results: status->kind = NO_RESUMED [infrun] handle_inferior_event: status->kind = NO_RESUMED [remote] Sending packet: $Hgp0.0#ad [remote] Packet received: OK [remote] Sending packet: $qXfer:threads:read::0,1000#92 [remote] Packet received: l\n\n\n [infrun] handle_no_resumed: TARGET_WAITKIND_NO_RESUMED (ignoring: found resumed) ... ... however, infrun decided there was a resumed thread still, so ignored the TARGET_WAITKIND_NO_RESUMED event. Debugging GDB, we see that the "found resumed" thread that GDB finds, is the leader thread. Even though that thread is not on the remote-reported thread list, it is still on the GDB thread list, due to the special case in remote.c mentioned above. This commit addresses the issue by fixing GDBserver to report a thread exit event for the zombie leader too, i.e., making GDBserver respect the "if thread has GDB_THREAD_OPTION_EXIT set, report a thread exit" invariant. To do that, it takes a bit more code than one would imagine off hand, due to the fact that we currently always report LWP exit pending events as TARGET_WAITKIND_EXITED, and then decide whether to convert it to TARGET_WAITKIND_THREAD_EXITED just before reporting the event to GDBserver core. For the zombie leader scenario described, we need to record early on that we want to report a THREAD_EXITED event, and then make sure that decision isn't lost along the way to reporting the event to GDBserver core. Change-Id: I1e68fccdbc9534434dee07163d3fd19744c8403b Reviewed-By: Andrew Burgess --- gdbserver/linux-low.cc | 75 ++++++++++++++++++++++++++++++++++++------ gdbserver/linux-low.h | 5 +-- 2 files changed, 68 insertions(+), 12 deletions(-) diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc index bf6dc1d995a..c197846810c 100644 --- a/gdbserver/linux-low.cc +++ b/gdbserver/linux-low.cc @@ -279,7 +279,8 @@ int using_threads = 1; static int stabilizing_threads; static void unsuspend_all_lwps (struct lwp_info *except); -static void mark_lwp_dead (struct lwp_info *lwp, int wstat); +static void mark_lwp_dead (struct lwp_info *lwp, int wstat, + bool thread_event); static int lwp_is_marked_dead (struct lwp_info *lwp); static int kill_lwp (unsigned long lwpid, int signo); static void enqueue_pending_signal (struct lwp_info *lwp, int signal, siginfo_t *info); @@ -1800,10 +1801,12 @@ iterate_over_lwps (ptid_t filter, return get_thread_lwp (thread); } -void +bool linux_process_target::check_zombie_leaders () { - for_each_process ([this] (process_info *proc) + bool new_pending_event = false; + + for_each_process ([&] (process_info *proc) { pid_t leader_pid = pid_of (proc); lwp_info *leader_lp = find_lwp_pid (ptid_t (leader_pid)); @@ -1872,9 +1875,19 @@ linux_process_target::check_zombie_leaders () "(it exited, or another thread execd), " "deleting it.", leader_pid); - delete_lwp (leader_lp); + + thread_info *leader_thread = get_lwp_thread (leader_lp); + if (report_exit_events_for (leader_thread)) + { + mark_lwp_dead (leader_lp, W_EXITCODE (0, 0), true); + new_pending_event = true; + } + else + delete_lwp (leader_lp); } }); + + return new_pending_event; } /* Callback for `find_thread'. Returns the first LWP that is not @@ -2333,7 +2346,7 @@ linux_process_target::filter_event (int lwpid, int wstat) /* Since events are serialized to GDB core, and we can't report this one right now. Leave the status pending for the next time we're able to report it. */ - mark_lwp_dead (child, wstat); + mark_lwp_dead (child, wstat, false); return; } else @@ -2646,7 +2659,8 @@ linux_process_target::wait_for_event_filtered (ptid_t wait_ptid, /* Check for zombie thread group leaders. Those can't be reaped until all other threads in the thread group are. */ - check_zombie_leaders (); + if (check_zombie_leaders ()) + goto retry; auto not_stopped = [&] (thread_info *thread) { @@ -2893,6 +2907,17 @@ linux_process_target::filter_exit_event (lwp_info *event_child, struct thread_info *thread = get_lwp_thread (event_child); ptid_t ptid = ptid_of (thread); + if (ourstatus->kind () == TARGET_WAITKIND_THREAD_EXITED) + { + /* We're reporting a thread exit for the leader. The exit was + detected by check_zombie_leaders. */ + gdb_assert (is_leader (thread)); + gdb_assert (report_exit_events_for (thread)); + + delete_lwp (event_child); + return ptid; + } + /* Note we must filter TARGET_WAITKIND_SIGNALLED as well, otherwise if a non-leader thread exits with a signal, we'd report it to the core which would interpret it as the whole-process exiting. @@ -3012,7 +3037,20 @@ linux_process_target::wait_1 (ptid_t ptid, target_waitstatus *ourstatus, { if (WIFEXITED (w)) { - ourstatus->set_exited (WEXITSTATUS (w)); + /* If we already have the exit recorded in waitstatus, use + it. This will happen when we detect a zombie leader, + when we had GDB_THREAD_OPTION_EXIT enabled for it. We + want to report its exit as TARGET_WAITKIND_THREAD_EXITED, + as the whole process hasn't exited yet. */ + const target_waitstatus &ws = event_child->waitstatus; + if (ws.kind () != TARGET_WAITKIND_IGNORE) + { + gdb_assert (ws.kind () == TARGET_WAITKIND_EXITED + || ws.kind () == TARGET_WAITKIND_THREAD_EXITED); + *ourstatus = ws; + } + else + ourstatus->set_exited (WEXITSTATUS (w)); threads_debug_printf ("ret = %s, exited with retcode %d", @@ -3720,8 +3758,15 @@ suspend_and_send_sigstop (thread_info *thread, lwp_info *except) send_sigstop (thread, except); } +/* Mark LWP dead, with WSTAT as exit status pending to report later. + If THREAD_EVENT is true, interpret WSTAT as a thread exit event + instead of a process exit event. This is meaningful for the leader + thread, as we normally report a process-wide exit event when we see + the leader exit, and a thread exit event when we see any other + thread exit. */ + static void -mark_lwp_dead (struct lwp_info *lwp, int wstat) +mark_lwp_dead (struct lwp_info *lwp, int wstat, bool thread_event) { /* Store the exit status for later. */ lwp->status_pending_p = 1; @@ -3730,9 +3775,19 @@ mark_lwp_dead (struct lwp_info *lwp, int wstat) /* Store in waitstatus as well, as there's nothing else to process for this event. */ if (WIFEXITED (wstat)) - lwp->waitstatus.set_exited (WEXITSTATUS (wstat)); + { + if (thread_event) + lwp->waitstatus.set_thread_exited (WEXITSTATUS (wstat)); + else + lwp->waitstatus.set_exited (WEXITSTATUS (wstat)); + } else if (WIFSIGNALED (wstat)) - lwp->waitstatus.set_signalled (gdb_signal_from_host (WTERMSIG (wstat))); + { + gdb_assert (!thread_event); + lwp->waitstatus.set_signalled (gdb_signal_from_host (WTERMSIG (wstat))); + } + else + gdb_assert_not_reached ("unknown status kind"); /* Prevent trying to stop it. */ lwp->stopped = 1; diff --git a/gdbserver/linux-low.h b/gdbserver/linux-low.h index 33a14e15173..ffbc3c6f095 100644 --- a/gdbserver/linux-low.h +++ b/gdbserver/linux-low.h @@ -574,8 +574,9 @@ class linux_process_target : public process_stratum_target /* Detect zombie thread group leaders, and "exit" them. We can't reap their exits until all other threads in the group have - exited. */ - void check_zombie_leaders (); + exited. Returns true if we left any new event pending, false + otherwise. */ + bool check_zombie_leaders (); /* Convenience function that is called when we're about to return an event to the core. If the event is an exit or signalled event,