From patchwork Mon Nov 13 15:04:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pedro Alves X-Patchwork-Id: 79765 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id CDCF4392C29A for ; Mon, 13 Nov 2023 15:08:28 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from mail-wm1-f44.google.com (mail-wm1-f44.google.com [209.85.128.44]) by sourceware.org (Postfix) with ESMTPS id 6514638754BD for ; Mon, 13 Nov 2023 15:05:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6514638754BD Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=palves.net Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 6514638754BD Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=209.85.128.44 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699887954; cv=none; b=vbFBDYGUl4t9ygYjv8vd1twNDIxXdnbZmVEckUhYtTB7SnVtR+brlKJ1IzArcv2pXxT4k66tv2BWIKG0nDQav4sXV8Ua1f4+7Ct9awskpiZYpdjGz1NXuk5XTjNYc9BAZzexzpUa3wFViNNZo2WxcxE8lXejEeWf5WEJgU+bY1k= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699887954; c=relaxed/simple; bh=RGnyclKp8nI+LE2XRzc1pjueXKTD5aCeyMc3JI5Djl4=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=g9l5Xlsv/yTDGeuIn0lf+YrlstoZKAufu8T/YtyqNITnXesHiMvB0Xj8eUMB8ky3sZX69WeYd9cHQG8a+vYsFqca3auxSevmRBQHeMvgIjU2kH0QuvBYLo3RDkfy9HYa5ojptew0e4KtPpA6+7FazkQHtWKVU0Qfdp9tRatlHvE= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-wm1-f44.google.com with SMTP id 5b1f17b1804b1-407da05f05aso33758525e9.3 for ; Mon, 13 Nov 2023 07:05:52 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699887951; x=1700492751; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=wpF/jAcD+7CsJhFKskgqnvhOrR5HYB/VSs75klJ29Zw=; b=UJoOLM6UG/f/DJZWkLIHFc1ZwbrU85arlLy0atnNVqHFW2mxcWJRx1sl/FTqpG1aTR ghJjnU+PiylrBQQeTf0i1uBOnfJqZAQp3AAQ8eywXzKl/rJdLnmw5AGt+BtlY42MENnW br6nH7DDcv1oR5p9TDpBzzQ/+WcBhDUspnKl98x1ZnraLqhoPAEVHXLXr+rzDRdTKPCH 00tj41b1veQlQFhXzdYJbiM+exa7r8mI8Hilv7YwtbJkO0JTbgljvM3JYy67uKPTZ77v Y1gN+HqL9yOM/YXwrs1edEeG1qiEmm6og4qvoJWWluBzrPeCL8ZzPNb/z/5uDiPNcOfC qMwQ== X-Gm-Message-State: AOJu0YwXZnyvKDC52+rpysdfeKkBUppihhqpDu20Q63g6pXXKahV/jnF 7Yho/bSH1fUvelyUITU/ddVOOtSb16M= X-Google-Smtp-Source: AGHT+IEuiql5qFZfcYPPZmLADgNVlbm4nINsa034mDNf6XoOoYJIco0R9hxPcS7+Z5jlZCos6md3Hw== X-Received: by 2002:a05:600c:2e4c:b0:408:3f87:cba with SMTP id q12-20020a05600c2e4c00b004083f870cbamr5759357wmf.39.1699887950629; Mon, 13 Nov 2023 07:05:50 -0800 (PST) Received: from localhost ([2001:8a0:f91e:1a00:8060:1e54:fb28:9635]) by smtp.gmail.com with UTF8SMTPSA id p11-20020a05600c1d8b00b003fe15ac0934sm9366889wms.1.2023.11.13.07.05.50 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 13 Nov 2023 07:05:50 -0800 (PST) From: Pedro Alves To: gdb-patches@sourceware.org Cc: Andrew Burgess Subject: [FYI/pushed v4 18/25] stop_all_threads: (re-)enable async before waiting for stops Date: Mon, 13 Nov 2023 15:04:20 +0000 Message-Id: <20231113150427.477431-19-pedro@palves.net> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231113150427.477431-1-pedro@palves.net> References: <20231113150427.477431-1-pedro@palves.net> MIME-Version: 1.0 X-Spam-Status: No, score=-10.1 required=5.0 tests=BAYES_00, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org Running the gdb.threads/step-over-thread-exit-while-stop-all-threads.exp testcase added later in the series against gdbserver, after the TARGET_WAITKIND_NO_RESUMED fix from the following patch, would run into an infinite loop in stop_all_threads, leading to a timeout: FAIL: gdb.threads/step-over-thread-exit-while-stop-all-threads.exp: displaced-stepping=off: target-non-stop=on: iter 0: continue (timeout) The is really a latent bug, and it is about the fact that stop_all_threads stops listening to events from a target as soon as it sees a TARGET_WAITKIND_NO_RESUMED, ignoring that TARGET_WAITKIND_NO_RESUMED may be delayed. handle_no_resumed knows how to handle delayed no-resumed events, but stop_all_threads was never taught to. In more detail, here's what happens with that testcase: #1 - Multiple threads report breakpoint hits to gdb. #2 - gdb picks one events, and it's for thread 1. All other stops are left pending. thread 1 needs to move past a breakpoint, so gdb stops all threads to start an inline step over for thread 1. While stopping threads, some of the threads that were still running report events that are also left pending. #2 - gdb steps thread 1 #3 - Thread 1 exits while stepping (it steps over an exit syscall), gdbserver reports thread exit for thread 1 #4 - Thread 1 was the last resumed thread, so gdbserver also reports no-resumed: [remote] Notification received: Stop:w0;p3445d0.3445d3 [remote] Sending packet: $vStopped#55 [remote] Packet received: N [remote] Sending packet: $vStopped#55 [remote] Packet received: OK #5 - gdb processes the thread exit for thread 1, finishes the step over and restarts threads. #6 - gdb picks the next event to process out of one of the resumed threads with pending events: [infrun] random_resumed_with_pending_wait_status: Found 32 events, selecting #11 #7 - This is again a breakpoint hit and the breakpoint needs to be stepped over too, so gdb starts a step-over dance again. #8 - We reach stop_all_threads, which finds that some threads need to be stopped. #9 - wait_one finally consumes the no-resumed event queue by #4. Seeing this, wait_one disable target async, to stop listening for events out of the remote target. #10 - We still haven't seen all the stops expected, so stop_all_threads tries another iteration. #11 - Because the remote target is no longer async, and there are no other targets, wait_one return no-resumed immediately without polling the remote target. #12 - We still haven't seen all the stops expected, so stop_all_threads tries another iteration. goto #11, looping forever. Fix this by explicitly enabling/re-enabling target async on targets that can async, before waiting for stops. Reviewed-By: Andrew Burgess Change-Id: Ie3ffb0df89635585a6631aa842689cecc989e33f --- gdb/infrun.c | 81 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 81 insertions(+) diff --git a/gdb/infrun.c b/gdb/infrun.c index e0125e11b32..409c2249b13 100644 --- a/gdb/infrun.c +++ b/gdb/infrun.c @@ -5202,6 +5202,8 @@ wait_one () if (nfds == 0) { /* No waitable targets left. All must be stopped. */ + infrun_debug_printf ("no waitable targets left"); + target_waitstatus ws; ws.set_no_resumed (); return {nullptr, minus_one_ptid, std::move (ws)}; @@ -5461,6 +5463,83 @@ handle_one (const wait_one_event &event) return false; } +/* Helper for stop_all_threads. wait_one waits for events until it + sees a TARGET_WAITKIND_NO_RESUMED event. When it sees one, it + disables target_async for the target to stop waiting for events + from it. TARGET_WAITKIND_NO_RESUMED can be delayed though, + consider, debugging against gdbserver: + + #1 - Threads 1-5 are running, and thread 1 hits a breakpoint. + + #2 - gdb processes the breakpoint hit for thread 1, stops all + threads, and steps thread 1 over the breakpoint. while + stopping threads, some other threads reported interesting + events, which were left pending in the thread's objects + (infrun's queue). + + #2 - Thread 1 exits (it stepped an exit syscall), and gdbserver + reports the thread exit for thread 1. The event ends up in + remote's stop reply queue. + + #3 - That was the last resumed thread, so gdbserver reports + no-resumed, and that event also ends up in remote's stop + reply queue, queued after the thread exit from #2. + + #4 - gdb processes the thread exit event, which finishes the + step-over, and so gdb restarts all threads (threads with + pending events are left marked resumed, but aren't set + executing). The no-resumed event is still left pending in + the remote stop reply queue. + + #5 - Since there are now resumed threads with pending breakpoint + hits, gdb picks one at random to process next. + + #5 - gdb picks the breakpoint hit for thread 2 this time, and that + breakpoint also needs to be stepped over, so gdb stops all + threads again. + + #6 - stop_all_threads counts number of expected stops and calls + wait_one once for each. + + #7 - The first wait_one call collects the no-resumed event from #3 + above. + + #9 - Seeing the no-resumed event, wait_one disables target async + for the remote target, to stop waiting for events from it. + wait_one from here on always return no-resumed directly + without reaching the target. + + #10 - stop_all_threads still hasn't seen all the stops it expects, + so it does another pass. + + #11 - Since the remote target is not async (disabled in #9), + wait_one doesn't wait on it, so it won't see the expected + stops, and instead returns no-resumed directly. + + #12 - stop_all_threads still haven't seen all the stops, so it + does another pass. goto #11, looping forever. + + To handle this, we explicitly (re-)enable target async on all + targets that can async every time stop_all_threads goes wait for + the expected stops. */ + +static void +reenable_target_async () +{ + for (inferior *inf : all_inferiors ()) + { + process_stratum_target *target = inf->process_target (); + if (target != nullptr + && target->threads_executing + && target->can_async_p () + && !target->is_async_p ()) + { + switch_to_inferior_no_thread (inf); + target_async (1); + } + } +} + /* See infrun.h. */ void @@ -5587,6 +5666,8 @@ stop_all_threads (const char *reason, inferior *inf) if (pass > 0) pass = -1; + reenable_target_async (); + for (int i = 0; i < waits_needed; i++) { wait_one_event event = wait_one ();