From patchwork Thu Nov 20 05:12:23 2014
X-Patchwork-Submitter: Joel Brobecker
X-Patchwork-Id: 3807
Date: Thu, 20 Nov 2014 09:12:23 +0400
From: Joel Brobecker
To: gdb-patches@sourceware.org
Subject: Re: RFC: skip_inline_frames failed assertion resuming from breakpoint on LynxOS
Message-ID: <20141120051223.GA23720@adacore.com>
References: <20141120051109.GR5774@adacore.com>
In-Reply-To: <20141120051109.GR5774@adacore.com>

[Fixing
ENOPATCH... sigh.]

> I was wondering what you guys would think of a patch like this.
> I am a bit uncertain, because I don't understand everything
> that is happening - and the problem is that this is happening
> with a fairly massive and complex program that I don't have access
> to, on a system that is also fairly opaque. When I'm lucky, getting
> answers is only very hard.
>
> I am still trying to reproduce the problem locally in order to
> find out more, but I couldn't understand why, in principle,
> one thread couldn't receive multiple notifications during
> the same single-step if the system decides to queue up signals?
> If that were the case, wouldn't the attached patch make sense?
> (currently untested against the program that triggered the issue,
> as I think I understand how inline-frame works, and what it does,
> but I am not sure I get it all).

Thanks again!

From f7ad35aa92a7007194582b1e23a110fc06b50cd1 Mon Sep 17 00:00:00 2001
From: Joel Brobecker
Date: Thu, 20 Nov 2014 08:38:08 +0400
Subject: [PATCH] skip_inline_frames failed assertion resuming from breakpoint on LynxOS

A user reported a failed assertion while debugging their program
on a LynxOS system (thus via GDBserver), when trying to resume
the program's execution after having reached a breakpoint:

    (gdb) continue
    [...]
    ../../src/gdb/inline-frame.c:339: internal-error: skip_inline_frames: Assertion `find_inline_frame_state (ptid) == NULL' failed.

Turning on infrun debug traces helps understand a little better
what happens:

    (gdb) continue
    Continuing.
    infrun: clear_proceed_status_thread (Thread 126)
    [...]
    infrun: clear_proceed_status_thread (Thread 142)
    [...]
    infrun: clear_proceed_status_thread (Thread 146)
    infrun: clear_proceed_status_thread (Thread 125)
    infrun: proceed (addr=0xffffffff, signal=GDB_SIGNAL_DEFAULT, step=0)
    infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [Thread 142] at 0x10684838
    infrun: wait_for_inferior ()
    infrun: target_wait (-1, status) =
    infrun:   42000 [Thread 146],
    infrun:   status->kind = stopped, signal = GDB_SIGNAL_REALTIME_34
    infrun: infwait_normal_state
    infrun: TARGET_WAITKIND_STOPPED
    infrun: stop_pc = 0x10a187f4
    infrun: context switch
    infrun: Switching context from Thread 142 to Thread 146
    infrun: random signal (GDB_SIGNAL_REALTIME_34)
    infrun: switching back to stepped thread
    infrun: Switching context from Thread 146 to Thread 142
    infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [Thread 142] at 0x10684838
    infrun: prepare_to_wait
    [...handling of similar events for threads 145, 144 and 143 snipped...]
    infrun: prepare_to_wait
    infrun: target_wait (-1, status) =
    infrun:   42000 [Thread 146],
    infrun:   status->kind = stopped, signal = GDB_SIGNAL_REALTIME_34
    infrun: infwait_normal_state
    infrun: TARGET_WAITKIND_STOPPED
    infrun: stop_pc = 0x10a187f4
    infrun: context switch
    infrun: Switching context from Thread 142 to Thread 146
    ../../src/gdb/inline-frame.c:339: internal-error: skip_inline_frames: Assertion `find_inline_frame_state (ptid) == NULL' failed.

It all happens while we're trying to single-step out of the breakpoint.
We keep resuming the inferior trying to single-step the thread that
hit the breakpoint, but each time we get a notification that another
thread received a particular signal. This is OK until the same thread
actually received a signal a second time, without having actually run
further (same PC). That's when we hit the assertion in
skip_inline_frames.

This patch avoids the assertion by recognizing that a thread can
indeed potentially receive multiple events without changing PC,
and by therefore changing skip_inline_frames to return immediately
if we have already computed the inline_state for this thread's PC.

gdb/ChangeLog:

        * inline-frame.c (skip_inline_frames): Do not raise a failed
        assertion if find_inline_frame_state finds an inlined frame
        state for PTID. Return early instead.

Tested on x86_64-linux.
---
 gdb/inline-frame.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/gdb/inline-frame.c b/gdb/inline-frame.c
index cecb2af..c60820c 100644
--- a/gdb/inline-frame.c
+++ b/gdb/inline-frame.c
@@ -307,6 +307,24 @@ skip_inline_frames (ptid_t ptid)
   int skip_count = 0;
   struct inline_state *state;
 
+  if (find_inline_frame_state (ptid) != NULL)
+    {
+      /* This thread is receiving multiple notifications without
+         making progress in its execution (same PC).
+
+         This was seen happening on LynxOS where a program appears
+         to have a number of signals being queued then delivered
+         while trying to single-step a thread out of a breakpoint.
+         The single-step operation makes no progress until all signals
+         get delivered first, which can result in the same thread
+         receiving multiple signals during the same single-step
+         attempt.
+
+         We have already computed the inline_state for that thread,
+         so there is no need to redo it again.  */
+      return;
+    }
+
   /* This function is called right after reinitializing the frame
      cache.  We try not to do more unwinding than absolutely
      necessary, for performance.  */
@@ -335,7 +353,6 @@ skip_inline_frames (ptid_t ptid)
 	}
     }
 
-  gdb_assert (find_inline_frame_state (ptid) == NULL);
   state = allocate_inline_frame_state (ptid);
   state->skipped_frames = skip_count;
   state->saved_pc = this_pc;
-- 
1.9.1
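
For readers who do not have inline-frame.c at hand, the early-return
idea in the patch above can be sketched in isolation. This is only a
toy model, not GDB code: every toy_* name, the fixed-size table and
the skip counts are hypothetical and chosen for illustration. Only the
thread number (146) and stop PC (0x10a187f4) in the usage note come
from the trace. The sketch shows a per-thread state that is computed
on the first stop event and reused, rather than asserted absent, when
a second event arrives for the same thread before it has been resumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for GDB's per-thread inline state.  The real
   structure in gdb/inline-frame.c is different.  */
struct toy_inline_state
{
  int in_use;
  int thread_id;
  int skipped_frames;
  unsigned long saved_pc;
};

#define TOY_MAX_THREADS 8

struct toy_inline_state toy_states[TOY_MAX_THREADS];

/* Return the state recorded for THREAD_ID, or NULL if there is none.  */
struct toy_inline_state *
toy_find_state (int thread_id)
{
  for (size_t i = 0; i < TOY_MAX_THREADS; i++)
    if (toy_states[i].in_use && toy_states[i].thread_id == thread_id)
      return &toy_states[i];
  return NULL;
}

/* Reserve a free slot for THREAD_ID, or return NULL if the table is
   full.  */
struct toy_inline_state *
toy_allocate_state (int thread_id)
{
  for (size_t i = 0; i < TOY_MAX_THREADS; i++)
    if (!toy_states[i].in_use)
      {
        toy_states[i].in_use = 1;
        toy_states[i].thread_id = thread_id;
        return &toy_states[i];
      }
  return NULL;
}

/* Record the state for THREAD_ID stopped at PC.  Return 1 if the state
   was computed now, 0 if an earlier stop event already recorded it.  */
int
toy_skip_inline_frames (int thread_id, unsigned long pc, int skip_count)
{
  /* Early return instead of an assertion: a second event for a thread
     that has not run since the first one is tolerated.  */
  if (toy_find_state (thread_id) != NULL)
    return 0;

  struct toy_inline_state *state = toy_allocate_state (thread_id);
  assert (state != NULL);
  state->skipped_frames = skip_count;
  state->saved_pc = pc;
  return 1;
}

/* Forget the state for THREAD_ID, as would happen once the thread is
   resumed and allowed to make progress.  */
void
toy_clear_state (int thread_id)
{
  struct toy_inline_state *state = toy_find_state (thread_id);
  if (state != NULL)
    state->in_use = 0;
}
```

Calling toy_skip_inline_frames (146, 0x10a187f4, 2) twice in a row
returns 1 and then 0, and the recorded state is left untouched by the
second call. Like the patch, the guard keys on the thread only and
does not compare the saved PC, so it relies on the state being cleared
whenever the thread actually runs.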