From patchwork Sat Dec 13 15:46:38 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joel Brobecker <brobecker@adacore.com>
X-Patchwork-Id: 4233
Received: (qmail 14487 invoked by alias); 13 Dec 2014 15:46:42 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>,
	<http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Delivered-To: mailing list gdb-patches@sourceware.org
Received: (qmail 14473 invoked by uid 89); 13 Dec 2014 15:46:41 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.7 required=5.0 tests=AWL,
	BAYES_00 autolearn=ham version=3.3.2
X-HELO: rock.gnat.com
Received: from rock.gnat.com (HELO rock.gnat.com) (205.232.38.15) by
	sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with
	(AES256-SHA encrypted) ESMTPS; Sat, 13 Dec 2014 15:46:39 +0000
Received: from localhost (localhost.localdomain [127.0.0.1])	by
	filtered-rock.gnat.com (Postfix) with ESMTP id 675C2116478;
	Sat, 13 Dec 2014 10:46:37 -0500 (EST)
Received: from rock.gnat.com ([127.0.0.1])	by localhost (rock.gnat.com
	[127.0.0.1]) (amavisd-new, port 10024)	with LMTP id
	VJH+2Ob8fsqn; Sat, 13 Dec 2014 10:46:37 -0500 (EST)
Received: from joel.gnat.com (localhost.localdomain [127.0.0.1])	by
	rock.gnat.com (Postfix) with ESMTP id 564F9116474;
	Sat, 13 Dec 2014 10:46:37 -0500 (EST)
Received: by joel.gnat.com (Postfix, from userid 1000)	id 2174040164;
	Sat, 13 Dec 2014 10:46:38 -0500 (EST)
Date: Sat, 13 Dec 2014 10:46:38 -0500
From: Joel Brobecker <brobecker@adacore.com>
To: Pedro Alves <palves@redhat.com>
Cc: gdb-patches@sourceware.org
Subject: Re: RFC: skip_inline_frames failed assertion resuming from
	breakpoint on LynxOS
Message-ID: <20141213154638.GK5457@adacore.com>
References: <20141120051109.GR5774@adacore.com>
	<20141120051223.GA23720@adacore.com>
	<546DBA9F.9060600@redhat.com> <20141120171137.GT5774@adacore.com>
	<546F173A.3070204@redhat.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <546F173A.3070204@redhat.com>
User-Agent: Mutt/1.5.21 (2010-09-15)

Hi Pedro,

> The main issue is that we're trying to move the thread past a
> breakpoint.  Barring displaced stepping support, to move the
> thread past the breakpoint, we have to remove the breakpoint from
> the target temporarily.  But then we _cannot_ resume other threads
> but the one that is stopped at the breakpoint, because then those
> other threads could fly by the removed breakpoint and miss it.

Attached is a patch that does just that, tested on ppc-lynx5 and
ppc-lynx178.  I waited a while before posting it here, because
I wanted to put it in observation for a while first...

gdb/gdbserver/ChangeLog:

        * lynx-low.c (lynx_resume): Use PTRACE_SINGLESTEP_ONE if N == 1.
        Remove FIXME comment about assumption about N.

OK to commit?

Note that parallel to that, I came across another issue, which I am
going to call a limitation for now: consider the case where we have
2 threads, A and B, and we are tring to next/step some code in thread
A. While doing so, thread B receives a signal, and therefore reports
it to GDB. GDB sees that this signal is configured as
nostop/noprint/pass, so presumably, you would think that we'd resume
the inferior passing that signal to thread B. However, how do you do
that while at the same time stepping thread A?

IIRC, what happens currently in this case is that GDB keeps trying
to resume/step thread A, and the kernel keeps telling GDB "no,
thread B just received a signal", and so GDB and the kernel go
into that infinite loop where nothing advances. I'm not quite sure
why we keep getting the signal for thread B, if it's a new signal
each time, or if it's about the signal not being passed back (the
program I saw this in is fairly large and complicated).

In any case, I don't see how we could improve this situation
without settting sss-like breakpoints... Something I'm not really
eager to do, at least for now, since "set scheduler-locking step"
seems to work around the issue.

Thanks!

From ea7e173463120d24417a7706f98fff850f9aaa1a Mon Sep 17 00:00:00 2001
From: Joel Brobecker <brobecker@adacore.com>
Date: Tue, 25 Nov 2014 11:12:10 -0500
Subject: [PATCH] [gdbserver/lynxos] Use PTRACE_SINGLESTEP_ONE when
 single-stepping one thread.

Currently, when we receive a request to single-step one single thread
(Eg, when single-stepping out of a breakpoint), we use the
PTRACE_SINGLESTEP pthread request, which does single-step
the corresponding thread, but also resumes execution of all
other threads in the inferior.

This causes problems when debugging programs where another thread
receives multiple debug events while trying to single-step a specific
thread out of a breakpoint (with infrun traces turned on):

    (gdb) continue
    Continuing.
    infrun: clear_proceed_status_thread (Thread 126)
    [...]
    infrun: clear_proceed_status_thread (Thread 142)
    [...]
    infrun: clear_proceed_status_thread (Thread 146)
    infrun: clear_proceed_status_thread (Thread 125)
    infrun: proceed (addr=0xffffffff, signal=GDB_SIGNAL_DEFAULT, step=0)
    infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [Thread 142] at 0x10684838
    infrun: wait_for_inferior ()
    infrun: target_wait (-1, status) =
    infrun:   42000 [Thread 146],
    infrun:   status->kind = stopped, signal = GDB_SIGNAL_REALTIME_34
    infrun: infwait_normal_state
    infrun: TARGET_WAITKIND_STOPPED
    infrun: stop_pc = 0x10a187f4
    infrun: context switch
    infrun: Switching context from Thread 142 to Thread 146
    infrun: random signal (GDB_SIGNAL_REALTIME_34)
    infrun: switching back to stepped thread
    infrun: Switching context from Thread 146 to Thread 142
    infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [Thread 142] at 0x10684838
    infrun: prepare_to_wait
    [...handling of similar events for threads 145, 144 and 143 snipped...]
    infrun: prepare_to_wait
    infrun: target_wait (-1, status) =
    infrun:   42000 [Thread 146],
    infrun:   status->kind = stopped, signal = GDB_SIGNAL_REALTIME_34
    infrun: infwait_normal_state
    infrun: TARGET_WAITKIND_STOPPED
    infrun: stop_pc = 0x10a187f4
    infrun: context switch
    infrun: Switching context from Thread 142 to Thread 146
    ../../src/gdb/inline-frame.c:339: internal-error: skip_inline_frames: Assertion `find_inline_frame_state (ptid) == NULL' failed.

What happens is that GDB keeps sending requests to resume one specific
thread, and keeps receiving debugging events for other threads.
Things break down when the one of the other threads receives a debug
event for the second time (thread 146 in the example above).

This patch fixes the problem by making sure that only one thread
gets resumed, thus preventing the other threads from generating
an unexpected event.

gdb/gdbserver/ChangeLog:

        * lynx-low.c (lynx_resume): Use PTRACE_SINGLESTEP_ONE if N == 1.
        Remove FIXME comment about assumption about N.
---
 gdb/gdbserver/lynx-low.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gdb/gdbserver/lynx-low.c b/gdb/gdbserver/lynx-low.c
index 6178e03..3b83669 100644
--- a/gdb/gdbserver/lynx-low.c
+++ b/gdb/gdbserver/lynx-low.c
@@ -320,10 +320,11 @@ lynx_attach (unsigned long pid)
 static void
 lynx_resume (struct thread_resume *resume_info, size_t n)
 {
-  /* FIXME: Assume for now that n == 1.  */
   ptid_t ptid = resume_info[0].thread;
-  const int request = (resume_info[0].kind == resume_step
-                       ? PTRACE_SINGLESTEP : PTRACE_CONT);
+  const int request
+    = (resume_info[0].kind == resume_step
+       ? (n == 1 ? PTRACE_SINGLESTEP_ONE : PTRACE_SINGLESTEP)
+       : PTRACE_CONT);
   const int signal = resume_info[0].sig;
 
   /* If given a minus_one_ptid, then try using the current_process'
-- 
1.9.1