Message ID | 20190207094016.368-9-tom@tromey.com |
---|---|
State | New, archived |
Headers |
From: Tom Tromey <tom@tromey.com>
To: gdb-patches@sourceware.org
Cc: Tom Tromey <tromey@adacore.com>
Subject: [PATCH 8/8] Special-case wildcard requests in ravenscar-thread.c
Date: Thu, 7 Feb 2019 02:40:16 -0700
Message-Id: <20190207094016.368-9-tom@tromey.com>
In-Reply-To: <20190207094016.368-1-tom@tromey.com>
References: <20190207094016.368-1-tom@tromey.com>
Commit Message
Tom Tromey
Feb. 7, 2019, 9:40 a.m. UTC
From: Tom Tromey <tromey@adacore.com>
ravenscar-thread.c intercepts resume and wait target requests and
replaces the requested ptid with the ptid of the underlying CPU.
However, this is incorrect when a request is made with a wildcard
ptid.
This patch adds a special case to ravenscar-thread.c for
minus_one_ptid. I don't believe a special case for process wildcards
is necessary, so I have not added that.
gdb/ChangeLog
2019-02-07 Tom Tromey <tromey@adacore.com>
* ravenscar-thread.c (ravenscar_thread_target::resume)
(ravenscar_thread_target::wait): Special case wildcard requests.
---
gdb/ChangeLog | 5 +++++
gdb/ravenscar-thread.c | 20 ++++++++++++++++----
2 files changed, 21 insertions(+), 4 deletions(-)
Comments
On 02/07/2019 09:40 AM, Tom Tromey wrote:
> From: Tom Tromey <tromey@adacore.com>
>
> ravenscar-thread.c intercepts resume and wait target requests and
> replaces the requested ptid with the ptid of the underlying CPU.
> However, this is incorrect when a request is made with a wildcard
> ptid.

Can you clarify a bit more what went wrong?
IIRC, the base target always has one thread/cpu only, anyway?  What
difference does the patch make in practice?

The reason I'm wondering is because the patch makes me wonder about
a step request with no sched-lock, which is the default (*).
In that case, you'll have:

- the thread to step is in inferior_ptid
- ptid == minus_one_ptid (indicating everything resumes / no schedlock)

So seems like after this patch the lower layer might get a request to step
an unknown inferior_ptid?  Might not happen by luck/accident.
I'd suspect that you'd want to do instead:

  inferior_ptid = m_base_ptid;
  /* If we see a wildcard resume, we simply pass that on.  Otherwise,
     arrange to resume the base ptid.  */
  if (ptid != minus_one_ptid)
    ptid = m_base_ptid;
  beneath ()->resume (ptid, step, siggnal);

I.e., always flip inferior_ptid.

(*) since ravenscar-thread.c doesn't know how to interact with
the ravenscar runtime to specify which threads are resumable,
"schedlock on" most certainly doesn't work properly.  E.g.,
"task 1 stops; set scheduler-locking on; task 2; step"
seemingly will step task 1 instead of 2, AFAICT.

Thanks,
Pedro Alves
> > ravenscar-thread.c intercepts resume and wait target requests and
> > replaces the requested ptid with the ptid of the underlying CPU.
> > However, this is incorrect when a request is made with a wildcard
> > ptid.
>
> Can you clarify a bit more what went wrong?
> IIRC, the base target always has one thread/cpu only, anyway?  What
> difference does the patch make in practice?

This happens when debugging an application on a multi-CPU target,
using QEMU, where some tasks are allocated to one CPU, and others
are allocated to the other.

When stopping, QEMU tells us about one thread per CPU, so we start
with one possible base_ptid per CPU.  Previously, sending a resume
order for one thread actually resumed execution on all the CPUs.
But during an upgrade, the behavior was changed so that sending
a resume order for one thread only wakes the corresponding CPU up.

At the user level, we noticed the issue because we had a test where
we insert a breakpoint on some code which is only run from, say,
CPU #2, whereas we unfortunately resumed the execution after having
stopped somewhere in CPU #1.  As a result, we sent an order to resume
CPU #1, which starves CPU #2 forever, because the code in CPU #1
waits for some of the Ada tasks allocated to CPU #2 (and we never
reach our breakpoint either).

> The reason I'm wondering is because the patch makes me wonder about
> a step request with no sched-lock, which is the default (*).
> In that case, you'll have:
>
> - the thread to step is in inferior_ptid
> - ptid == minus_one_ptid (indicating everything resumes / no schedlock)
>
> So seems like after this patch the lower layer might get a request to step
> an unknown inferior_ptid?  Might not happen by luck/accident.
> I'd suspect that you'd want to do instead:
>
>   inferior_ptid = m_base_ptid;
>   /* If we see a wildcard resume, we simply pass that on.  Otherwise,
>      arrange to resume the base ptid.  */
>   if (ptid != minus_one_ptid)
>     ptid = m_base_ptid;
>   beneath ()->resume (ptid, step, siggnal);
>
> I.e., always flip inferior_ptid.
>
> (*) since ravenscar-thread.c doesn't know how to interact with
> the ravenscar runtime to specify which threads are resumable,
> "schedlock on" most certainly doesn't work properly.  E.g.,
> "task 1 stops; set scheduler-locking on; task 2; step"
> seemingly will step task 1 instead of 2, AFAICT.

That's correct (unless you are in the particular situation where
you have one task per CPU).
On 02/09/2019 04:41 AM, Joel Brobecker wrote:
>>> ravenscar-thread.c intercepts resume and wait target requests and
>>> replaces the requested ptid with the ptid of the underlying CPU.
>>> However, this is incorrect when a request is made with a wildcard
>>> ptid.
>>
>> Can you clarify a bit more what went wrong?
>> IIRC, the base target always has one thread/cpu only, anyway?  What
>> difference does the patch make in practice?
>
> This happens when debugging an application on a multi-CPU target,
> using QEMU, where some tasks are allocated to one CPU, and others
> are allocated to the other.
>
> When stopping, QEMU tells us about one thread per CPU, so we start
> with one possible base_ptid per CPU.  Previously, sending a resume
> order for one thread actually resumed execution on all the CPUs.
> But during an upgrade, the behavior was changed so that sending
> a resume order for one thread only wakes the corresponding CPU up.

Interesting -- that clarifies the "why changed that made this
noticeable now" question I had in my mind.  It's the sort of thing
that I think is worth it to include in the commit log.

> At the user level, we noticed the issue because we had a test where
> we insert a breakpoint on some code which is only run from, say,
> CPU #2, whereas we unfortunately resumed the execution after having
> stopped somewhere in CPU #1.  As a result, we sent an order to resume
> CPU #1, which starves CPU #2 forever, because the code in CPU #1
> waits for some of the Ada tasks allocated to CPU #2 (and we never
> reach our breakpoint either).

I see.  For some reason I recalled that Ravenscar didn't support SMP.

>> The reason I'm wondering is because the patch makes me wonder about
>> a step request with no sched-lock, which is the default (*).
>> In that case, you'll have:
>>
>> - the thread to step is in inferior_ptid
>> - ptid == minus_one_ptid (indicating everything resumes / no schedlock)
>>
>> So seems like after this patch the lower layer might get a request to step
>> an unknown inferior_ptid?  Might not happen by luck/accident.
>> I'd suspect that you'd want to do instead:
>>
>>   inferior_ptid = m_base_ptid;
>>   /* If we see a wildcard resume, we simply pass that on.  Otherwise,
>>      arrange to resume the base ptid.  */
>>   if (ptid != minus_one_ptid)
>>     ptid = m_base_ptid;
>>   beneath ()->resume (ptid, step, siggnal);
>>
>> I.e., always flip inferior_ptid.

This comment still applies, I think.

BTW, the main reason I asked is because I have a change in the
multi-target branch that makes infrun switch inferior_ptid to
null_ptid before handling events, which exposed a number of bad
assumptions that could lead to accessing the wrong target.
Ravenscar may end up affected, but I'm not certain.

>> (*) since ravenscar-thread.c doesn't know how to interact with
>> the ravenscar runtime to specify which threads are resumable,
>> "schedlock on" most certainly doesn't work properly.  E.g.,
>> "task 1 stops; set scheduler-locking on; task 2; step"
>> seemingly will step task 1 instead of 2, AFAICT.
>
> That's correct (unless you are in the particular situation where
> you have one task per CPU).

Thanks,
Pedro Alves
>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:
Pedro> The reason I'm wondering is because the patch makes me wonder about
Pedro> a step request with no sched-lock, which is the default (*).
Pedro> In that case, you'll have:
I think the default is "replay", which is normally like "off" -- but
there's a Fedora patch to change the default to "step"?
Anyway, not super important...
Pedro> So seems like after this patch the lower layer might get a request to step
Pedro> an unknown inferior_ptid? Might not happen by luck/accident.
Pedro> I'd suspect that you'd want to do instead:
[...]
Yes, I agree. Thanks for pointing this out. I've made this change.
Tom
>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:
Pedro> Interesting -- that clarifies the "why changed that made this
Pedro> noticeable now" question I had in my mind. It's the sort of thing that
Pedro> I think is worth it to include in the commit log.
I'm copying Joel's text into the commit message :)
Pedro> BTW, the main reason I asked is because I have a change in the multi-target
Pedro> branch that makes infrun switch inferior_ptid to null_ptid before
Pedro> handling events, which exposed a number of bad assumptions that could lead
Pedro> to accessing the wrong target. Ravenscar may end up affected, but I'm
Pedro> not certain.
FWIW, ravenscar-thread.c doesn't seem to intrinsically care about
inferior_ptid -- it is just setting it for the benefit of any lower
layers on the target stack. So, if the rest of the stack is changed so
as not to rely on this, then ravenscar-thread can follow.
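The layering Tom describes can be pictured with a toy model of a target stack (toy_target, toy_thread_layer, and the integer ids are hypothetical illustrations, not GDB's real target API): the thread layer carries no state keyed on the requested id; it only remaps the id and delegates to the layer beneath.

```cpp
#include <string>

// Toy model of a layered target stack; names are hypothetical, not
// the real GDB target API.  Each layer may delegate to the one
// beneath it.
struct toy_target
{
  toy_target *m_beneath = nullptr;
  toy_target *beneath () const { return m_beneath; }
  virtual ~toy_target () = default;

  // The base layer resumes whatever id it is handed.
  virtual std::string resume (int id)
  { return "base resumes " + std::to_string (id); }
};

// A thread layer that, like ravenscar-thread.c, remaps the requested
// id to the underlying CPU's id before delegating downward -- it does
// not itself depend on the incoming id.
struct toy_thread_layer : toy_target
{
  int m_base_id;
  toy_thread_layer (toy_target *under, int base_id) : m_base_id (base_id)
  { m_beneath = under; }

  std::string resume (int) override
  { return beneath ()->resume (m_base_id); }
};
```

In this sketch, whatever id the caller passes to the thread layer, the base layer only ever sees the remapped id, which is why the rest of the stack, not the thread layer, is what has to stop relying on inferior_ptid.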
I'm re-testing this series using the AdaCore internal test suite, which
exercises ravenscar-thread; I'll push them if this goes well.
thanks,
Tom
On 02/15/2019 08:39 PM, Tom Tromey wrote:
>>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:
>
> Pedro> The reason I'm wondering is because the patch makes me wonder about
> Pedro> a step request with no sched-lock, which is the default (*).
> Pedro> In that case, you'll have:
>
> I think the default is "replay", which is normally like "off"

Yes, replay is basically "off", except when you're replaying it
behaves like "step".  What I mean is that by default when you step
all threads run unlocked.

> -- but
> there's a Fedora patch to change the default to "step"?

Yeah.  On my list of patches to drop from Fedora, actually.

> Anyway, not super important...
>
> Pedro> So seems like after this patch the lower layer might get a request to step
> Pedro> an unknown inferior_ptid?  Might not happen by luck/accident.
> Pedro> I'd suspect that you'd want to do instead:
> [...]
>
> Yes, I agree.  Thanks for pointing this out.  I've made this change.

Thanks,
Pedro Alves
diff --git a/gdb/ravenscar-thread.c b/gdb/ravenscar-thread.c
index 2e8d76d4ae8..4b07b00aae2 100644
--- a/gdb/ravenscar-thread.c
+++ b/gdb/ravenscar-thread.c
@@ -323,8 +323,14 @@ void
 ravenscar_thread_target::resume (ptid_t ptid, int step,
 				 enum gdb_signal siggnal)
 {
-  inferior_ptid = m_base_ptid;
-  beneath ()->resume (m_base_ptid, step, siggnal);
+  /* If we see a wildcard resume, we simply pass that on.  Otherwise,
+     arrange to resume the base ptid.  */
+  if (ptid != minus_one_ptid)
+    {
+      inferior_ptid = m_base_ptid;
+      ptid = m_base_ptid;
+    }
+  beneath ()->resume (ptid, step, siggnal);
 }
 
 ptid_t
@@ -334,8 +340,12 @@ ravenscar_thread_target::wait (ptid_t ptid,
 {
   ptid_t event_ptid;
 
-  inferior_ptid = m_base_ptid;
-  event_ptid = beneath ()->wait (m_base_ptid, status, 0);
+  if (ptid != minus_one_ptid)
+    {
+      inferior_ptid = m_base_ptid;
+      ptid = m_base_ptid;
+    }
+  event_ptid = beneath ()->wait (ptid, status, 0);
   /* Find any new threads that might have been created, and update
      inferior_ptid to the active thread.
@@ -350,6 +360,8 @@ ravenscar_thread_target::wait (ptid_t ptid,
       this->update_thread_list ();
       this->update_inferior_ptid ();
     }
+  else
+    inferior_ptid = m_base_ptid;
 
   return inferior_ptid;
 }