From patchwork Tue Jul 7 18:04:38 2015
X-Patchwork-Submitter: Pedro Alves
X-Patchwork-Id: 7574
Message-ID: <559C14B6.5020800@redhat.com>
Date: Tue, 07 Jul 2015 19:04:38 +0100
From: Pedro Alves
To: Joel Brobecker, Simon Marchi
CC: gdb-patches
Subject: Re: Should this be on the blocker list for the 7.10 release?
References: <559AE482.1010109@ericsson.com> <20150707132459.GA16734@adacore.com> <559BFBBD.4000303@redhat.com>
In-Reply-To: <559BFBBD.4000303@redhat.com>

On 07/07/2015 05:18 PM, Pedro Alves wrote:
> On 07/07/2015 02:24 PM, Joel Brobecker wrote:
>
>> Not sure. I think Pedro would be in a better position to answer.
>> For now, I've put this issue as a "maybe" for 7.10; so we will not
>> release until this is fixed, or we explicitly decide it's OK for 7.10.
>>
>> Pedro?
>
> Let me take a look and understand this better.

OK, the issue is that the new clone thread is found while inside the
linux_stop_and_wait_all_lwps call in this new bit of code in
linux-thread-db.c:

      linux_stop_and_wait_all_lwps ();

      ALL_LWPS (lp)
        if (ptid_get_pid (lp->ptid) == pid)
          thread_from_lwp (lp->ptid);

      linux_unstop_all_lwps ();

We reach linux_handle_extended_wait with the "stopping" parameter set
to 1, and because of that we don't mark the new LWP as resumed.  As a
consequence, the subsequent resume_stopped_resumed_lwps (called first
from that linux_unstop_all_lwps) never resumes the new LWP...

There's lots of cruft in linux_handle_extended_wait that no longer
makes sense.

The patch below seems to fix your github test for me, and causes no
testsuite regressions.  Did you try converting your test case to a
proper GDB test?  That'd be much appreciated.
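For readers who want to see the failure mode in isolation, here is a minimal standalone sketch of the flag logic described above.  It is not GDB code: struct toy_lwp and every toy_* function are hypothetical stand-ins, and only the three flags that matter for this bug are modelled.  The predicate mirrors the condition that resume_stopped_resumed_lwps tests before resuming an LWP.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few lwp_info flags involved in the bug.
   The real structure in linux-nat has many more fields.  */
struct toy_lwp
{
  bool stopped;         /* the LWP is currently stopped */
  bool resumed;         /* the core considers the LWP resumed */
  bool status_pending;  /* a wait status is still waiting to be reported */
};

/* Old behaviour: whether the new clone is marked resumed depends on the
   STOPPING argument passed to the extended-wait handler.  */
static struct toy_lwp
toy_new_clone_old (bool stopping)
{
  struct toy_lwp lp = { .stopped = true,
                        .resumed = !stopping,
                        .status_pending = false };
  return lp;
}

/* Patched behaviour: the new clone is always marked resumed.  */
static struct toy_lwp
toy_new_clone_new (void)
{
  struct toy_lwp lp = { .stopped = true,
                        .resumed = true,
                        .status_pending = false };
  return lp;
}

/* The condition under which a stopped LWP gets resumed again: it must be
   stopped, marked resumed, and have no pending status.  */
static bool
toy_would_resume (const struct toy_lwp *lp)
{
  return lp->stopped && lp->resumed && !lp->status_pending;
}

/* Usage example: a clone discovered while STOPPING is set is left
   behind by the old logic, but picked up by the patched logic.  */
static void
toy_self_check (void)
{
  struct toy_lwp old_lp = toy_new_clone_old (true);
  struct toy_lwp new_lp = toy_new_clone_new ();

  assert (!toy_would_resume (&old_lp));
  assert (toy_would_resume (&new_lp));
}
```

Calling toy_self_check () from a test program exercises both cases.  The sketch leaves out the pending-status path: in the patch, a clone that stopped with a signal other than SIGSTOP keeps that status pending and so is still not resumed until the status has been reported.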
---
From a4f205a18dffaff3344b31e9b8009b1c0de8ba80 Mon Sep 17 00:00:00 2001
From: Pedro Alves
Date: Tue, 7 Jul 2015 17:42:52 +0100
Subject: [PATCH] fix

---
 gdb/linux-nat.c | 91 +++++++++++++++++++++++++--------------------------------
 1 file changed, 40 insertions(+), 51 deletions(-)

diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index be429f8..ea38ebb 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -2086,43 +2086,7 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
               new_lp = add_lwp (ptid_build (ptid_get_pid (lp->ptid), new_pid, 0));
               new_lp->cloned = 1;
               new_lp->stopped = 1;
-
-              if (WSTOPSIG (status) != SIGSTOP)
-                {
-                  /* This can happen if someone starts sending signals to
-                     the new thread before it gets a chance to run, which
-                     have a lower number than SIGSTOP (e.g. SIGUSR1).
-                     This is an unlikely case, and harder to handle for
-                     fork / vfork than for clone, so we do not try - but
-                     we handle it for clone events here.  We'll send
-                     the other signal on to the thread below.  */
-
-                  new_lp->signalled = 1;
-                }
-              else
-                {
-                  struct thread_info *tp;
-
-                  /* When we stop for an event in some other thread, and
-                     pull the thread list just as this thread has cloned,
-                     we'll have seen the new thread in the thread_db list
-                     before handling the CLONE event (glibc's
-                     pthread_create adds the new thread to the thread list
-                     before clone'ing, and has the kernel fill in the
-                     thread's tid on the clone call with
-                     CLONE_PARENT_SETTID).  If that happened, and the core
-                     had requested the new thread to stop, we'll have
-                     killed it with SIGSTOP.  But since SIGSTOP is not an
-                     RT signal, it can only be queued once.  We need to be
-                     careful to not resume the LWP if we wanted it to
-                     stop.  In that case, we'll leave the SIGSTOP pending.
-                     It will later be reported as GDB_SIGNAL_0.  */
-                  tp = find_thread_ptid (new_lp->ptid);
-                  if (tp != NULL && tp->stop_requested)
-                    new_lp->last_resume_kind = resume_stop;
-                  else
-                    status = 0;
-                }
+              new_lp->resumed = 1;
 
               /* If the thread_db layer is active, let it record the user
                  level thread id and status, and add the thread to GDB's
@@ -2136,19 +2100,23 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
                 }
 
               /* Even if we're stopping the thread for some reason
-                 internal to this module, from the user/frontend's
-                 perspective, this new thread is running.  */
+                 internal to this module, from the perspective of infrun
+                 and the user/frontend, this new thread is running until
+                 it next reports a stop.  */
               set_running (new_lp->ptid, 1);
-              if (!stopping)
-                {
-                  set_executing (new_lp->ptid, 1);
-                  /* thread_db_attach_lwp -> lin_lwp_attach_lwp forced
-                     resume_stop.  */
-                  new_lp->last_resume_kind = resume_continue;
-                }
+              set_executing (new_lp->ptid, 1);
 
-              if (status != 0)
+              if (WSTOPSIG (status) != SIGSTOP)
                 {
+                  /* This can happen if someone starts sending signals to
+                     the new thread before it gets a chance to run, which
+                     have a lower number than SIGSTOP (e.g. SIGUSR1).
+                     This is an unlikely case, and harder to handle for
+                     fork / vfork than for clone, so we do not try - but
+                     we handle it for clone events here.  */
+
+                  new_lp->signalled = 1;
+
                   /* We created NEW_LP so it cannot yet contain STATUS.  */
                   gdb_assert (new_lp->status == 0);
 
@@ -2162,7 +2130,6 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
                   new_lp->status = status;
                 }
 
-              new_lp->resumed = !stopping;
               return 1;
             }
 
@@ -3673,9 +3640,31 @@ resume_stopped_resumed_lwps (struct lwp_info *lp, void *data)
 {
   ptid_t *wait_ptid_p = data;
 
-  if (lp->stopped
-      && lp->resumed
-      && !lwp_status_pending_p (lp))
+  if (!lp->stopped)
+    {
+      if (debug_linux_nat)
+        fprintf_unfiltered (gdb_stdlog,
+                            "RSRL: NOT resuming stopped-resumed LWP %s, "
+                            "not stopped\n",
+                            target_pid_to_str (lp->ptid));
+    }
+  else if (!lp->resumed)
+    {
+      if (debug_linux_nat)
+        fprintf_unfiltered (gdb_stdlog,
+                            "RSRL: NOT resuming stopped-resumed LWP %s, "
+                            "not resumed\n",
+                            target_pid_to_str (lp->ptid));
+    }
+  else if (lwp_status_pending_p (lp))
+    {
+      if (debug_linux_nat)
+        fprintf_unfiltered (gdb_stdlog,
+                            "RSRL: NOT resuming stopped-resumed LWP %s, "
+                            "has pending status\n",
+                            target_pid_to_str (lp->ptid));
+    }
+  else
     {
       struct regcache *regcache = get_thread_regcache (lp->ptid);
       struct gdbarch *gdbarch = get_regcache_arch (regcache);