From patchwork Fri Jan 29 20:43:43 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Antoine Tremblay
X-Patchwork-Id: 10671
Received: (qmail 130834 invoked by alias); 29 Jan 2016 20:43:48 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: gdb-patches-owner@sourceware.org
Delivered-To: mailing list gdb-patches@sourceware.org
Received: (qmail 130821 invoked by uid 89); 29 Jan 2016 20:43:47 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00,
 SPF_PASS autolearn=ham version=3.3.2 spammy=Checking, PCs, cooked,
 filtering
X-HELO: usplmg21.ericsson.net
Received: from usplmg21.ericsson.net (HELO usplmg21.ericsson.net)
 (198.24.6.65) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with
 (AES256-SHA encrypted) ESMTPS; Fri, 29 Jan 2016 20:43:46 +0000
Received: from EUSAAHC008.ericsson.se (Unknown_Domain [147.117.188.96]) by
 usplmg21.ericsson.net (Symantec Mail Security) with SMTP id
 8B.E3.32102.CEECBA65; Fri, 29 Jan 2016 21:43:25 +0100 (CET)
Received: from [142.133.110.95] (147.117.188.8) by
 smtp-am.internal.ericsson.com (147.117.188.98) with Microsoft SMTP Server
 id 14.3.248.2; Fri, 29 Jan 2016 15:43:43 -0500
Subject: Re: Move threads out of jumppad without single step
To: Yao Qi, Pedro Alves
References: <86zixzvhj1.fsf@gmail.com> <565C6043.4040106@redhat.com>
 <864mg2v1s5.fsf@gmail.com> <56A8F4AE.5040305@ericsson.com>
From: Antoine Tremblay
Message-ID: <56ABCEFF.4090506@ericsson.com>
Date: Fri, 29 Jan 2016 15:43:43 -0500
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.5.1
MIME-Version: 1.0
In-Reply-To: <56A8F4AE.5040305@ericsson.com>
X-IsSubscribed: yes

On 01/27/2016 11:47 AM, Antoine Tremblay wrote:
>
>
> On 12/01/2015 06:36 AM, Yao Qi wrote:
>> Pedro
>> Alves writes:
>>
>>> You may be able to handle this by retrieving state from the saved
>>> registers buffer in the jump pad, similar to how gdb_collect cooks
>>> up a regcache, though unlike gdb_collect, you'll have to handle the
>>> case of the thread stopping midway through that register saving too
>>> (some registers already saved, some not yet).
>>
>> Compute the next PCs on the basis of cooked up regcache from stack is
>> what I intended, but I didn't consider the case thread is stopped in the
>> middle way of register saving.
>>
>>>
>>> So I assume it's much simpler to just run to [1] as well, and then issue
>>> a normal software single-step when you get there.
>>
>> Then, looks we have to use software single step.
>>
>
> Hi,
> I'm testing using software single stepping to move threads out of the
> jump pad and I've run into a problem which I'm really unsure how to
> fix, and since the run control stuff is quite hard to follow, help is
> appreciated.
>
> Some context :
>
> I'm using the following program to test on ARM :
>
> #include "trace-common.h"
> #include
>
> static void
> begin (void)
> {}
> static void
> end (void)
> {}
>
> int
> main ()
> {
>   begin ();
>   FAST_TRACEPOINT_LABEL(set_point);
>   FAST_TRACEPOINT_LABEL(other_point);
>   end ();
>   return 0;
> }
>
> To compile :
> gcc -marm -Wl,--no-as-needed move-out.c libinproctrace.so -g
> -Wl,-rpath,$ORIGIN -lm -o move-out
>
> My gdb version is a soon to be posted fast tracepoint branch for ARM:
> https://github.com/hexa00/binutils-gdb/commit/14912518f37abc2eb4594b42ca161c912ef6b6cd
>
>
> Running gdbserver under gdb as such :
>
> gdb gdbserver --debug -ex "break linux_stabilize_threads" -ex "run
> --once :7777 ./move-out"
>
> and gdb with the following commands:
> set pagination off
> set non-stop on
> set remotetimeout unlimited
> tar rem :7777
> break main
> break end
> c
> #break on last jump pad instruction
> break *gdb_agent_gdb_jump_pad_buffer + 43*4
> ftrace set_point
> tstart
> c
> #delete the breakpoint to fool gdbserver a bit.
> delete 3
> ftrace other_point
>
> In the logs I can see :
>
> stop_all_lwps done, setting stopping_threads back to !stopping
> <<<< exiting stop_all_lwps
> Checking whether LWP 5148 needs to move out of the jump pad.
> fast_tracepoint_collecting
> in jump pad of tpoint (4, 85d8); jump_pad(33000, 330b0); adj_insn(330ac,
> 9393939393939393)
> fast_tracepoint_collecting, returning need-single-step
> (330ac-9393939393939393)
> Checking whether LWP 5148 needs to move out of the jump pad...it does
> LWP 5148 needs stabilizing (in jump pad)
> Resuming lwp 5148 (continue, signal 0, stop not expected)
> lwp 5148 wants to get out of fast tracepoint jump pad single-stepping
> stop pc is 0x330ac
> pc is 0x330ac
> Writing f001f0e7 to 0x000085dc in process 5148
> stop pc is 0x330ac
> continue from pc 0x330ac
> Checking whether LWP 5650 needs to move out of the jump pad.
> sigchld_handler
> fast_tracepoint_collecting
> fast_tracepoint_collecting: not collecting (and nobody is).
> Checking whether LWP 5650 needs to move out of the jump pad...no
> >>>> entering linux_wait_1
> linux_wait_1: []
> my_waitpid (-1, 0x40000001)
> my_waitpid (-1, 0x40000001): status(57f), 5148
> LWFE: waitpid(-1, ...) returned 5148, ERRNO-OK
> LLW: waitpid 5148 received Trace/breakpoint trap (stopped)
> stop pc is 0x85dc
> pc is 0x85dc
> CSBB: LWP 5148.5148 stopped by software breakpoint
> my_waitpid (-1, 0x40000001)
> my_waitpid (-1, 0x40000001): status(ffffffff), 0
> LWFE: waitpid(-1, ...) returned 0, ERRNO-OK
> leader_pid=5148, leader_lp!=NULL=1, num_lwps=2, zombie=0
> LLW: exit (no unwaited-for LWP)
> linux_wait_1 ret = null_ptid, TARGET_WAITKIND_NO_RESUMED
> <<<< exiting linux_wait_1
> ../../../gdb/gdbserver/linux-low.c:1922: A problem internal to GDBserver
> has been detected.
> unsuspend LWP 5148, suspended=-1
>
>
> The main problem seems to be that as we enter linux_wait_1 in
> stabilize_threads, gdbserver gets the stopped event of the single step
> breakpoint :
>
> LLW: waitpid 5148 received Trace/breakpoint trap (stopped)
> stop pc is 0x85dc
> pc is 0x85dc
> CSBB: LWP 5148.5148 stopped by software breakpoint
>
> but this event is filtered out by: linux-low.c:2683
>
>   /* ... and find an LWP with a status to report to the core, if
>      any.  */
>   event_thread = (struct thread_info *)
>     find_inferior (&all_threads, status_pending_p_callback, &filter_ptid);
>
> Here status_pending_p_callback finds that lwp_resumed is false and
> returns 0, thus filtering out the event.
>
> Thus we go back to linux_stabilize_threads, do nothing, and end the loop
> since the lwp is now stopped.
>
> And then we try to unsuspend a thread that was not suspended and hit the
> assert.
>
> Ideas on how this should be fixed ?
>
> Thanks,
> Antoine
>
>

I managed to fix it like so :

   if (lp->status_pending_p

I've tested this in all-stop and non-stop modes and it works properly.

Basically, if stabilize_threads is not called in the context of
linux_resume and gdbserver needs to report an event, it won't, since
last_resume_kind can be resume_stop.

In the current case gdbserver is in cmd_qtdp and the last command was
continue (vCont;c) in all-stop mode, so last_resume_kind is resume_stop.

So when going into linux_wait, the event is filtered out by :

  event_thread = (struct thread_info *)
    find_inferior (&all_threads, status_pending_p_callback, &filter_ptid);

since status_pending_p_callback returns false.

Note that this fix may not be the best one... but it may be some
progress...

Any ideas are welcome, otherwise I will add it to my patch set and there
can be more discussion at review.

Thanks,
Antoine

--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -1693,7 +1693,10 @@ status_pending_p_callback (struct inferior_list_entry *entry, void *arg)
   if (!ptid_match (ptid_of (thread), ptid))
     return 0;
 
-  if (!lwp_resumed (lp))
+  /* If we are stabilizing threads, threads have been stopped except the
+     ones that are moving out of the jump pad.  The events of those threads
+     need to be reported whatever the last_resume_kind is.  */
+  if (!lwp_resumed (lp) && !stabilizing_threads)
     return 0;
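
For readers who want to see the effect of the one-line change in
isolation, here is a minimal standalone sketch of the filter before and
after the patch.  It is only a model, not gdbserver code: struct
model_lwp, model_lwp_resumed, status_pending_before,
status_pending_after and model_self_check are hypothetical names, and
model_lwp_resumed assumes that an LWP counts as resumed unless its
last_resume_kind is resume_stop (the real lwp_resumed may look at more
state than that).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for gdbserver's types; only the
   fields the filter looks at are modelled.  */
enum resume_kind { resume_continue, resume_step, resume_stop };

struct model_lwp
{
  enum resume_kind last_resume_kind;
  bool status_pending_p;
};

/* Models the flag that is set while threads are being moved out of
   the jump pad.  */
bool stabilizing_threads = false;

/* Assumption: an LWP counts as resumed unless its last resume request
   was resume_stop.  */
static bool
model_lwp_resumed (const struct model_lwp *lp)
{
  return lp->last_resume_kind != resume_stop;
}

/* The filter as it behaves before the patch: a non-resumed LWP never
   has its pending status reported.  */
bool
status_pending_before (const struct model_lwp *lp)
{
  if (!model_lwp_resumed (lp))
    return false;
  return lp->status_pending_p;
}

/* The filter with the patch applied: while stabilizing, the pending
   status is reported whatever the last_resume_kind is.  */
bool
status_pending_after (const struct model_lwp *lp)
{
  if (!model_lwp_resumed (lp) && !stabilizing_threads)
    return false;
  return lp->status_pending_p;
}

/* Usage example: the situation from the logs, an LWP whose
   last_resume_kind is resume_stop and which has a pending stop event.  */
void
model_self_check (void)
{
  struct model_lwp lp = { resume_stop, true };

  stabilizing_threads = true;
  assert (!status_pending_before (&lp));  /* event is filtered out */
  assert (status_pending_after (&lp));    /* event is reported */

  stabilizing_threads = false;
  assert (!status_pending_after (&lp));   /* unchanged outside stabilizing */
}
```

In this model the patched filter only differs from the old one while
stabilizing_threads is set, which matches the intent stated in the
patch comment; whether that is the right place for the check is still
open for review.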