From: Pedro Alves
Date: Thu, 13 Nov 2014 14:58:48 +0000
To: Don Breazeal, gdb-patches@sourceware.org
Subject: Re: [PATCH 00/16 v3] Linux extended-remote fork and exec events
Message-ID: <5464C728.8090204@redhat.com>
In-Reply-To: <5464B6A0.9020602@redhat.com>
X-Patchwork-Id: 3709

On 11/13/2014 01:48 PM, Pedro Alves wrote:
> On 11/13/2014 01:38 PM, Pedro Alves wrote:
>> On 10/31/2014 11:28 PM, Don Breazeal wrote:
>>>
>>> - gdb.threads/thread-execl.exp gives a couple of failures related to
>>>   scheduler locking.  As with the previous item, after spending some
>>>   time on this I concluded that pursuing it further now would be
>>>   feature-creep, and that this should be tracked with a bug report.
>>
>> Do you have more details on this?
>>
>> Looking at the exec race you mentioned, I thought that thread-execl.exp should
>> expose it, given that the point of the test is exactly a thread other than
>> the main thread execing.  But then I stumbled on the fact that running it with
>> your series on top of currently mainline often crashes gdb:
>>
>> $ make check RUNTESTFLAGS="--target_board=native-extended-gdbserver thread-execl.exp"
>> ...
>> Running /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.threads/thread-execl.exp ...
>> ERROR: Process no longer exists
>>
>>                 === gdb Summary ===
>>
>> # of expected passes            9
>> # of unresolved testcases       1
>>
>> Odd that this doesn't trigger with native testing.
>
> Hmm, here's what valgrind shows (against gdbserver):
> $ valgrind ./gdb -data-directory=data-directory ./testsuite/gdb.threads/thread-execl -ex "tar extended-remote :9999" -ex "b thread_execler" -ex "c" -ex "set scheduler-locking on"
> ...
> Breakpoint 1, thread_execler (arg=0x0) at /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.threads/thread-execl.c:29
> 29        if (execl (image, image, NULL) == -1)
> (gdb) n
> Thread 32509.32509 is executing new program: /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/thread-execl
> [New Thread 32509.32532]
> ==32510== Invalid read of size 4
> ==32510==    at 0x5AA7D8: delete_breakpoint (breakpoint.c:13989)
> ==32510==    by 0x6285D3: delete_thread_breakpoint (thread.c:100)
> ==32510==    by 0x628603: delete_step_resume_breakpoint (thread.c:109)
> ==32510==    by 0x61622B: delete_thread_infrun_breakpoints (infrun.c:2928)
> ==32510==    by 0x6162EF: for_each_just_stopped_thread (infrun.c:2958)
> ==32510==    by 0x616311: delete_just_stopped_threads_infrun_breakpoints (infrun.c:2969)
> ==32510==    by 0x616C96: fetch_inferior_event (infrun.c:3267)
> ==32510==    by 0x63A2DE: inferior_event_handler (inf-loop.c:57)
> ==32510==    by 0x4E0E56: remote_async_serial_handler (remote.c:11877)
> ==32510==    by 0x4AF620: run_async_handler_and_reschedule (ser-base.c:137)
> ==32510==    by 0x4AF6F0: fd_event (ser-base.c:182)
> ==32510==    by 0x63806D: handle_file_event (event-loop.c:762)
> ==32510==  Address 0xcf333e0 is 16 bytes inside a block of size 200 free'd
> ==32510==    at 0x4A07577: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==32510==    by 0x77CB74: xfree (common-utils.c:98)
> ==32510==    by 0x5AA954: delete_breakpoint (breakpoint.c:14056)
> ==32510==    by 0x5988BD: update_breakpoints_after_exec (breakpoint.c:3765)
> ==32510==    by 0x61360F: follow_exec (infrun.c:1091)
> ==32510==    by 0x6186FA: handle_inferior_event (infrun.c:4061)
> ==32510==    by 0x616C55: fetch_inferior_event (infrun.c:3261)
> ==32510==    by 0x63A2DE: inferior_event_handler (inf-loop.c:57)
> ==32510==    by 0x4E0E56: remote_async_serial_handler (remote.c:11877)
> ==32510==    by 0x4AF620: run_async_handler_and_reschedule (ser-base.c:137)
> ==32510==    by 0x4AF6F0: fd_event (ser-base.c:182)
> ==32510==    by 0x63806D: handle_file_event (event-loop.c:762)
> ==32510==
> [Switching to Thread 32509.32532]
>
> Breakpoint 1, thread_execler (arg=0x0) at /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.threads/thread-execl.c:29
> 29        if (execl (image, image, NULL) == -1)
> (gdb)

Ah.  The breakpoint in question is the step-resume breakpoint of
the non-main thread, the one we "nexted".  And the issue is that with
native debugging, the target deletes all threads from GDB's list _before_
the exec event is reported:

 ...
 infrun: stop_pc = 0x400640
 infrun: stepped into subroutine
 infrun: inserting step-resume breakpoint at 0x40076f      <<<<<<
 infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0x7ffff7fc4700 (LWP 555)] at 0x400640
 infrun: prepare_to_wait
 infrun: target_wait (-1, status) =
 infrun:   -1 [process -1],
 infrun:   status->kind = ignore
 infrun: TARGET_WAITKIND_IGNORE
 infrun: prepare_to_wait
 infrun: target_wait (-1, status) =
 infrun:   -1 [process -1],
 infrun:   status->kind = ignore
 infrun: TARGET_WAITKIND_IGNORE
 infrun: prepare_to_wait
 [Thread 0x7ffff7fc4700 (LWP 555) exited]

 Breakpoint 3, delete_thread (ptid=...) at /home/pedro/gdb/mygit/src/gdb/thread.c:371
 371       delete_thread_1 (ptid, 0 /* not silent */);
 (top-gdb)

But when remote debugging, there are no thread exit events, so GDB never
deletes the thread that was "nexted".  And that thread still has a
dangling reference to the step-resume breakpoint.
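
To make the dangling-reference pattern concrete, here is a toy model in C.
It is only a sketch: every name in it (toy_thread, toy_breakpoint,
toy_follow_exec, and so on) is made up for illustration and none of it is
GDB code.  It assumes a simplified world where each thread owns at most one
step-resume breakpoint, and it shows one conceivable way to keep the later
cleanup pass from touching freed memory: clear the owning thread's pointer
at the moment the breakpoint is freed on exec.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for illustration only; these are not GDB's real types.  */
struct toy_breakpoint
{
  int number;
};

struct toy_thread
{
  int lwp;
  struct toy_breakpoint *step_resume_breakpoint;  /* NULL when none.  */
};

/* Count of allocated toy breakpoints, so leaks and double frees show up.  */
static int toy_live_breakpoints;

static struct toy_breakpoint *
toy_make_breakpoint (int number)
{
  struct toy_breakpoint *b = malloc (sizeof *b);

  if (b == NULL)
    abort ();
  b->number = number;
  toy_live_breakpoints++;
  return b;
}

static void
toy_delete_breakpoint (struct toy_breakpoint *b)
{
  assert (toy_live_breakpoints > 0);
  free (b);
  toy_live_breakpoints--;
}

/* Exec handling in the toy model: a step-resume breakpoint cannot survive
   the exec, so free it AND clear the owning thread's pointer.  Skipping
   the second step is what would leave a dangling reference behind.  */
static void
toy_follow_exec (struct toy_thread *threads, int count)
{
  for (int i = 0; i < count; i++)
    if (threads[i].step_resume_breakpoint != NULL)
      {
        toy_delete_breakpoint (threads[i].step_resume_breakpoint);
        threads[i].step_resume_breakpoint = NULL;
      }
}

/* Cleanup pass that runs after the stop is handled.  It is safe only
   because stale pointers were cleared above.  Returns the number of
   breakpoints it deleted.  */
static int
toy_delete_infrun_breakpoints (struct toy_thread *threads, int count)
{
  int deleted = 0;

  for (int i = 0; i < count; i++)
    if (threads[i].step_resume_breakpoint != NULL)
      {
        toy_delete_breakpoint (threads[i].step_resume_breakpoint);
        threads[i].step_resume_breakpoint = NULL;
        deleted++;
      }
  return deleted;
}

/* Scenario modelled on the trace: thread 2 is "nexted" (so it gets a
   step-resume breakpoint), then the exec event arrives while that thread
   is still in the list.  Returns the number of live breakpoints left, or
   -1 if the cleanup pass found something it should not have.  */
int
toy_exec_scenario (void)
{
  struct toy_thread threads[2] = { { 1, NULL }, { 2, NULL } };

  threads[1].step_resume_breakpoint = toy_make_breakpoint (4);
  toy_follow_exec (threads, 2);
  if (toy_delete_infrun_breakpoints (threads, 2) != 0)
    return -1;
  return toy_live_breakpoints;
}
```

Calling toy_exec_scenario () walks through the whole sequence.  If the
"= NULL" assignment in toy_follow_exec were dropped, the cleanup pass would
read and free the same block a second time, which is the shape of the
invalid read valgrind reports above.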

With this hack:

---
 gdb/linux-nat.c       | 2 ++
 gdb/linux-thread-db.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index e81a560..df1d6e7 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -892,7 +892,9 @@ exit_lwp (struct lwp_info *lp)
       if (print_thread_events)
         printf_unfiltered (_("[%s exited]\n"), target_pid_to_str (lp->ptid));
 
+#if 0
       delete_thread (lp->ptid);
+#endif
     }
 
   delete_lwp (lp->ptid);
diff --git a/gdb/linux-thread-db.c b/gdb/linux-thread-db.c
index c49b567..59f8ec1 100644
--- a/gdb/linux-thread-db.c
+++ b/gdb/linux-thread-db.c
@@ -597,6 +597,8 @@ enable_thread_event_reporting (void)
   td_err_e err;
   struct thread_db_info *info;
 
+  return;
+
   info = get_thread_db_info (ptid_get_pid (inferior_ptid));
 
   /* We cannot use the thread event reporting facility if these