From patchwork Wed Feb 14 15:55:49 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pedro Alves X-Patchwork-Id: 25935 Received: (qmail 80735 invoked by alias); 14 Feb 2018 15:55:54 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 80709 invoked by uid 89); 14 Feb 2018 15:55:53 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-25.3 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_LOW, T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy= X-HELO: mx1.redhat.com Received: from mx3-rdu2.redhat.com (HELO mx1.redhat.com) (66.187.233.73) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 14 Feb 2018 15:55:51 +0000 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 75E1640201A0; Wed, 14 Feb 2018 15:55:50 +0000 (UTC) Received: from [127.0.0.1] (ovpn04.gateway.prod.ext.ams2.redhat.com [10.39.146.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id D917810102FE; Wed, 14 Feb 2018 15:55:49 +0000 (UTC) Subject: [PATCH] Fix GDB crash after Quit thrown from unwinder sniffer (Re: [RFA] Remove cleanup from frame_prepare_for_sniffer) To: Tom Tromey , gdb-patches@sourceware.org References: <20171008225243.21921-1-tom@tromey.com> From: Pedro Alves Message-ID: Date: Wed, 14 Feb 2018 15:55:49 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2 MIME-Version: 1.0 In-Reply-To: <20171008225243.21921-1-tom@tromey.com> Hi Tom, On 10/08/2017 11:52 PM, Tom Tromey wrote: > Currently frame_prepare_for_sniffer returns a cleanup. This patch > changes it to return void, and exposes frame_cleanup_after_sniffer to > the caller. > > Normally I would write an RAII class for this sort of thing; but > because there was just a single caller of frame_prepare_for_sniffer, > and because this caller is already using try/catch, I thought it > seemed ok to require explicit calls in this instance. > > Regression tested by the buildbot. > > gdb/ChangeLog > 2017-10-08 Tom Tromey > > * frame-unwind.c (frame_unwind_try_unwinder): Update. > * frame.h (frame_cleanup_after_sniffer): Declare. > (frame_prepare_for_sniffer): Return void. > * frame.c (frame_cleanup_after_sniffer): No longer static. Change > type of argument. > (frame_prepare_for_sniffer): Return void. I think this caused a regression. See fix below. WDYT? From 8db37aa3966407c9190820a75e23a75688af9f95 Mon Sep 17 00:00:00 2001 From: Pedro Alves Date: Wed, 14 Feb 2018 15:11:58 +0000 Subject: [PATCH] Fix GDB crash after Quit thrown from unwinder sniffer I ran into a GDB crash in gdb.base/bp-cmds-continue-ctrl-c.exp in my multi-target branch, which turns out exposed a bug that exists in master too. That testcase has a breakpoint with a "continue" command associated. Then the breakpoint is constantly being hit. At the same time, the testcase is continualy interrupting the program with Ctrl-C, and re-resuming it, in a loop. Running that testcase manually under Valgrind, after a few sequences of 'Ctrl-C' + 'continue', I got: Breakpoint 1, Quit (gdb) ==21270== Invalid read of size 8 ==21270== at 0x4D8185: pyuw_this_id(frame_info*, void**, frame_id*) (py-unwind.c:461) ==21270== by 0x6D426A: compute_frame_id(frame_info*) (frame.c:505) ==21270== by 0x6D43B7: get_frame_id(frame_info*) (frame.c:537) ==21270== by 0x84F3B8: scoped_restore_current_thread::scoped_restore_current_thread() (thread.c:1678) ==21270== by 0x718E3D: fetch_inferior_event(void*) (infrun.c:4076) ==21270== by 0x7067C9: inferior_event_handler(inferior_event_type, void*) (inf-loop.c:43) ==21270== by 0x45BEF9: handle_target_event(int, void*) (linux-nat.c:4419) ==21270== by 0x6C4255: handle_file_event(file_handler*, int) (event-loop.c:733) ==21270== by 0x6C47F8: gdb_wait_for_event(int) (event-loop.c:859) ==21270== by 0x6C3666: gdb_do_one_event() (event-loop.c:322) ==21270== by 0x6C3712: start_event_loop() (event-loop.c:371) ==21270== by 0x746801: captured_command_loop() (main.c:329) ==21270== Address 0x0 is not stack'd, malloc'd or (recently) free'd ==21270== ==21270== ==21270== Process terminating with default action of signal 11 (SIGSEGV): dumping core ==21270== Access not within mapped region at address 0x0 ==21270== at 0x4D8185: pyuw_this_id(frame_info*, void**, frame_id*) (py-unwind.c:461) ==21270== by 0x6D426A: compute_frame_id(frame_info*) (frame.c:505) ==21270== by 0x6D43B7: get_frame_id(frame_info*) (frame.c:537) ==21270== by 0x84F3B8: scoped_restore_current_thread::scoped_restore_current_thread() (thread.c:1678) ==21270== by 0x718E3D: fetch_inferior_event(void*) (infrun.c:4076) ==21270== by 0x7067C9: inferior_event_handler(inferior_event_type, void*) (inf-loop.c:43) ==21270== by 0x45BEF9: handle_target_event(int, void*) (linux-nat.c:4419) ==21270== by 0x6C4255: handle_file_event(file_handler*, int) (event-loop.c:733) ==21270== by 0x6C47F8: gdb_wait_for_event(int) (event-loop.c:859) ==21270== by 0x6C3666: gdb_do_one_event() (event-loop.c:322) ==21270== by 0x6C3712: start_event_loop() (event-loop.c:371) ==21270== by 0x746801: captured_command_loop() (main.c:329) ==21270== If you believe this happened as a result of a stack ==21270== overflow in your program's main thread (unlikely but ==21270== possible), you can try to increase the size of the ==21270== main thread stack using the --main-stacksize= flag. ==21270== The main thread stack size used in this run was 8388608. ==21270== Above, when we get to compute_frame_id, fi->unwind is non-NULL, meaning, we found an unwinder, in this case the Python unwinder, but somehow, fi->prologue_cache is left NULL. pyuw_this_id then crashes because it assumes fi->prologue_cache is non-NULL: static void pyuw_this_id (struct frame_info *this_frame, void **cache_ptr, struct frame_id *this_id) { *this_id = ((cached_frame_info *) *cache_ptr)->frame_id; ^^^^^^^^^^ '*cache_ptr' here is 'fi->prologue_cache'. There's a quit() call in pyuw_sniffer that I believe is the one that sometimes triggers the crash above. The crash can be reproduced easily with this hack to force a quit out of the python unwinder: --- a/gdb/python/py-unwind.c +++ b/gdb/python/py-unwind.c @@ -497,6 +497,8 @@ pyuw_sniffer (const struct frame_unwind *self, struct frame_info *this_frame, struct gdbarch *gdbarch = (struct gdbarch *) (self->unwind_data); cached_frame_info *cached_frame; + quit (); + gdbpy_enter enter_py (gdbarch, current_language); TRACE_PY_UNWIND (3, "%s (SP=%s, PC=%s)\n", __FUNCTION__, After that quit is thrown, any subsequent operation that involves unwinding results in GDB crashing with SIGSEGV like above. The problem is that this commit: commit 30a9c02feff56bd58a276c2a7262f364baa558ac CommitDate: Sun Oct 8 23:16:42 2017 -0600 Subject: Remove cleanup from frame_prepare_for_sniffer missed that we need to call frame_cleanup_after_sniffer before rethrowing the exception too. Without the fix, the "bt" added to gdb.base/bp-cmds-continue-ctrl-c.exp in this commit makes GDB crash: Running src/gdb/testsuite/gdb.base/bp-cmds-continue-ctrl-c.exp ... ERROR: Process no longer exists gdb/ChangeLog: yyyy-mm-dd Pedro Alves * frame-unwind.c (frame_unwind_try_unwinder): Always call frame_cleanup_after_sniffer on exception. gdb/testsuite/ChangeLog: yyyy-mm-dd Pedro Alves * gdb.base/bp-cmds-continue-ctrl-c.exp (do_test): Test "bt" after getting a "Quit". --- gdb/frame-unwind.c | 3 ++- gdb/testsuite/gdb.base/bp-cmds-continue-ctrl-c.exp | 13 +++++++++++++ 2 files changed, 15 insertions(+), 1 deletion(-) diff --git a/gdb/frame-unwind.c b/gdb/frame-unwind.c index 66a28ae1c9..e6e63539ad 100644 --- a/gdb/frame-unwind.c +++ b/gdb/frame-unwind.c @@ -110,13 +110,14 @@ frame_unwind_try_unwinder (struct frame_info *this_frame, void **this_cache, /* Catch all exceptions, caused by either interrupt or error. Reset *THIS_CACHE. */ *this_cache = NULL; + frame_cleanup_after_sniffer (this_frame); + if (ex.error == NOT_AVAILABLE_ERROR) { /* This usually means that not even the PC is available, thus most unwinders aren't able to determine if they're the best fit. Keep trying. Fallback prologue unwinders should always accept the frame. */ - frame_cleanup_after_sniffer (this_frame); return 0; } throw_exception (ex); diff --git a/gdb/testsuite/gdb.base/bp-cmds-continue-ctrl-c.exp b/gdb/testsuite/gdb.base/bp-cmds-continue-ctrl-c.exp index 8c2fa77d71..43600fcf81 100644 --- a/gdb/testsuite/gdb.base/bp-cmds-continue-ctrl-c.exp +++ b/gdb/testsuite/gdb.base/bp-cmds-continue-ctrl-c.exp @@ -89,6 +89,19 @@ proc do_test {} { } -re "Quit\r\n$gdb_prompt $" { send_log "$internal_pass (Quit)\n" + + # Check that if we managed to quit somewhere deep in + # the unwinders, we can still unwind again. + set ok 0 + gdb_test_multiple "bt" "$internal_pass (bt)" { + -re "#0.*$gdb_prompt $" { + send_log "$internal_pass (bt)\n" + set ok 1 + } + } + if {!$ok} { + return + } } -re "Quit\r\n\r\nCommand aborted.\r\n$gdb_prompt $" { send_log "$internal_pass (Command aborted)\n"