From patchwork Sun Feb 4 00:06:47 2018
X-Patchwork-Submitter: Bartosz Nitka
X-Patchwork-Id: 25787
From: Bartosz Nitka
To: gdb-patches@sourceware.org
Cc: Bartosz Nitka
Subject: [PATCH v3 1/1] Don't rewind PC for GHC generated frames
Date: Sun, 4 Feb 2018 00:06:47 +0000
Message-Id: <20180204000647.19188-2-niteria@gmail.com>
In-Reply-To: <20180204000647.19188-1-niteria@gmail.com>
References: <20180204000647.19188-1-niteria@gmail.com>

GHC, the Haskell compiler, generates code that violates one of GDB's
assumptions. GDB assumes that the address in a frame was generated by the
call instruction and that it is the address right after the call
instruction (I'm paraphrasing the comment in get_frame_address_in_block).
So to get an address in the same block as the call instruction, one has to
subtract 1. This is doubly beneficial because some functions are
"noreturn" and don't have further instructions after the call, so without
the subtraction GDB would be looking at gibberish.

GHC generates completely different code. It uses jumps instead of calls and
manages the stack itself. Furthermore, every piece of code is preceded by
some metadata called the Info Table. If we subtract from the program
counter, it ends up pointing into the Info Table, which is undesirable.

GHC has a workaround for this [1] that works most of the time: it basically
lies in the DWARF data and extends the function one byte backwards. That
helps unwinding succeed most of the time, but the address is also used for
looking up symbols, and they can't be resolved.

This change disables program counter rewinding for compilation units
generated by GHC. Some additional context can be found here [2].

The impact. Let's take an example from [2].
Here's the example Haskell program (fib.hs):

fib :: Int -> Int
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

main :: IO ()
main = print $ fib 20

If we run it with current GDB master we get:

(gdb) br fib.hs:3
Breakpoint 1 at 0x405588: file fib.hs, line 3.
(gdb) r

Breakpoint 1, Main_zdwfib_info () at fib.hs:3
3       fib 1 = 1
(gdb) bt
#0  Main_zdwfib_info () at fib.hs:3
#1  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#2  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#3  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#4  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#5  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#6  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#7  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#8  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#9  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#10 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#11 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#12 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#13 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#14 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#15 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#16 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#17 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#18 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#19 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
#20 0x00000000004056d8 in Main_main2_info () at fib.hs:4
#21 0x000000000040bab0 in base_GHCziIOziHandleziText_zdwwriteBlocks_info () at libraries/base/GHC/IO/Handle/Text.hs:595
#22 0x0000000000488bf0 in ?? () at rts/Exception.cmm:332
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Here's an analogous session after this patch:

(gdb) br fib.hs:3
Breakpoint 1 at 0x405588: file fib.hs, line 3.
(gdb) r

Breakpoint 1, Main_zdwfib_info () at fib.hs:3
3       fib 1 = 1
(gdb) bt
#0  Main_zdwfib_info () at fib.hs:3
#1  Main_zdwfib_info () at fib.hs:4
#2  Main_zdwfib_info () at fib.hs:4
#3  Main_zdwfib_info () at fib.hs:4
#4  Main_zdwfib_info () at fib.hs:4
#5  Main_zdwfib_info () at fib.hs:4
#6  Main_zdwfib_info () at fib.hs:4
#7  Main_zdwfib_info () at fib.hs:4
#8  Main_zdwfib_info () at fib.hs:4
#9  Main_zdwfib_info () at fib.hs:4
#10 Main_zdwfib_info () at fib.hs:4
#11 Main_zdwfib_info () at fib.hs:4
#12 Main_zdwfib_info () at fib.hs:4
#13 Main_zdwfib_info () at fib.hs:4
#14 Main_zdwfib_info () at fib.hs:4
#15 Main_zdwfib_info () at fib.hs:4
#16 Main_zdwfib_info () at fib.hs:4
#17 Main_zdwfib_info () at fib.hs:4
#18 Main_zdwfib_info () at fib.hs:4
#19 Main_zdwfib_info () at fib.hs:4
#20 0x00000000004056d8 in Main_main2_info () at fib.hs:4
#21 base_GHCziIOziHandleziText_zdwwriteBlocks_info () at libraries/base/GHC/IO/Handle/Text.hs:582
#22 stg_catch_frame_info () at rts/Exception.cmm:370
#23 stg_stop_thread_info () at rts/StgStartup.cmm:42
#24 0x00000000004a62ab in c2MH_str ()
#25 0x00010000006c94c0 in ?? ()
#26 0x00000000006e5e60 in ?? ()
#27 0x00007fffffffdae8 in ?? ()
#28 0x00007fffffffdcb0 in ?? ()
#29 0x00000000006c94c0 in Main_main3_closure ()
#30 0x00007fffffffdbd0 in ?? ()
#31 0x0000000000405480 in ?? ()
#32 0x00007fffffffdcb0 in ?? ()
#33 0x0000000000000000 in ?? ()

There are a couple of things to note here.

The first is that the unwinding got further, even a bit too far. It should
have ended at frame #23, stg_stop_thread_info. That's the first thing
pushed on the stack when running a lightweight thread. I'm not sure yet how
GHC is supposed to signal that it's the last thing on the stack, or whether
it's trying to do that at all.

The second is that we get the correct line number for frame #22 *and* the
symbol gets successfully resolved. Before the patch we were subtracting
one, making the symbol lookup fail.
I think that's also what makes GDB give cleaner output on frames #0 - #19.

Note that to get the same results as me, you have to compile the compiler
with debugging symbols as described in [2].

[1] https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/nativeGen/Dwarf/Types.hs;e9ae0cae9eb6a340473b339b5711ae76c6bdd045$399-417
[2] https://ghc.haskell.org/trac/ghc/wiki/DWARF

gdb/ChangeLog:

	* dwarf2read.c (process_full_comp_unit): Populate
	producer_is_ghc.
	* frame.c (get_frame_address_in_block): Don't rewind the program
	counter for code generated by GHC.
	* symtab.h (struct compunit_symtab): Add producer_is_ghc.
---
 gdb/ChangeLog    | 7 +++++++
 gdb/dwarf2read.c | 4 ++++
 gdb/frame.c      | 9 ++++++++-
 gdb/symtab.h     | 3 +++
 4 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index 3ce980c8c3..6b35bf34b6 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,3 +1,10 @@
+2018-02-01  Bartosz Nitka
+
+	* dwarf2read.c (process_full_comp_unit): Populate producer_is_ghc.
+	* frame.c (get_frame_address_in_block): Don't rewind the program
+	counter for code generated by GHC.
+	* symtab.h (struct compunit_symtab): Add producer_is_ghc.
+
 2018-02-01  Yao Qi
 
 	* arm-tdep.c (arm_record_data_proc_misc_ld_str): Rewrite it.
diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
index 51d0f39f75..2516c48741 100644
--- a/gdb/dwarf2read.c
+++ b/gdb/dwarf2read.c
@@ -10501,6 +10501,10 @@ process_full_comp_unit (struct dwarf2_per_cu_data *per_cu,
         cust->epilogue_unwind_valid = 1;
 
       cust->call_site_htab = cu->call_site_htab;
+
+      if (startswith (cu->producer,
+                      "The Glorious Glasgow Haskell Compilation System"))
+        cust->producer_is_ghc = 1;
     }
 
   if (dwarf2_per_objfile->using_index)
diff --git a/gdb/frame.c b/gdb/frame.c
index 1384ecca4f..9ff0dcb130 100644
--- a/gdb/frame.c
+++ b/gdb/frame.c
@@ -2458,7 +2458,14 @@ get_frame_address_in_block (struct frame_info *this_frame)
       && (get_frame_type (this_frame) == NORMAL_FRAME
           || get_frame_type (this_frame) == TAILCALL_FRAME
           || get_frame_type (this_frame) == INLINE_FRAME))
-    return pc - 1;
+    {
+      /* GHC intermixes metadata (info tables) with code, going back is
+         guaranteed to land us in the metadata. */
+      struct compunit_symtab *cust = find_pc_compunit_symtab (pc);
+      if (cust != NULL && cust->producer_is_ghc)
+        return pc;
+      return pc - 1;
+    }
 
   return pc;
 }
diff --git a/gdb/symtab.h b/gdb/symtab.h
index f9d52e7697..c164e5ba5f 100644
--- a/gdb/symtab.h
+++ b/gdb/symtab.h
@@ -1432,6 +1432,9 @@ struct compunit_symtab
      instruction). This is supported by GCC since 4.5.0. */
   unsigned int epilogue_unwind_valid : 1;
 
+  /* This CU was produced by Glasgow Haskell Compiler */
+  unsigned int producer_is_ghc : 1;
+
   /* struct call_site entries for this compilation unit or NULL. */
   htab_t call_site_htab;
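
For readers who want to try the decision logic outside of GDB, here is a
minimal standalone sketch of what the patch does. It is not GDB code:
struct cu_info, classify_producer and lookup_address are hypothetical names
made up for this illustration, and the pc_is_return_address flag stands in
for the frame-type condition that already exists in
get_frame_address_in_block. The NULL check on the producer string is an
extra assumption of the sketch, not something the patch itself does.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the per-compilation-unit data; this is not
   GDB's struct compunit_symtab.  */
struct cu_info
{
  const char *producer;   /* DW_AT_producer string, may be NULL.  */
  bool producer_is_ghc;   /* Mirrors the flag added by the patch.  */
};

/* Prefix that the patch matches against the producer string.  */
const char ghc_producer_prefix[] =
  "The Glorious Glasgow Haskell Compilation System";

/* Set the flag when the producer string starts with the GHC banner.
   The NULL guard is an assumption added for this sketch.  */
void
classify_producer (struct cu_info *cu)
{
  cu->producer_is_ghc
    = (cu->producer != NULL
       && strncmp (cu->producer, ghc_producer_prefix,
                   sizeof ghc_producer_prefix - 1) == 0);
}

/* Return the address to use for block and symbol lookup.
   PC_IS_RETURN_ADDRESS models the existing condition under which GDB
   rewinds (the pc is assumed to follow a call instruction).  */
uint64_t
lookup_address (uint64_t pc, bool pc_is_return_address,
                const struct cu_info *cu)
{
  if (!pc_is_return_address)
    return pc;

  /* GHC code is preceded by an info table, so pc - 1 would land in
     metadata; keep the pc unchanged for GHC compilation units.  */
  if (cu != NULL && cu->producer_is_ghc)
    return pc;

  return pc - 1;
}
```

With a producer string that starts with the GHC banner, lookup_address
returns the pc unchanged; for any other producer (or no compilation unit at
all) a return address is still rewound by one, which is the behavior GDB
had before this change.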