From patchwork Mon Mar 13 21:46:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Burgess X-Patchwork-Id: 66327 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C7DD23858C33 for ; Mon, 13 Mar 2023 21:47:49 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org C7DD23858C33 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1678744069; bh=DpT6Jd52PRe21Umngx/qgs0+dKBnK5fKyLHEdUt7LJ0=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=gsyR5LrvwvIf+JlgZDUQo/DGQ/Jk7Ud5MUAfTtMwDCpleZcK08/uvTAf4YMwa8vqW L5778Q0mdXCDUjkVKsglPSrIXGJ8HW8SqBwbjVEqj7fgymzkq6BWAO8qS8/4I3chuh e68WFDj54VS2btm+nN5LFBkBtfwtyCaBa2+BhLDM= X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 30DF13858423 for ; Mon, 13 Mar 2023 21:46:30 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 30DF13858423 Received: from mail-ed1-f72.google.com (mail-ed1-f72.google.com [209.85.208.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-262-PM_lOLkUNsGSpo0pehfajw-1; Mon, 13 Mar 2023 17:46:28 -0400 X-MC-Unique: PM_lOLkUNsGSpo0pehfajw-1 Received: by mail-ed1-f72.google.com with SMTP id t14-20020a056402240e00b004fb36e6d670so6851248eda.5 for ; Mon, 13 Mar 2023 14:46:28 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678743987; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=DpT6Jd52PRe21Umngx/qgs0+dKBnK5fKyLHEdUt7LJ0=; b=BazYjf1rk4YaGy20j6zKOW3FbYFGzOAslcjZozMnjsZXr6apXIBaKNO2O040Jl8DXI tilj3MoQraZK5AXPOVLyvOjwyArcX0NYMZqotOReIUj39q8dtKLTLGb7vCkblUEoIMDF T3IFzkOrYfWXmgiZtaqk5tARCDh0E499QbDqFN8G3Bg6ptCgFDXCMGbQCEztpqkkxTh9 QvNOd5VoUhyz6a5cGuQooTCwD297S54DfidFFKyJOYEVvZ8O1Ckw9v2TThHcmZcsJJmx WtHbXGDc1+UQavuaCx0hV5BZ95d5UhoNH2HGTko4G/nKk23bccEJh06jt0xca47jh4Rh e7DA== X-Gm-Message-State: AO0yUKUY1L7zscJj+eLKn1NNK2HCJKdKD8oSM5XjM8a1y1LDeohDuJPA 3V8VwbTpP2i6v+HRhjeWEBoKQVQzq90iLmXfAuv8r6i7kHBFJOzuWzk4sywkiLs/J5QVJGa3bZJ kBlXblY1MYGh+4mQB8Fegibqx6ClrcFK6Wdn2AQukxcnxZAbCEiqglrqVU0SByKf6QjPfToNDP+ IPWpiYuA== X-Received: by 2002:a17:907:7d92:b0:8b1:2614:edfe with SMTP id oz18-20020a1709077d9200b008b12614edfemr46165ejc.9.1678743987209; Mon, 13 Mar 2023 14:46:27 -0700 (PDT) X-Google-Smtp-Source: AK7set9Fbwq0sGISrgovd2bVlZIJUuJavva1nVRX9tfxzgSXBWcKhxy3n5k45rk9KeietMbemcDB3w== X-Received: by 2002:a17:907:7d92:b0:8b1:2614:edfe with SMTP id oz18-20020a1709077d9200b008b12614edfemr46149ejc.9.1678743986857; Mon, 13 Mar 2023 14:46:26 -0700 (PDT) Received: from localhost (95.72.115.87.dyn.plus.net. [87.115.72.95]) by smtp.gmail.com with ESMTPSA id g17-20020a1709061e1100b00922547486f9sm263512ejj.146.2023.03.13.14.46.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Mar 2023 14:46:26 -0700 (PDT) To: gdb-patches@sourceware.org Cc: Andrew Burgess Subject: [PATCH 3/3] gdb/riscv: fix regressions in gdb.base/unwind-on-each-insn.exp Date: Mon, 13 Mar 2023 21:46:19 +0000 Message-Id: X-Mailer: git-send-email 2.25.4 In-Reply-To: References: MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Andrew Burgess via Gdb-patches From: Andrew Burgess Reply-To: Andrew Burgess Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org Sender: "Gdb-patches" This commit builds on the previous one to fix all the remaining failures in gdb.base/unwind-on-each-insn.exp for RISC-V. The problem we have in gdb.base/unwind-on-each-insn.exp is that, when we are in the function epilogue, the previous frame and stack pointer values are being restored, and so, the values that we calculated during the function prologue are no longer suitable. Here's an example from the function 'bar' in the mentioned test. This was compiled for 64-bit RISC-V with compressed instruction support: Dump of assembler code for function bar: 0x000000000001018a <+0>: add sp,sp,-32 0x000000000001018c <+2>: sd ra,24(sp) 0x000000000001018e <+4>: sd fp,16(sp) 0x0000000000010190 <+6>: add fp,sp,32 0x0000000000010192 <+8>: sd a0,-24(fp) 0x0000000000010196 <+12>: ld a0,-24(fp) 0x000000000001019a <+16>: jal 0x10178 0x000000000001019e <+20>: nop 0x00000000000101a0 <+22>: ld ra,24(sp) 0x00000000000101a2 <+24>: ld fp,16(sp) 0x00000000000101a4 <+26>: add sp,sp,32 0x00000000000101a6 <+28>: ret End of assembler dump. When we are at address 0x101a4 the previous instruction has restored the frame-pointer, as such GDB's (current) preference for using the frame-pointer as the frame base address is clearly not going to work. We need to switch to using the stack-pointer instead. At address 0x101a6 the previous instruction has restored the stack-pointer value. Currently GDB will not understand this and so will still assume the stack has been decreased by 32 bytes in this function. My proposed solution is to add some additional code that scans the instructions at the current $pc. We're looking for this pattern: ld fp,16(sp) add sp,sp,32 ret Obviously the immediates can change, but the basic pattern indicates that the function is in the process of restoring state before returning. With this implemented then gdb.base/unwind-on-each-insn.exp now fully passes. Obviously what I've implemented is just a heuristic. It's not going to work for every function. If the compiler reorders the instructions, or merges the epilogue back into the function body then GDB is once again going to get the frame-id wrong. I'm OK with that. Remember, this is for debugging code without debug information, and (in our imagined situation) with more aggressive levels of optimisation being used. Obviously GDB is going to struggle in these situations. My thinking is, lets get something in place now. Then, later, if possible, we might be able to improve the logic to cover more situations -- if there's an interest in doing so. But I figure we need something in place as a starting point. After this commit gdb.base/unwind-on-each-insn.exp passes with no failures on RV64. --- gdb/riscv-tdep.c | 235 ++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 231 insertions(+), 4 deletions(-) diff --git a/gdb/riscv-tdep.c b/gdb/riscv-tdep.c index 5ef76555918..771d892df26 100644 --- a/gdb/riscv-tdep.c +++ b/gdb/riscv-tdep.c @@ -1986,6 +1986,214 @@ riscv_insn::decode (struct gdbarch *gdbarch, CORE_ADDR pc) } } +/* Return true if INSN represents an instruction something like: + + ld fp,IMMEDIATE(sp) + + That is, a load from stack-pointer plus some immediate offset, with the + result stored into the frame pointer. We also accept 'lw' as well as + 'ld'. */ + +static bool +is_insn_load_of_fp_from_sp (const struct riscv_insn &insn) +{ + return ((insn.opcode () == riscv_insn::LD + || insn.opcode () == riscv_insn::LW) + && insn.rd () == RISCV_FP_REGNUM + && insn.rs1 () == RISCV_SP_REGNUM); +} + +/* Return true if INSN represents an instruction something like: + + add sp,sp,IMMEDIATE + + That is, an add of an immediate to the value in the stack pointer + register, with the result stored back to the stack pointer register. */ + +static bool +is_insn_addi_of_sp_to_sp (const struct riscv_insn &insn) +{ + return ((insn.opcode () == riscv_insn::ADDI + || insn.opcode () == riscv_insn::ADDIW) + && insn.rd () == RISCV_SP_REGNUM + && insn.rs1 () == RISCV_SP_REGNUM); +} + +/* Is the instruction in code memory prior to address PC a load from stack + instruction? Return true if it is, otherwise, return false. + + This is a best effort that is used as part of the function prologue + scanning logic. With compressed instructions and arbitrary control + flow in the inferior, we can never be certain what the instruction + prior to PC is. + + This function first looks for a compressed instruction, then looks for + a 32-bit non-compressed instruction. */ + +static bool +previous_insn_is_load_fp_from_stack (struct gdbarch *gdbarch, CORE_ADDR pc) +{ + struct riscv_insn insn; + insn.decode (gdbarch, pc - 2); + gdb_assert (insn.length () > 0); + + if (insn.length () != 2 || !is_insn_load_of_fp_from_sp (insn)) + { + insn.decode (gdbarch, pc - 4); + gdb_assert (insn.length () > 0); + + if (insn.length () != 4 || !is_insn_load_of_fp_from_sp (insn)) + return false; + } + + riscv_unwinder_debug_printf + ("previous instruction at %s (length %d) was 'ld'", + core_addr_to_string (pc - insn.length ()), insn.length ()); + return true; +} + +/* Is the instruction in code memory prior to address PC an add of an + immediate to the stack pointer, with the result being written back into + the stack pointer? Return true and set *PREV_PC to the address of the + previous instruction if we believe the previous instruction is such an + add, otherwise return false and *PREV_PC is undefined. + + This is a best effort that is used as part of the function prologue + scanning logic. With compressed instructions and arbitrary control + flow in the inferior, we can never be certain what the instruction + prior to PC is. + + This function first looks for a compressed instruction, then looks for + a 32-bit non-compressed instruction. */ + +static bool +previous_insn_is_add_imm_to_sp (struct gdbarch *gdbarch, CORE_ADDR pc, + CORE_ADDR *prev_pc) +{ + struct riscv_insn insn; + insn.decode (gdbarch, pc - 2); + gdb_assert (insn.length () > 0); + + if (insn.length () != 2 || !is_insn_addi_of_sp_to_sp (insn)) + { + insn.decode (gdbarch, pc - 4); + gdb_assert (insn.length () > 0); + + if (insn.length () != 4 || !is_insn_addi_of_sp_to_sp (insn)) + return false; + } + + riscv_unwinder_debug_printf + ("previous instruction at %s (length %d) was 'add'", + core_addr_to_string (pc - insn.length ()), insn.length ()); + *prev_pc = pc - insn.length (); + return true; +} + +/* Try to spot when PC is located in an exit sequence for a particular + function. Detecting an exit sequence involves a limited amount of + scanning backwards through the disassembly, and so, when considering + compressed instructions, we can never be certain that we have + disassembled the preceding instructions correctly. On top of that, we + can't be certain that the inferior arrived at PC by passing through the + preceding instructions. + + With all that said, we know that using prologue scanning to figure a + functions unwind information starts to fail when we consider returns + from an instruction -- we must pass through some instructions that + restore the previous state prior to the final return instruction, and + with state partially restored, our prologue derived unwind information + is no longer valid. + + This function then, aims to spot instruction sequences like this: + + ld fp, IMM_1(sp) + add sp, sp, IMM_2 + ret + + The first instruction restores the previous frame-pointer value, the + second restores the previous stack pointer value, and the final + instruction is the actual return. + + We need to consider that some or all of these instructions might be + compressed. + + This function makes the assumption that, when the inferior reaches the + 'ret' instruction the stack pointer will have been restored to its value + on entry to this function. This assumption will be true in most well + formed programs. + + Return true if we detect that we are in such an instruction sequence, + that is PC points at one of the three instructions given above. In this + case, set *OFFSET to IMM_2 if PC points to either of the first + two instructions (the 'ld' or 'add'), otherwise set *OFFSET to 0. + + Otherwise, this function returns false, and the contents of *OFFSET are + undefined. */ + +static bool +riscv_detect_end_of_function (struct gdbarch *gdbarch, CORE_ADDR pc, + int *offset) +{ + *offset = 0; + + /* We only want to scan a maximum of 3 instructions. */ + for (int i = 0; i < 3; ++i) + { + struct riscv_insn insn; + insn.decode (gdbarch, pc); + gdb_assert (insn.length () > 0); + + if (is_insn_load_of_fp_from_sp (insn)) + { + riscv_unwinder_debug_printf ("found 'ld' instruction at %s", + core_addr_to_string (pc)); + if (i > 0) + return false; + pc += insn.length (); + } + else if (is_insn_addi_of_sp_to_sp (insn)) + { + riscv_unwinder_debug_printf ("found 'add' instruction at %s", + core_addr_to_string (pc)); + if (i > 1) + return false; + if (i == 0) + { + if (!previous_insn_is_load_fp_from_stack (gdbarch, pc)) + return false; + + i = 1; + } + *offset = insn.imm_signed (); + pc += insn.length (); + } + else if (insn.opcode () == riscv_insn::JALR + && insn.rs1 () == RISCV_RA_REGNUM + && insn.rs2 () == RISCV_ZERO_REGNUM) + { + riscv_unwinder_debug_printf ("found 'ret' instruction at %s", + core_addr_to_string (pc)); + gdb_assert (i != 1); + if (i == 0) + { + CORE_ADDR prev_pc; + if (!previous_insn_is_add_imm_to_sp (gdbarch, pc, &prev_pc)) + return false; + if (!previous_insn_is_load_fp_from_stack (gdbarch, prev_pc)) + return false; + i = 2; + } + + pc += insn.length (); + } + else + return false; + } + + return true; +} + /* The prologue scanner. This is currently only used for skipping the prologue of a function when the DWARF information is not sufficient. However, it is written with filling of the frame cache in mind, which @@ -2000,6 +2208,7 @@ riscv_scan_prologue (struct gdbarch *gdbarch, struct riscv_unwind_cache *cache) { CORE_ADDR cur_pc, next_pc, after_prologue_pc; + CORE_ADDR original_end_pc = end_pc; CORE_ADDR end_prologue_addr = 0; /* Find an upper limit on the function prologue using the debug @@ -2031,10 +2240,7 @@ riscv_scan_prologue (struct gdbarch *gdbarch, next_pc = cur_pc + insn.length (); /* Look for common stack adjustment insns. */ - if ((insn.opcode () == riscv_insn::ADDI - || insn.opcode () == riscv_insn::ADDIW) - && insn.rd () == RISCV_SP_REGNUM - && insn.rs1 () == RISCV_SP_REGNUM) + if (is_insn_addi_of_sp_to_sp (insn)) { /* Handle: addi sp, sp, -i or: addiw sp, sp, -i */ @@ -2171,6 +2377,27 @@ riscv_scan_prologue (struct gdbarch *gdbarch, cache->frame_base_offset = -regs[RISCV_SP_REGNUM].k; } + /* Check to see if we are located near to a return instruction in + this function. If we are then the one or both of the stack + pointer and frame pointer may have been restored to their previous + value. If we can spot this situation then we can adjust which + register and offset we use for the frame base. */ + if (cache->frame_base_reg != RISCV_SP_REGNUM + || cache->frame_base_offset != 0) + { + int sp_offset; + + if (riscv_detect_end_of_function (gdbarch, original_end_pc, + &sp_offset)) + { + riscv_unwinder_debug_printf + ("in function epilogue at %s, stack offset is %d", + core_addr_to_string (original_end_pc), sp_offset); + cache->frame_base_reg= RISCV_SP_REGNUM; + cache->frame_base_offset = sp_offset; + } + } + /* Assign offset from old SP to all saved registers. As we don't have the previous value for the frame base register at this point, we store the offset as the address in the trad_frame, and