Message ID | 1452188697-23870-2-git-send-email-antoine.tremblay@ericsson.com |
---|---|
State | New |
Headers | show |
Antoine Tremblay <antoine.tremblay@ericsson.com> writes: Hi Antoine, > The reason for this is that the target's stack pointer is unavailable > when examining the trace buffer. What we are seeing is due to the > 'tfind' command creating a sentinel frame and unwinding it. If an > exception is thrown, we are left with the sentinel frame being displayed > at level #-1. The exception is thrown when the prologue unwinder tries > to read the stack pointer to construct an ID for the frame. > > This patch fixes this and similar issues by making all the arm unwinders > catch NOT_AVAILABLE_ERROR exceptions when either register or memory is > unreadable and report back to the frame core code with UNWIND_UNAVAILABLE. > > Note this commit log adapted from 7dfa3edc033c443036d9f2a3e01120f7fb54f498 > which fixed a similar issue for aarch64. It is right to follow aarch64 patch, but I am wondering whether we can do it better. Nowadays, the unwind termination due to unavailable memory is handled in unwinders in each arch backend. However, as we support more and more arch for tracepoint, can we handle the unwind termination in target independent code? The initial work of unwind termination due to unavailable memory was done by Pedro https://www.sourceware.org/ml/gdb-patches/2011-02/msg00611.html in a way that each unwinder was taught to terminate with UNWIND_UNAVAILABLE. At that moment, only x86 supports tracepoint, so it was reasonable to handle UNWIND_UNAVAILABLE inside unwinders of one arch. Now, the situation changes, because we have more and more arch need tracepoint support, if we can handle UNWIND_UNAVAILABLE in the callers of each unwinder, each unwinder doesn't have to worry about the unavailable at all. In fact, GDB has done that way when calling unwinder->sniffer, in frame_unwind_try_unwinder TRY { res = unwinder->sniffer (unwinder, this_frame, this_cache); } CATCH (ex, RETURN_MASK_ERROR) { if (ex.error == NOT_AVAILABLE_ERROR) { /* This usually means that not even the PC is available, thus most unwinders aren't able to determine if they're the best fit. Keep trying. Fallback prologue unwinders should always accept the frame. */ do_cleanups (old_cleanup); return 0; } throw_exception (ex); } END_CATCH we can wrap methods of 'struct frame_unwind' with try/catch, and handle NOT_AVAILABLE_ERROR properly. In this way, each unwinder doesn't have to worry about unavailable memory at all. Pedro, what do you think? Did you try this approach in the rest of 9 different ways :) (you said you "implemented this differently in about 10 different ways" in your email) ?
Yao Qi writes: > Antoine Tremblay <antoine.tremblay@ericsson.com> writes: > > Hi Antoine, > >> The reason for this is that the target's stack pointer is unavailable >> when examining the trace buffer. What we are seeing is due to the >> 'tfind' command creating a sentinel frame and unwinding it. If an >> exception is thrown, we are left with the sentinel frame being displayed >> at level #-1. The exception is thrown when the prologue unwinder tries >> to read the stack pointer to construct an ID for the frame. >> >> This patch fixes this and similar issues by making all the arm unwinders >> catch NOT_AVAILABLE_ERROR exceptions when either register or memory is >> unreadable and report back to the frame core code with UNWIND_UNAVAILABLE. >> >> Note this commit log adapted from 7dfa3edc033c443036d9f2a3e01120f7fb54f498 >> which fixed a similar issue for aarch64. > > It is right to follow aarch64 patch, but I am wondering whether we can > do it better. > > Nowadays, the unwind termination due to unavailable memory is handled in > unwinders in each arch backend. However, as we support more and more > arch for tracepoint, can we handle the unwind termination in target > independent code? > > The initial work of unwind termination due to unavailable memory was > done by Pedro https://www.sourceware.org/ml/gdb-patches/2011-02/msg00611.html > in a way that each unwinder was taught to terminate with > UNWIND_UNAVAILABLE. At that moment, only x86 supports tracepoint, so it > was reasonable to handle UNWIND_UNAVAILABLE inside unwinders of one arch. Now, > the situation changes, because we have more and more arch need > tracepoint support, if we can handle UNWIND_UNAVAILABLE in the callers > of each unwinder, each unwinder doesn't have to worry about the > unavailable at all. In fact, GDB has done that way when calling unwinder->sniffer, > in frame_unwind_try_unwinder > > TRY > { > res = unwinder->sniffer (unwinder, this_frame, this_cache); > } > CATCH (ex, RETURN_MASK_ERROR) > { > if (ex.error == NOT_AVAILABLE_ERROR) > { > /* This usually means that not even the PC is available, > thus most unwinders aren't able to determine if they're > the best fit. Keep trying. Fallback prologue unwinders > should always accept the frame. */ > do_cleanups (old_cleanup); > return 0; > } > throw_exception (ex); > } > END_CATCH > > we can wrap methods of 'struct frame_unwind' with try/catch, and handle > NOT_AVAILABLE_ERROR properly. In this way, each unwinder doesn't have > to worry about unavailable memory at all. > > Pedro, what do you think? Did you try this approach in the rest of 9 > different ways :) (you said you "implemented this differently in about > 10 different ways" in your email) ? Ping, Pedro ?
On 02/12/2016 02:46 PM, Yao Qi wrote: > we can wrap methods of 'struct frame_unwind' with try/catch, and handle > NOT_AVAILABLE_ERROR properly. In this way, each unwinder doesn't have > to worry about unavailable memory at all. > > Pedro, what do you think? Did you try this approach in the rest of 9 > different ways :) (you said you "implemented this differently in about > 10 different ways" in your email) ? I no longer recall exactly what I tried. :-) I think it may be a good idea. There are a few constraints that we need to keep in mind: - Frames that only have the PC available should have distinct frame ids, and it should be distinct from outer_frame_id. (See frame_id_build_unavailable_stack calls). This makes e.g., the frame_id_eq check in tfind_1 work as intended, see: https://sourceware.org/ml/gdb-patches/2013-12/msg00535.html - When an unwind sniffer throws, it'll destroy its struct frame_unwind_cache. So if we don't catch the error, the frame's this_id method can't return something more detailed than outer_frame_id. I don't see this done by wrapping methods of 'struct frame_unwind'. I think it'd work to have an ultimate-fallback unwinder that frame_unwind_find_by_frame returns instead of the internal error at the end. This would return UNWIND_UNAVAILABLE or UNWIND_MEMORY_ERROR in the unwinder->stop_reason method, depending on the error the last registered unwinder thrown. (That last unwinder will always be the arch's heuristic unwinder.) And it would return frame_id_build_unavailable_stack(PC) in the unwinder->this_id method if the last error was UNWIND_UNAVAILABLE, outer_frame_id otherwise (or we add a new frame_id_build_stackless function, to go along with frame_id_build_unavailable_stack). I think that would fix the cases where we end up internal erroring, like in today's Andreas' patch: https://sourceware.org/ml/gdb-patches/2016-02/msg00773.html And then the heuristic unwinders probably no longer need to care to use the safe_read_memory_xxx functions. And it'd fix the bogus cases where the sentinel frame level (-1) shows through, due to: struct frame_info * get_current_frame (void) { ... if (current_frame == NULL) { struct frame_info *sentinel_frame = create_sentinel_frame (current_program_space, get_current_regcache ()); if (catch_exceptions (current_uiout, unwind_to_current_frame, sentinel_frame, RETURN_MASK_ERROR) != 0) { /* Oops! Fake a current frame? Is this useful? It has a PC of zero, for instance. */ current_frame = sentinel_frame; } See recent example here: https://sourceware.org/ml/gdb-patches/2016-01/msg00222.html Thanks, Pedro Alves
Pedro Alves writes: > On 02/12/2016 02:46 PM, Yao Qi wrote: > >> we can wrap methods of 'struct frame_unwind' with try/catch, and handle >> NOT_AVAILABLE_ERROR properly. In this way, each unwinder doesn't have >> to worry about unavailable memory at all. >> >> Pedro, what do you think? Did you try this approach in the rest of 9 >> different ways :) (you said you "implemented this differently in about >> 10 different ways" in your email) ? > > I no longer recall exactly what I tried. :-) > > I think it may be a good idea. > > There are a few constraints that we need to keep in mind: > > - Frames that only have the PC available should have distinct frame ids, > and it should be distinct from outer_frame_id. (See frame_id_build_unavailable_stack calls). > > This makes e.g., the frame_id_eq check in tfind_1 work as intended, see: > https://sourceware.org/ml/gdb-patches/2013-12/msg00535.html > > - When an unwind sniffer throws, it'll destroy its > struct frame_unwind_cache. So if we don't catch the error, the > frame's this_id method can't return something more detailed than > outer_frame_id. > > I don't see this done by wrapping methods of 'struct frame_unwind'. > > I think it'd work to have an ultimate-fallback unwinder that > frame_unwind_find_by_frame returns instead of the internal error at > the end. This would return UNWIND_UNAVAILABLE or UNWIND_MEMORY_ERROR > in the unwinder->stop_reason method, depending on the error the last registered > unwinder thrown. (That last unwinder will always be the arch's heuristic unwinder.) > And it would return frame_id_build_unavailable_stack(PC) in the unwinder->this_id > method if the last error was UNWIND_UNAVAILABLE, outer_frame_id otherwise > (or we add a new frame_id_build_stackless function, to go along with > frame_id_build_unavailable_stack). > > I think that would fix the cases where we end up internal erroring, > like in today's Andreas' patch: > > https://sourceware.org/ml/gdb-patches/2016-02/msg00773.html > > And then the heuristic unwinders probably no longer need to care to > use the safe_read_memory_xxx functions. > > And it'd fix the bogus cases where the sentinel frame level (-1) > shows through, due to: > > struct frame_info * > get_current_frame (void) > { > ... > if (current_frame == NULL) > { > struct frame_info *sentinel_frame = > create_sentinel_frame (current_program_space, get_current_regcache ()); > if (catch_exceptions (current_uiout, unwind_to_current_frame, > sentinel_frame, RETURN_MASK_ERROR) != 0) > { > /* Oops! Fake a current frame? Is this useful? It has a PC > of zero, for instance. */ > current_frame = sentinel_frame; > } > > See recent example here: > https://sourceware.org/ml/gdb-patches/2016-01/msg00222.html > Reading Pedro's description I'm not against the refactoring but it's non trivial to me at the moment at least. I suggest we allow this patch to go in in order to make progress on the arm tracepoint patchset and do that refactoring in a subsequent patch. Would that be OK ? Regards, Antoine
Antoine Tremblay <antoine.tremblay@ericsson.com> writes: > Reading Pedro's description I'm not against the refactoring but it's non > trivial to me at the moment at least. It is not a simple refacotring... > > I suggest we allow this patch to go in in order to make progress on the > arm tracepoint patchset and do that refactoring in a subsequent patch. > > Would that be OK ? I am afraid not. We should try this approach, because this will benefit all targets. IMO, handling unavailable memory in general frame unwinding is more important. b.t.w, I am still not confident on the arm software single step in GDBserver on some cases, such as branch-to-self (".L2: b .L2") and single step with signal. ARM tracepoint patches can go in after these issues are resolved (I am working on these issues).
Yao Qi writes: > Antoine Tremblay <antoine.tremblay@ericsson.com> writes: > >> Reading Pedro's description I'm not against the refactoring but it's non >> trivial to me at the moment at least. > > It is not a simple refacotring... > >> >> I suggest we allow this patch to go in in order to make progress on the >> arm tracepoint patchset and do that refactoring in a subsequent patch. >> >> Would that be OK ? > > I am afraid not. We should try this approach, because this will benefit > all targets. IMO, handling unavailable memory in general frame > unwinding is more important. So you intend to work on this ? > > b.t.w, I am still not confident on the arm software single step in > GDBserver on some cases, such as branch-to-self (".L2: b .L2") and > single step with signal. ARM tracepoint patches can go in after these > issues are resolved (I am working on these issues). Thanks for working on that, I'm just trying to progress whereever I can meanwhile so that tracepoint patches are ready to go when single stepping is OK. Thanks, Antoine
Antoine Tremblay <antoine.tremblay@ericsson.com> writes: >> I am afraid not. We should try this approach, because this will benefit >> all targets. IMO, handling unavailable memory in general frame >> unwinding is more important. > > So you intend to work on this ? I'd like to add it to my todo list, but I don't mind someone else picks it up.
Yao Qi writes: > Antoine Tremblay <antoine.tremblay@ericsson.com> writes: > >>> I am afraid not. We should try this approach, because this will benefit >>> all targets. IMO, handling unavailable memory in general frame >>> unwinding is more important. >> >> So you intend to work on this ? > > I'd like to add it to my todo list, but I don't mind someone else picks > it up. Just so that there is no confusion, I do not indend to pick this up. Regards, Antoine
Pedro Alves <palves@redhat.com> writes: Hi Pedro, > I think it'd work to have an ultimate-fallback unwinder that > frame_unwind_find_by_frame returns instead of the internal error at > the end. This would return UNWIND_UNAVAILABLE or UNWIND_MEMORY_ERROR > in the unwinder->stop_reason method, depending on the error the last registered > unwinder thrown. (That last unwinder will always be the arch's heuristic unwinder.) > And it would return frame_id_build_unavailable_stack(PC) in the unwinder->this_id > method if the last error was UNWIND_UNAVAILABLE, outer_frame_id otherwise > (or we add a new frame_id_build_stackless function, to go along with > frame_id_build_unavailable_stack). I write some code to implement your suggestion here, and it looks OK except that I can't get PC to pass to frame_id_build_unavailable_stack, since PC is extracted from frame cache which varies on different archs and unwinders. I tried to define a super class frame_cache for various frame cache (nowadays, it is defined as void *), frame_cache has one field PC, and various frame caches are the sub class of frame_cache. Many frame unwinding APIs need update, and many places need update too, as a result. I stop here as I am not sure it is a right approach. However, I think we can still do the change you suggested, but in a smaller scope, so the change is less aggressive, and some progress can be made, like this, - Add an unavailable frame unwinder for gdbarch A which supports tracepoint, as the ultimate-fallback. - For the unwinders in gdbarch A, move the code creating frame cache to the sniffer. If the sniffer accepts the frame, creates the frame cache. - Exceptions are allowed to be thrown out in frame cache creation. The exception is caught in the caller of sniffer (frame_unwind_try_unwinder) today, so if exception is thrown, GDB will try the next unwinder, - In this way, only 'sniffer' in 'frame_unwind' may throw exception, so we don't have to worry about other 'frame_unwind' methods. IOW, all unwinders in gdbarch A except unavailable frame unwinder don't worry about the unavailable memory/register. - the unavailable frame unwinder is the last unwinder for gdbarch A, it knows how/where to get PC, if PC is available, return frame_id_build_unavailable_stack (PC), otherwise return outer_frame_id. In this way, the change will be smaller, and we can apply this change to each gdbarch one by one, and in the future, it is possible to have a single gdbarch-independent unavailable frame unwinder once we figure out how to get PC from various frame caches.
On 04/06/2016 04:54 PM, Yao Qi wrote: > I write some code to implement your suggestion here, and it looks OK > except that I can't get PC to pass to frame_id_build_unavailable_stack, > since PC is extracted from frame cache which varies on different archs > and unwinders. Hmm, I think I'm confused, since the PC is extracted/unwound from the _next_ frame, not the one we're building the cache for? See get_frame_pc / get_frame_pc_if_available. Thanks, Pedro Alves
Pedro Alves <palves@redhat.com> writes: > Hmm, I think I'm confused, since the PC is extracted/unwound from > the _next_ frame, not the one we're building the cache for? > See get_frame_pc / get_frame_pc_if_available. Oh, right. Thanks for the explanation.
diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c index 05d60bb..5ee7fb0 100644 --- a/gdb/arm-tdep.c +++ b/gdb/arm-tdep.c @@ -252,6 +252,9 @@ struct arm_prologue_cache to identify this frame. */ CORE_ADDR prev_sp; + /* Is the target available to read from ? */ + int available_p; + /* The frame base for this frame is just prev_sp - frame size. FRAMESIZE is the distance from the frame pointer to the initial stack pointer. */ @@ -1793,19 +1796,29 @@ arm_make_prologue_cache (struct frame_info *this_frame) cache = FRAME_OBSTACK_ZALLOC (struct arm_prologue_cache); cache->saved_regs = trad_frame_alloc_saved_regs (this_frame); - arm_scan_prologue (this_frame, cache); + TRY + { + arm_scan_prologue (this_frame, cache); + unwound_fp = get_frame_register_unsigned (this_frame, cache->framereg); + if (unwound_fp == 0) + return cache; - unwound_fp = get_frame_register_unsigned (this_frame, cache->framereg); - if (unwound_fp == 0) - return cache; + cache->prev_sp = unwound_fp + cache->framesize; - cache->prev_sp = unwound_fp + cache->framesize; + /* Calculate actual addresses of saved registers using offsets + determined by arm_scan_prologue. */ + for (reg = 0; reg < gdbarch_num_regs (get_frame_arch (this_frame)); reg++) + if (trad_frame_addr_p (cache->saved_regs, reg)) + cache->saved_regs[reg].addr += cache->prev_sp; - /* Calculate actual addresses of saved registers using offsets - determined by arm_scan_prologue. */ - for (reg = 0; reg < gdbarch_num_regs (get_frame_arch (this_frame)); reg++) - if (trad_frame_addr_p (cache->saved_regs, reg)) - cache->saved_regs[reg].addr += cache->prev_sp; + cache->available_p = 1; + } + CATCH (ex, RETURN_MASK_ERROR) + { + if (ex.error != NOT_AVAILABLE_ERROR) + throw_exception (ex); + } + END_CATCH return cache; } @@ -1823,6 +1836,9 @@ arm_prologue_unwind_stop_reason (struct frame_info *this_frame, *this_cache = arm_make_prologue_cache (this_frame); cache = (struct arm_prologue_cache *) *this_cache; + if (!cache->available_p) + return UNWIND_UNAVAILABLE; + /* This is meant to halt the backtrace at "_start". */ pc = get_frame_pc (this_frame); if (pc <= gdbarch_tdep (get_frame_arch (this_frame))->lowest_pc) @@ -1851,16 +1867,23 @@ arm_prologue_this_id (struct frame_info *this_frame, *this_cache = arm_make_prologue_cache (this_frame); cache = (struct arm_prologue_cache *) *this_cache; - /* Use function start address as part of the frame ID. If we cannot - identify the start address (due to missing symbol information), - fall back to just using the current PC. */ - pc = get_frame_pc (this_frame); - func = get_frame_func (this_frame); - if (!func) - func = pc; + if (!cache->available_p) + { + *this_id = frame_id_build_unavailable_stack (cache->prev_sp); + } + else + { + /* Use function start address as part of the frame ID. If we cannot + identify the start address (due to missing symbol information), + fall back to just using the current PC. */ + pc = get_frame_pc (this_frame); + func = get_frame_func (this_frame); + if (!func) + func = pc; - id = frame_id_build (cache->prev_sp, func); - *this_id = id; + id = frame_id_build (cache->prev_sp, func); + *this_id = id; + } } static struct value * @@ -2738,7 +2761,17 @@ arm_make_stub_cache (struct frame_info *this_frame) cache = FRAME_OBSTACK_ZALLOC (struct arm_prologue_cache); cache->saved_regs = trad_frame_alloc_saved_regs (this_frame); - cache->prev_sp = get_frame_register_unsigned (this_frame, ARM_SP_REGNUM); + TRY + { + cache->prev_sp = get_frame_register_unsigned (this_frame, ARM_SP_REGNUM); + cache->available_p = 1; + } + CATCH (ex, RETURN_MASK_ERROR) + { + if (ex.error != NOT_AVAILABLE_ERROR) + throw_exception (ex); + } + END_CATCH return cache; } @@ -2756,7 +2789,10 @@ arm_stub_this_id (struct frame_info *this_frame, *this_cache = arm_make_stub_cache (this_frame); cache = (struct arm_prologue_cache *) *this_cache; - *this_id = frame_id_build (cache->prev_sp, get_frame_pc (this_frame)); + if (!cache->available_p) + *this_id = frame_id_build_unavailable_stack (cache->prev_sp); + else + *this_id = frame_id_build (cache->prev_sp, get_frame_pc (this_frame)); } static int @@ -2809,29 +2845,38 @@ arm_m_exception_cache (struct frame_info *this_frame) cache = FRAME_OBSTACK_ZALLOC (struct arm_prologue_cache); cache->saved_regs = trad_frame_alloc_saved_regs (this_frame); - unwound_sp = get_frame_register_unsigned (this_frame, - ARM_SP_REGNUM); - - /* The hardware saves eight 32-bit words, comprising xPSR, - ReturnAddress, LR (R14), R12, R3, R2, R1, R0. See details in - "B1.5.6 Exception entry behavior" in - "ARMv7-M Architecture Reference Manual". */ - cache->saved_regs[0].addr = unwound_sp; - cache->saved_regs[1].addr = unwound_sp + 4; - cache->saved_regs[2].addr = unwound_sp + 8; - cache->saved_regs[3].addr = unwound_sp + 12; - cache->saved_regs[12].addr = unwound_sp + 16; - cache->saved_regs[14].addr = unwound_sp + 20; - cache->saved_regs[15].addr = unwound_sp + 24; - cache->saved_regs[ARM_PS_REGNUM].addr = unwound_sp + 28; - - /* If bit 9 of the saved xPSR is set, then there is a four-byte - aligner between the top of the 32-byte stack frame and the - previous context's stack pointer. */ - cache->prev_sp = unwound_sp + 32; - if (safe_read_memory_integer (unwound_sp + 28, 4, byte_order, &xpsr) - && (xpsr & (1 << 9)) != 0) - cache->prev_sp += 4; + TRY + { + unwound_sp = get_frame_register_unsigned (this_frame, ARM_SP_REGNUM); + /* The hardware saves eight 32-bit words, comprising xPSR, + ReturnAddress, LR (R14), R12, R3, R2, R1, R0. See details in + "B1.5.6 Exception entry behavior" in + "ARMv7-M Architecture Reference Manual". */ + cache->saved_regs[0].addr = unwound_sp; + cache->saved_regs[1].addr = unwound_sp + 4; + cache->saved_regs[2].addr = unwound_sp + 8; + cache->saved_regs[3].addr = unwound_sp + 12; + cache->saved_regs[12].addr = unwound_sp + 16; + cache->saved_regs[14].addr = unwound_sp + 20; + cache->saved_regs[15].addr = unwound_sp + 24; + cache->saved_regs[ARM_PS_REGNUM].addr = unwound_sp + 28; + + /* If bit 9 of the saved xPSR is set, then there is a four-byte + aligner between the top of the 32-byte stack frame and the + previous context's stack pointer. */ + cache->prev_sp = unwound_sp + 32; + if (safe_read_memory_integer (unwound_sp + 28, 4, byte_order, &xpsr) + && (xpsr & (1 << 9)) != 0) + cache->prev_sp += 4; + + cache->available_p = 1; + } + CATCH (ex, RETURN_MASK_ERROR) + { + if (ex.error != NOT_AVAILABLE_ERROR) + throw_exception (ex); + } + END_CATCH return cache; } @@ -2850,9 +2895,12 @@ arm_m_exception_this_id (struct frame_info *this_frame, *this_cache = arm_m_exception_cache (this_frame); cache = (struct arm_prologue_cache *) *this_cache; - /* Our frame ID for a stub frame is the current SP and LR. */ - *this_id = frame_id_build (cache->prev_sp, - get_frame_pc (this_frame)); + if (!cache->available_p) + *this_id = frame_id_build_unavailable_stack (cache->prev_sp); + else + /* Our frame ID for a stub frame is the current SP and LR. */ + *this_id = frame_id_build (cache->prev_sp, + get_frame_pc (this_frame)); } /* Implementation of function hook 'prev_register' in