[v2] gdb: Modify until_break_command to act correctly on SIGTRAMP_FRAME

Message ID 1666353538-15846-1-git-send-email-vanekt@fbl.cz
State New
Headers
Series [v2] gdb: Modify until_break_command to act correctly on SIGTRAMP_FRAME |

Commit Message

Tomas Vanek Oct. 21, 2022, 11:58 a.m. UTC
  This patch partially depends on
gdb/arm: Terminate frame unwinding in M-profile lockup state
(without it lockup state is unwound as if it were a normal
stack frame).

The commands 'advance' and 'until' try to set a breakpoint
on the bogus return address derived from Arm M-profile magic
address (actually EXC_RETURN or a PC value indicating lockup).

The offending breakpoint should be set at the return address in
the caller. The magic value 0xffffffff in LR indicates
there is no caller (return to this address would lock up the CPU).

Similar behaviour of 'advance' and 'until' is observed in
an exception handler routine. In this case LR contains e.g.
0xfffffff1 (EXC_RETURN) and GDB tries to set a breakpoint at
0xfffffff0. It should use a return value stacked by the exception
instead.

Testbench setup:
STM32G474, a Cortex-M4 device. Any Cortex-M device can be used.
A test application (an ordinary blink) with a standard startup
is loaded to the device flash.

Steps to reproduce the problem:

start GDB server
 $ openocd -f interface/cmsis-dap.cfg -f target/stm32g4x.cfg

start GDB in second terminal
 $ arm-none-eabi-gdb blink.elf

 (gdb) target extended-remote localhost:3333

Reset the device and halt it:
 (gdb) monitor reset halt
 target halted due to debug-request, current mode: Thread
 xPSR: 0x01000000 pc: 0x08000e14 msp: 0x20020000

Step by one instruction to re-read GDB register cache:
 (gdb) stepi

Check registers, LR should be 0xffffffff after reset:
 (gdb) info registers
 ...
 sp             0x20020000          0x20020000
 lr             0xffffffff          -1
 pc             0x8000e16           0x8000e16
 xPSR           0x1000000           16777216
 ...

 (gdb) set debug remote

Issue 'advance' command:
 (gdb) advance main
 [remote] Sending packet: $mfffffffe,2#fa
 [remote] Packet received: 0000
 [remote] Sending packet: $mfffffffe,2#fa
 [remote] Packet received: 0000
 [remote] Sending packet: $m8000526,2#30
 [remote] Packet received: 2046
 [remote] Sending packet: $Z1,8000526,2#7a
 [remote] Packet received: OK
 [remote] packet_ok: Packet Z1 (hardware-breakpoint) is supported
 [remote] Sending packet: $Z0,fffffffe,2#43
 [remote] Packet received: E0E
 [remote] packet_ok: Packet Z0 (software-breakpoint) is supported
 Warning:
 Cannot insert breakpoint 0.
 Cannot access memory at address 0xfffffffe

 Command aborted.
 (gdb)

Relevant messages from OpenOCD:
 Error: Failed to read memory at 0xfffff000
 Error: can't add breakpoint: unknown reason

This patch adds skipping over frames that are not suitable for
guarding with a breakpoint inspired by 'finish' command.
If no suitable frame is found, a momentary breakpoint is not set.

v2: Comment fixes, bug reference.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28683
Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
---
 gdb/breakpoint.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)
  

Comments

Luis Machado Oct. 27, 2022, 10:31 a.m. UTC | #1
Hi Tomas,

On 10/21/22 12:58, Tomas Vanek wrote:
> This patch partially depends on
> gdb/arm: Terminate frame unwinding in M-profile lockup state
> (without it lockup state is unwound as if it were a normal
> stack frame).
> 
> The commands 'advance' and 'until' try to set a breakpoint
> on the bogus return address derived from Arm M-profile magic
> address (actually EXC_RETURN or a PC value indicating lockup).
> 
> The offending breakpoint should be set at the return address in
> the caller. The magic value 0xffffffff in LR indicates
> there is no caller (return to this address would lock up the CPU).
> 
> Similar behaviour of 'advance' and 'until' is observed in
> an exception handler routine. In this case LR contains e.g.
> 0xfffffff1 (EXC_RETURN) and GDB tries to set a breakpoint at
> 0xfffffff0. It should use a return value stacked by the exception
> instead.
> 
> Testbench setup:
> STM32G474, a Cortex-M4 device. Any Cortex-M device can be used.
> A test application (an ordinary blink) with a standard startup
> is loaded to the device flash.
> 
> Steps to reproduce the problem:
> 
> start GDB server
>   $ openocd -f interface/cmsis-dap.cfg -f target/stm32g4x.cfg
> 
> start GDB in second terminal
>   $ arm-none-eabi-gdb blink.elf
> 
>   (gdb) target extended-remote localhost:3333
> 
> Reset the device and halt it:
>   (gdb) monitor reset halt
>   target halted due to debug-request, current mode: Thread
>   xPSR: 0x01000000 pc: 0x08000e14 msp: 0x20020000
> 
> Step by one instruction to re-read GDB register cache:
>   (gdb) stepi
> 
> Check registers, LR should be 0xffffffff after reset:
>   (gdb) info registers
>   ...
>   sp             0x20020000          0x20020000
>   lr             0xffffffff          -1
>   pc             0x8000e16           0x8000e16
>   xPSR           0x1000000           16777216
>   ...
> 
>   (gdb) set debug remote
> 
> Issue 'advance' command:
>   (gdb) advance main
>   [remote] Sending packet: $mfffffffe,2#fa
>   [remote] Packet received: 0000
>   [remote] Sending packet: $mfffffffe,2#fa
>   [remote] Packet received: 0000
>   [remote] Sending packet: $m8000526,2#30
>   [remote] Packet received: 2046
>   [remote] Sending packet: $Z1,8000526,2#7a
>   [remote] Packet received: OK
>   [remote] packet_ok: Packet Z1 (hardware-breakpoint) is supported
>   [remote] Sending packet: $Z0,fffffffe,2#43
>   [remote] Packet received: E0E
>   [remote] packet_ok: Packet Z0 (software-breakpoint) is supported
>   Warning:
>   Cannot insert breakpoint 0.
>   Cannot access memory at address 0xfffffffe
> 
>   Command aborted.
>   (gdb)
> 
> Relevant messages from OpenOCD:
>   Error: Failed to read memory at 0xfffff000
>   Error: can't add breakpoint: unknown reason
> 
> This patch adds skipping over frames that are not suitable for
> guarding with a breakpoint inspired by 'finish' command.
> If no suitable frame is found, a momentary breakpoint is not set.
> 
> v2: Comment fixes, bug reference.
> 
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28683
> Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
> ---
>   gdb/breakpoint.c | 22 +++++++++++++++++-----
>   1 file changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
> index f6591d4..bb85342 100644
> --- a/gdb/breakpoint.c
> +++ b/gdb/breakpoint.c
> @@ -10467,6 +10467,7 @@ enum async_reply_reason
>   until_break_command (const char *arg, int from_tty, int anywhere)
>   {
>     frame_info_ptr frame;
> +  frame_info_ptr caller_frame;
>     struct gdbarch *frame_gdbarch;
>     struct frame_id stack_frame_id;
>     struct frame_id caller_frame_id;
> @@ -10505,7 +10506,17 @@ enum async_reply_reason
>     frame = get_selected_frame (NULL);
>     frame_gdbarch = get_frame_arch (frame);
>     stack_frame_id = get_stack_frame_id (frame);
> -  caller_frame_id = frame_unwind_caller_id (frame);
> +
> +  caller_frame = get_prev_frame_always (frame);
> +
> +  while (caller_frame)
> +    {
> +      if (get_frame_type (caller_frame) != TAILCALL_FRAME
> +	  && gdbarch_code_of_frame_writable (get_frame_arch (caller_frame), caller_frame))
> +	break;
> +
> +      caller_frame = get_prev_frame_always (caller_frame);
> +    }
>   
>     /* Keep within the current frame, or in frames called by the current
>        one.  */
> @@ -10514,14 +10525,15 @@ enum async_reply_reason
>   
>     gdb::optional<delete_longjmp_breakpoint_cleanup> lj_deleter;
>   
> -  if (frame_id_p (caller_frame_id))
> +  if (caller_frame)
>       {
>         struct symtab_and_line sal2;
>         struct gdbarch *caller_gdbarch;
>   
> -      sal2 = find_pc_line (frame_unwind_caller_pc (frame), 0);
> -      sal2.pc = frame_unwind_caller_pc (frame);
> -      caller_gdbarch = frame_unwind_caller_arch (frame);
> +      sal2 = find_pc_line (get_frame_pc (caller_frame), 0);
> +      sal2.pc = get_frame_pc (caller_frame);
> +      caller_gdbarch = get_frame_arch (caller_frame);
> +      caller_frame_id = get_frame_id (caller_frame);
>   
>         breakpoint_up caller_breakpoint
>   	= set_momentary_breakpoint (caller_gdbarch, sal2,

My understanding is that these changes are meant to prevent both commands (until/advance) from
attempting to place a breakpoint in a caller that doesn't really exist, right?

The finish command, as you mentioned, seems to have a similar treatment in "skip_finish_frames".

Would it be possible to factor out that code into a common function that we can call to determine
if we have a valid caller whose PC we can breakpoint?
  
Tomas Vanek Oct. 27, 2022, 5:46 p.m. UTC | #2
Hi Luis,

On 27/10/2022 12:31, Luis Machado wrote:
> Hi Tomas,
>
> On 10/21/22 12:58, Tomas Vanek wrote:
>> This patch partially depends on
>> gdb/arm: Terminate frame unwinding in M-profile lockup state
>> (without it lockup state is unwound as if it were a normal
>> stack frame).
>>
>> The commands 'advance' and 'until' try to set a breakpoint
>> on the bogus return address derived from Arm M-profile magic
>> address (actually EXC_RETURN or a PC value indicating lockup).
>>
>> The offending breakpoint should be set at the return address in
>> the caller. The magic value 0xffffffff in LR indicates
>> there is no caller (return to this address would lock up the CPU).
>>
>> Similar behaviour of 'advance' and 'until' is observed in
>> an exception handler routine. In this case LR contains e.g.
>> 0xfffffff1 (EXC_RETURN) and GDB tries to set a breakpoint at
>> 0xfffffff0. It should use a return value stacked by the exception
>> instead.
>>
>> Testbench setup:
>> STM32G474, a Cortex-M4 device. Any Cortex-M device can be used.
>> A test application (an ordinary blink) with a standard startup
>> is loaded to the device flash.
>>
>> Steps to reproduce the problem:
>>
>> start GDB server
>>   $ openocd -f interface/cmsis-dap.cfg -f target/stm32g4x.cfg
>>
>> start GDB in second terminal
>>   $ arm-none-eabi-gdb blink.elf
>>
>>   (gdb) target extended-remote localhost:3333
>>
>> Reset the device and halt it:
>>   (gdb) monitor reset halt
>>   target halted due to debug-request, current mode: Thread
>>   xPSR: 0x01000000 pc: 0x08000e14 msp: 0x20020000
>>
>> Step by one instruction to re-read GDB register cache:
>>   (gdb) stepi
>>
>> Check registers, LR should be 0xffffffff after reset:
>>   (gdb) info registers
>>   ...
>>   sp             0x20020000          0x20020000
>>   lr             0xffffffff          -1
>>   pc             0x8000e16           0x8000e16
>>   xPSR           0x1000000           16777216
>>   ...
>>
>>   (gdb) set debug remote
>>
>> Issue 'advance' command:
>>   (gdb) advance main
>>   [remote] Sending packet: $mfffffffe,2#fa
>>   [remote] Packet received: 0000
>>   [remote] Sending packet: $mfffffffe,2#fa
>>   [remote] Packet received: 0000
>>   [remote] Sending packet: $m8000526,2#30
>>   [remote] Packet received: 2046
>>   [remote] Sending packet: $Z1,8000526,2#7a
>>   [remote] Packet received: OK
>>   [remote] packet_ok: Packet Z1 (hardware-breakpoint) is supported
>>   [remote] Sending packet: $Z0,fffffffe,2#43
>>   [remote] Packet received: E0E
>>   [remote] packet_ok: Packet Z0 (software-breakpoint) is supported
>>   Warning:
>>   Cannot insert breakpoint 0.
>>   Cannot access memory at address 0xfffffffe
>>
>>   Command aborted.
>>   (gdb)
>>
>> Relevant messages from OpenOCD:
>>   Error: Failed to read memory at 0xfffff000
>>   Error: can't add breakpoint: unknown reason
>>
>> This patch adds skipping over frames that are not suitable for
>> guarding with a breakpoint inspired by 'finish' command.
>> If no suitable frame is found, a momentary breakpoint is not set.
>>
>> v2: Comment fixes, bug reference.
>>
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28683
>> Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
>> ---
>>   gdb/breakpoint.c | 22 +++++++++++++++++-----
>>   1 file changed, 17 insertions(+), 5 deletions(-)
>>
>> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
>> index f6591d4..bb85342 100644
>> --- a/gdb/breakpoint.c
>> +++ b/gdb/breakpoint.c
>> @@ -10467,6 +10467,7 @@ enum async_reply_reason
>>   until_break_command (const char *arg, int from_tty, int anywhere)
>>   {
>>     frame_info_ptr frame;
>> +  frame_info_ptr caller_frame;
>>     struct gdbarch *frame_gdbarch;
>>     struct frame_id stack_frame_id;
>>     struct frame_id caller_frame_id;
>> @@ -10505,7 +10506,17 @@ enum async_reply_reason
>>     frame = get_selected_frame (NULL);
>>     frame_gdbarch = get_frame_arch (frame);
>>     stack_frame_id = get_stack_frame_id (frame);
>> -  caller_frame_id = frame_unwind_caller_id (frame);
>> +
>> +  caller_frame = get_prev_frame_always (frame);
>> +
>> +  while (caller_frame)
>> +    {
>> +      if (get_frame_type (caller_frame) != TAILCALL_FRAME
>> +      && gdbarch_code_of_frame_writable (get_frame_arch 
>> (caller_frame), caller_frame))
>> +    break;
>> +
>> +      caller_frame = get_prev_frame_always (caller_frame);
>> +    }
>>       /* Keep within the current frame, or in frames called by the 
>> current
>>        one.  */
>> @@ -10514,14 +10525,15 @@ enum async_reply_reason
>>       gdb::optional<delete_longjmp_breakpoint_cleanup> lj_deleter;
>>   -  if (frame_id_p (caller_frame_id))
>> +  if (caller_frame)
>>       {
>>         struct symtab_and_line sal2;
>>         struct gdbarch *caller_gdbarch;
>>   -      sal2 = find_pc_line (frame_unwind_caller_pc (frame), 0);
>> -      sal2.pc = frame_unwind_caller_pc (frame);
>> -      caller_gdbarch = frame_unwind_caller_arch (frame);
>> +      sal2 = find_pc_line (get_frame_pc (caller_frame), 0);
>> +      sal2.pc = get_frame_pc (caller_frame);
>> +      caller_gdbarch = get_frame_arch (caller_frame);
>> +      caller_frame_id = get_frame_id (caller_frame);
>>           breakpoint_up caller_breakpoint
>>       = set_momentary_breakpoint (caller_gdbarch, sal2,
>
> My understanding is that these changes are meant to prevent both 
> commands (until/advance) from
> attempting to place a breakpoint in a caller that doesn't really 
> exist, right?
>
Yes.

> The finish command, as you mentioned, seems to have a similar 
> treatment in "skip_finish_frames".
>
> Would it be possible to factor out that code into a common function 
> that we can call to determine
> if we have a valid caller whose PC we can breakpoint?

Of course it was also my original idea.

Unfortunately skip_finish_frames() uses skip_tailcall_frames() and 
skip_unwritable_frames()
which both call get_prev_frame(). get_prev_frame() stops if the current 
frame is main() or startup.
This is probably sufficient for the 'finish' command (until the user 
requests to 'finish' the main() function:
'finish' refuses to do so with message "finish not meaningful in the 
outermost frame").

'advance' places one breakpoint according to the user request.
The second one is set as a safety measure for the case the first one is 
not reached.
It is quite common to use 'advance' in the main() function and even to 
execute the startup code
and stop at the main() begin. IMO gdb should treat main() as any other 
function and place the safety
breakpoint at its return if possible.

That's why I use get_prev_frame_always() instead of get_prev_frame().
And this is also the reason why there is no simple and elegant way to 
factor out a common function for
both 'advance' and 'finish'.

Tomas
  
Tomas Vanek Nov. 22, 2022, 6:48 a.m. UTC | #3
On 27/10/2022 19:46, Tomas Vanek wrote:
> Hi Luis,
>
> On 27/10/2022 12:31, Luis Machado wrote:
>> Hi Tomas,
>>
>> On 10/21/22 12:58, Tomas Vanek wrote:
>>> This patch partially depends on
>>> gdb/arm: Terminate frame unwinding in M-profile lockup state
>>> (without it lockup state is unwound as if it were a normal
>>> stack frame).
>>>
>>> The commands 'advance' and 'until' try to set a breakpoint
>>> on the bogus return address derived from Arm M-profile magic
>>> address (actually EXC_RETURN or a PC value indicating lockup).
>>>
>>> The offending breakpoint should be set at the return address in
>>> the caller. The magic value 0xffffffff in LR indicates
>>> there is no caller (return to this address would lock up the CPU).
>>>
>>> Similar behaviour of 'advance' and 'until' is observed in
>>> an exception handler routine. In this case LR contains e.g.
>>> 0xfffffff1 (EXC_RETURN) and GDB tries to set a breakpoint at
>>> 0xfffffff0. It should use a return value stacked by the exception
>>> instead.
>>>
>>> Testbench setup:
>>> STM32G474, a Cortex-M4 device. Any Cortex-M device can be used.
>>> A test application (an ordinary blink) with a standard startup
>>> is loaded to the device flash.
>>>
>>> Steps to reproduce the problem:
>>>
>>> start GDB server
>>>   $ openocd -f interface/cmsis-dap.cfg -f target/stm32g4x.cfg
>>>
>>> start GDB in second terminal
>>>   $ arm-none-eabi-gdb blink.elf
>>>
>>>   (gdb) target extended-remote localhost:3333
>>>
>>> Reset the device and halt it:
>>>   (gdb) monitor reset halt
>>>   target halted due to debug-request, current mode: Thread
>>>   xPSR: 0x01000000 pc: 0x08000e14 msp: 0x20020000
>>>
>>> Step by one instruction to re-read GDB register cache:
>>>   (gdb) stepi
>>>
>>> Check registers, LR should be 0xffffffff after reset:
>>>   (gdb) info registers
>>>   ...
>>>   sp             0x20020000          0x20020000
>>>   lr             0xffffffff          -1
>>>   pc             0x8000e16           0x8000e16
>>>   xPSR           0x1000000           16777216
>>>   ...
>>>
>>>   (gdb) set debug remote
>>>
>>> Issue 'advance' command:
>>>   (gdb) advance main
>>>   [remote] Sending packet: $mfffffffe,2#fa
>>>   [remote] Packet received: 0000
>>>   [remote] Sending packet: $mfffffffe,2#fa
>>>   [remote] Packet received: 0000
>>>   [remote] Sending packet: $m8000526,2#30
>>>   [remote] Packet received: 2046
>>>   [remote] Sending packet: $Z1,8000526,2#7a
>>>   [remote] Packet received: OK
>>>   [remote] packet_ok: Packet Z1 (hardware-breakpoint) is supported
>>>   [remote] Sending packet: $Z0,fffffffe,2#43
>>>   [remote] Packet received: E0E
>>>   [remote] packet_ok: Packet Z0 (software-breakpoint) is supported
>>>   Warning:
>>>   Cannot insert breakpoint 0.
>>>   Cannot access memory at address 0xfffffffe
>>>
>>>   Command aborted.
>>>   (gdb)
>>>
>>> Relevant messages from OpenOCD:
>>>   Error: Failed to read memory at 0xfffff000
>>>   Error: can't add breakpoint: unknown reason
>>>
>>> This patch adds skipping over frames that are not suitable for
>>> guarding with a breakpoint inspired by 'finish' command.
>>> If no suitable frame is found, a momentary breakpoint is not set.
>>>
>>> v2: Comment fixes, bug reference.
>>>
>>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28683
>>> Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
>>> ---
>>>   gdb/breakpoint.c | 22 +++++++++++++++++-----
>>>   1 file changed, 17 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
>>> index f6591d4..bb85342 100644
>>> --- a/gdb/breakpoint.c
>>> +++ b/gdb/breakpoint.c
>>> @@ -10467,6 +10467,7 @@ enum async_reply_reason
>>>   until_break_command (const char *arg, int from_tty, int anywhere)
>>>   {
>>>     frame_info_ptr frame;
>>> +  frame_info_ptr caller_frame;
>>>     struct gdbarch *frame_gdbarch;
>>>     struct frame_id stack_frame_id;
>>>     struct frame_id caller_frame_id;
>>> @@ -10505,7 +10506,17 @@ enum async_reply_reason
>>>     frame = get_selected_frame (NULL);
>>>     frame_gdbarch = get_frame_arch (frame);
>>>     stack_frame_id = get_stack_frame_id (frame);
>>> -  caller_frame_id = frame_unwind_caller_id (frame);
>>> +
>>> +  caller_frame = get_prev_frame_always (frame);
>>> +
>>> +  while (caller_frame)
>>> +    {
>>> +      if (get_frame_type (caller_frame) != TAILCALL_FRAME
>>> +      && gdbarch_code_of_frame_writable (get_frame_arch 
>>> (caller_frame), caller_frame))
>>> +    break;
>>> +
>>> +      caller_frame = get_prev_frame_always (caller_frame);
>>> +    }
>>>       /* Keep within the current frame, or in frames called by the 
>>> current
>>>        one.  */
>>> @@ -10514,14 +10525,15 @@ enum async_reply_reason
>>>       gdb::optional<delete_longjmp_breakpoint_cleanup> lj_deleter;
>>>   -  if (frame_id_p (caller_frame_id))
>>> +  if (caller_frame)
>>>       {
>>>         struct symtab_and_line sal2;
>>>         struct gdbarch *caller_gdbarch;
>>>   -      sal2 = find_pc_line (frame_unwind_caller_pc (frame), 0);
>>> -      sal2.pc = frame_unwind_caller_pc (frame);
>>> -      caller_gdbarch = frame_unwind_caller_arch (frame);
>>> +      sal2 = find_pc_line (get_frame_pc (caller_frame), 0);
>>> +      sal2.pc = get_frame_pc (caller_frame);
>>> +      caller_gdbarch = get_frame_arch (caller_frame);
>>> +      caller_frame_id = get_frame_id (caller_frame);
>>>           breakpoint_up caller_breakpoint
>>>       = set_momentary_breakpoint (caller_gdbarch, sal2,
>>
>> My understanding is that these changes are meant to prevent both 
>> commands (until/advance) from
>> attempting to place a breakpoint in a caller that doesn't really 
>> exist, right?
>>
> Yes.
>
>> The finish command, as you mentioned, seems to have a similar 
>> treatment in "skip_finish_frames".
>>
>> Would it be possible to factor out that code into a common function 
>> that we can call to determine
>> if we have a valid caller whose PC we can breakpoint?
>
> Of course it was also my original idea.
>
> Unfortunately skip_finish_frames() uses skip_tailcall_frames() and 
> skip_unwritable_frames()
> which both call get_prev_frame(). get_prev_frame() stops if the 
> current frame is main() or startup.
> This is probably sufficient for the 'finish' command (until the user 
> requests to 'finish' the main() function:
> 'finish' refuses to do so with message "finish not meaningful in the 
> outermost frame").
>
> 'advance' places one breakpoint according to the user request.
> The second one is set as a safety measure for the case the first one 
> is not reached.
> It is quite common to use 'advance' in the main() function and even to 
> execute the startup code
> and stop at the main() begin. IMO gdb should treat main() as any other 
> function and place the safety
> breakpoint at its return if possible.
>
> That's why I use get_prev_frame_always() instead of get_prev_frame().
> And this is also the reason why there is no simple and elegant way to 
> factor out a common function for
> both 'advance' and 'finish'.
>
> Tomas

Luis,

do you have more comments on this patch?
Or do we need it reviewed by somebody else?

Tomas
  
Luis Machado Nov. 22, 2022, 7:27 a.m. UTC | #4
Hi Tomas,

On 11/22/22 06:48, Tomas Vanek wrote:
> On 27/10/2022 19:46, Tomas Vanek wrote:
>> Hi Luis,
>>
>> On 27/10/2022 12:31, Luis Machado wrote:
>>> Hi Tomas,
>>>
>>> On 10/21/22 12:58, Tomas Vanek wrote:
>>>> This patch partially depends on
>>>> gdb/arm: Terminate frame unwinding in M-profile lockup state
>>>> (without it lockup state is unwound as if it were a normal
>>>> stack frame).
>>>>
>>>> The commands 'advance' and 'until' try to set a breakpoint
>>>> on the bogus return address derived from Arm M-profile magic
>>>> address (actually EXC_RETURN or a PC value indicating lockup).
>>>>
>>>> The offending breakpoint should be set at the return address in
>>>> the caller. The magic value 0xffffffff in LR indicates
>>>> there is no caller (return to this address would lock up the CPU).
>>>>
>>>> Similar behaviour of 'advance' and 'until' is observed in
>>>> an exception handler routine. In this case LR contains e.g.
>>>> 0xfffffff1 (EXC_RETURN) and GDB tries to set a breakpoint at
>>>> 0xfffffff0. It should use a return value stacked by the exception
>>>> instead.
>>>>
>>>> Testbench setup:
>>>> STM32G474, a Cortex-M4 device. Any Cortex-M device can be used.
>>>> A test application (an ordinary blink) with a standard startup
>>>> is loaded to the device flash.
>>>>
>>>> Steps to reproduce the problem:
>>>>
>>>> start GDB server
>>>>   $ openocd -f interface/cmsis-dap.cfg -f target/stm32g4x.cfg
>>>>
>>>> start GDB in second terminal
>>>>   $ arm-none-eabi-gdb blink.elf
>>>>
>>>>   (gdb) target extended-remote localhost:3333
>>>>
>>>> Reset the device and halt it:
>>>>   (gdb) monitor reset halt
>>>>   target halted due to debug-request, current mode: Thread
>>>>   xPSR: 0x01000000 pc: 0x08000e14 msp: 0x20020000
>>>>
>>>> Step by one instruction to re-read GDB register cache:
>>>>   (gdb) stepi
>>>>
>>>> Check registers, LR should be 0xffffffff after reset:
>>>>   (gdb) info registers
>>>>   ...
>>>>   sp             0x20020000          0x20020000
>>>>   lr             0xffffffff          -1
>>>>   pc             0x8000e16           0x8000e16
>>>>   xPSR           0x1000000           16777216
>>>>   ...
>>>>
>>>>   (gdb) set debug remote
>>>>
>>>> Issue 'advance' command:
>>>>   (gdb) advance main
>>>>   [remote] Sending packet: $mfffffffe,2#fa
>>>>   [remote] Packet received: 0000
>>>>   [remote] Sending packet: $mfffffffe,2#fa
>>>>   [remote] Packet received: 0000
>>>>   [remote] Sending packet: $m8000526,2#30
>>>>   [remote] Packet received: 2046
>>>>   [remote] Sending packet: $Z1,8000526,2#7a
>>>>   [remote] Packet received: OK
>>>>   [remote] packet_ok: Packet Z1 (hardware-breakpoint) is supported
>>>>   [remote] Sending packet: $Z0,fffffffe,2#43
>>>>   [remote] Packet received: E0E
>>>>   [remote] packet_ok: Packet Z0 (software-breakpoint) is supported
>>>>   Warning:
>>>>   Cannot insert breakpoint 0.
>>>>   Cannot access memory at address 0xfffffffe
>>>>
>>>>   Command aborted.
>>>>   (gdb)
>>>>
>>>> Relevant messages from OpenOCD:
>>>>   Error: Failed to read memory at 0xfffff000
>>>>   Error: can't add breakpoint: unknown reason
>>>>
>>>> This patch adds skipping over frames that are not suitable for
>>>> guarding with a breakpoint inspired by 'finish' command.
>>>> If no suitable frame is found, a momentary breakpoint is not set.
>>>>
>>>> v2: Comment fixes, bug reference.
>>>>
>>>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28683
>>>> Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
>>>> ---
>>>>   gdb/breakpoint.c | 22 +++++++++++++++++-----
>>>>   1 file changed, 17 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
>>>> index f6591d4..bb85342 100644
>>>> --- a/gdb/breakpoint.c
>>>> +++ b/gdb/breakpoint.c
>>>> @@ -10467,6 +10467,7 @@ enum async_reply_reason
>>>>   until_break_command (const char *arg, int from_tty, int anywhere)
>>>>   {
>>>>     frame_info_ptr frame;
>>>> +  frame_info_ptr caller_frame;
>>>>     struct gdbarch *frame_gdbarch;
>>>>     struct frame_id stack_frame_id;
>>>>     struct frame_id caller_frame_id;
>>>> @@ -10505,7 +10506,17 @@ enum async_reply_reason
>>>>     frame = get_selected_frame (NULL);
>>>>     frame_gdbarch = get_frame_arch (frame);
>>>>     stack_frame_id = get_stack_frame_id (frame);
>>>> -  caller_frame_id = frame_unwind_caller_id (frame);
>>>> +
>>>> +  caller_frame = get_prev_frame_always (frame);
>>>> +
>>>> +  while (caller_frame)
>>>> +    {
>>>> +      if (get_frame_type (caller_frame) != TAILCALL_FRAME
>>>> +      && gdbarch_code_of_frame_writable (get_frame_arch (caller_frame), caller_frame))
>>>> +    break;
>>>> +
>>>> +      caller_frame = get_prev_frame_always (caller_frame);
>>>> +    }
>>>>       /* Keep within the current frame, or in frames called by the current
>>>>        one.  */
>>>> @@ -10514,14 +10525,15 @@ enum async_reply_reason
>>>>       gdb::optional<delete_longjmp_breakpoint_cleanup> lj_deleter;
>>>>   -  if (frame_id_p (caller_frame_id))
>>>> +  if (caller_frame)
>>>>       {
>>>>         struct symtab_and_line sal2;
>>>>         struct gdbarch *caller_gdbarch;
>>>>   -      sal2 = find_pc_line (frame_unwind_caller_pc (frame), 0);
>>>> -      sal2.pc = frame_unwind_caller_pc (frame);
>>>> -      caller_gdbarch = frame_unwind_caller_arch (frame);
>>>> +      sal2 = find_pc_line (get_frame_pc (caller_frame), 0);
>>>> +      sal2.pc = get_frame_pc (caller_frame);
>>>> +      caller_gdbarch = get_frame_arch (caller_frame);
>>>> +      caller_frame_id = get_frame_id (caller_frame);
>>>>           breakpoint_up caller_breakpoint
>>>>       = set_momentary_breakpoint (caller_gdbarch, sal2,
>>>
>>> My understanding is that these changes are meant to prevent both commands (until/advance) from
>>> attempting to place a breakpoint in a caller that doesn't really exist, right?
>>>
>> Yes.
>>
>>> The finish command, as you mentioned, seems to have a similar treatment in "skip_finish_frames".
>>>
>>> Would it be possible to factor out that code into a common function that we can call to determine
>>> if we have a valid caller whose PC we can breakpoint?
>>
>> Of course it was also my original idea.
>>
>> Unfortunately skip_finish_frames() uses skip_tailcall_frames() and skip_unwritable_frames()
>> which both call get_prev_frame(). get_prev_frame() stops if the current frame is main() or startup.
>> This is probably sufficient for the 'finish' command (until the user requests to 'finish' the main() function:
>> 'finish' refuses to do so with message "finish not meaningful in the outermost frame").
>>
>> 'advance' places one breakpoint according to the user request.
>> The second one is set as a safety measure for the case the first one is not reached.
>> It is quite common to use 'advance' in the main() function and even to execute the startup code
>> and stop at the main() begin. IMO gdb should treat main() as any other function and place the safety
>> breakpoint at its return if possible.
>>
>> That's why I use get_prev_frame_always() instead of get_prev_frame().
>> And this is also the reason why there is no simple and elegant way to factor out a common function for
>> both 'advance' and 'finish'.
>>
>> Tomas
> 
> Luis,
> 
> do you have more comments on this patch?

I think the patch makes sense to me, according to what you explained.

> Or do we need it reviewed by somebody else?

Yes, global maintainers need to approve the generic parts of patches. It would be nice to get some feedback from others.

> 
> Tomas
  
Tomas Vanek Nov. 28, 2022, 11:48 a.m. UTC | #5
On 21/10/2022 13:58, Tomas Vanek wrote:
> This patch partially depends on
> gdb/arm: Terminate frame unwinding in M-profile lockup state
> (without it lockup state is unwound as if it were a normal
> stack frame).
>
> The commands 'advance' and 'until' try to set a breakpoint
> on the bogus return address derived from Arm M-profile magic
> address (actually EXC_RETURN or a PC value indicating lockup).
>
> The offending breakpoint should be set at the return address in
> the caller. The magic value 0xffffffff in LR indicates
> there is no caller (return to this address would lock up the CPU).
>
> Similar behaviour of 'advance' and 'until' is observed in
> an exception handler routine. In this case LR contains e.g.
> 0xfffffff1 (EXC_RETURN) and GDB tries to set a breakpoint at
> 0xfffffff0. It should use a return value stacked by the exception
> instead.
>
> Testbench setup:
> STM32G474, a Cortex-M4 device. Any Cortex-M device can be used.
> A test application (an ordinary blink) with a standard startup
> is loaded to the device flash.
>
> Steps to reproduce the problem:
>
> start GDB server
>   $ openocd -f interface/cmsis-dap.cfg -f target/stm32g4x.cfg
>
> start GDB in second terminal
>   $ arm-none-eabi-gdb blink.elf
>
>   (gdb) target extended-remote localhost:3333
>
> Reset the device and halt it:
>   (gdb) monitor reset halt
>   target halted due to debug-request, current mode: Thread
>   xPSR: 0x01000000 pc: 0x08000e14 msp: 0x20020000
>
> Step by one instruction to re-read GDB register cache:
>   (gdb) stepi
>
> Check registers, LR should be 0xffffffff after reset:
>   (gdb) info registers
>   ...
>   sp             0x20020000          0x20020000
>   lr             0xffffffff          -1
>   pc             0x8000e16           0x8000e16
>   xPSR           0x1000000           16777216
>   ...
>
>   (gdb) set debug remote
>
> Issue 'advance' command:
>   (gdb) advance main
>   [remote] Sending packet: $mfffffffe,2#fa
>   [remote] Packet received: 0000
>   [remote] Sending packet: $mfffffffe,2#fa
>   [remote] Packet received: 0000
>   [remote] Sending packet: $m8000526,2#30
>   [remote] Packet received: 2046
>   [remote] Sending packet: $Z1,8000526,2#7a
>   [remote] Packet received: OK
>   [remote] packet_ok: Packet Z1 (hardware-breakpoint) is supported
>   [remote] Sending packet: $Z0,fffffffe,2#43
>   [remote] Packet received: E0E
>   [remote] packet_ok: Packet Z0 (software-breakpoint) is supported
>   Warning:
>   Cannot insert breakpoint 0.
>   Cannot access memory at address 0xfffffffe
>
>   Command aborted.
>   (gdb)
>
> Relevant messages from OpenOCD:
>   Error: Failed to read memory at 0xfffff000
>   Error: can't add breakpoint: unknown reason
>
> This patch adds skipping over frames that are not suitable for
> guarding with a breakpoint inspired by 'finish' command.
> If no suitable frame is found, a momentary breakpoint is not set.
>
> v2: Comment fixes, bug reference.
>
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28683
> Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
> ---
>   gdb/breakpoint.c | 22 +++++++++++++++++-----
>   1 file changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
> index f6591d4..bb85342 100644
> --- a/gdb/breakpoint.c
> +++ b/gdb/breakpoint.c
> @@ -10467,6 +10467,7 @@ enum async_reply_reason
>   until_break_command (const char *arg, int from_tty, int anywhere)
>   {
>     frame_info_ptr frame;
> +  frame_info_ptr caller_frame;
>     struct gdbarch *frame_gdbarch;
>     struct frame_id stack_frame_id;
>     struct frame_id caller_frame_id;
> @@ -10505,7 +10506,17 @@ enum async_reply_reason
>     frame = get_selected_frame (NULL);
>     frame_gdbarch = get_frame_arch (frame);
>     stack_frame_id = get_stack_frame_id (frame);
> -  caller_frame_id = frame_unwind_caller_id (frame);
> +
> +  caller_frame = get_prev_frame_always (frame);
> +
> +  while (caller_frame)
> +    {
> +      if (get_frame_type (caller_frame) != TAILCALL_FRAME
> +	  && gdbarch_code_of_frame_writable (get_frame_arch (caller_frame), caller_frame))
> +	break;
> +
> +      caller_frame = get_prev_frame_always (caller_frame);
> +    }
>   
>     /* Keep within the current frame, or in frames called by the current
>        one.  */
> @@ -10514,14 +10525,15 @@ enum async_reply_reason
>   
>     gdb::optional<delete_longjmp_breakpoint_cleanup> lj_deleter;
>   
> -  if (frame_id_p (caller_frame_id))
> +  if (caller_frame)
>       {
>         struct symtab_and_line sal2;
>         struct gdbarch *caller_gdbarch;
>   
> -      sal2 = find_pc_line (frame_unwind_caller_pc (frame), 0);
> -      sal2.pc = frame_unwind_caller_pc (frame);
> -      caller_gdbarch = frame_unwind_caller_arch (frame);
> +      sal2 = find_pc_line (get_frame_pc (caller_frame), 0);
> +      sal2.pc = get_frame_pc (caller_frame);
> +      caller_gdbarch = get_frame_arch (caller_frame);
> +      caller_frame_id = get_frame_id (caller_frame);
>   
>         breakpoint_up caller_breakpoint
>   	= set_momentary_breakpoint (caller_gdbarch, sal2,
  
Luis Machado Dec. 8, 2022, 1:15 a.m. UTC | #6
On 11/28/22 11:48, Tomas Vanek via Gdb-patches wrote:
> On 21/10/2022 13:58, Tomas Vanek wrote:
>> This patch partially depends on
>> gdb/arm: Terminate frame unwinding in M-profile lockup state
>> (without it lockup state is unwound as if it were a normal
>> stack frame).
>>
>> The commands 'advance' and 'until' try to set a breakpoint
>> on the bogus return address derived from Arm M-profile magic
>> address (actually EXC_RETURN or a PC value indicating lockup).
>>
>> The offending breakpoint should be set at the return address in
>> the caller. The magic value 0xffffffff in LR indicates
>> there is no caller (return to this address would lock up the CPU).
>>
>> Similar behaviour of 'advance' and 'until' is observed in
>> an exception handler routine. In this case LR contains e.g.
>> 0xfffffff1 (EXC_RETURN) and GDB tries to set a breakpoint at
>> 0xfffffff0. It should use a return value stacked by the exception
>> instead.
>>
>> Testbench setup:
>> STM32G474, a Cortex-M4 device. Any Cortex-M device can be used.
>> A test application (an ordinary blink) with a standard startup
>> is loaded to the device flash.
>>
>> Steps to reproduce the problem:
>>
>> start GDB server
>>   $ openocd -f interface/cmsis-dap.cfg -f target/stm32g4x.cfg
>>
>> start GDB in second terminal
>>   $ arm-none-eabi-gdb blink.elf
>>
>>   (gdb) target extended-remote localhost:3333
>>
>> Reset the device and halt it:
>>   (gdb) monitor reset halt
>>   target halted due to debug-request, current mode: Thread
>>   xPSR: 0x01000000 pc: 0x08000e14 msp: 0x20020000
>>
>> Step by one instruction to re-read GDB register cache:
>>   (gdb) stepi
>>
>> Check registers, LR should be 0xffffffff after reset:
>>   (gdb) info registers
>>   ...
>>   sp             0x20020000          0x20020000
>>   lr             0xffffffff          -1
>>   pc             0x8000e16           0x8000e16
>>   xPSR           0x1000000           16777216
>>   ...
>>
>>   (gdb) set debug remote
>>
>> Issue 'advance' command:
>>   (gdb) advance main
>>   [remote] Sending packet: $mfffffffe,2#fa
>>   [remote] Packet received: 0000
>>   [remote] Sending packet: $mfffffffe,2#fa
>>   [remote] Packet received: 0000
>>   [remote] Sending packet: $m8000526,2#30
>>   [remote] Packet received: 2046
>>   [remote] Sending packet: $Z1,8000526,2#7a
>>   [remote] Packet received: OK
>>   [remote] packet_ok: Packet Z1 (hardware-breakpoint) is supported
>>   [remote] Sending packet: $Z0,fffffffe,2#43
>>   [remote] Packet received: E0E
>>   [remote] packet_ok: Packet Z0 (software-breakpoint) is supported
>>   Warning:
>>   Cannot insert breakpoint 0.
>>   Cannot access memory at address 0xfffffffe
>>
>>   Command aborted.
>>   (gdb)
>>
>> Relevant messages from OpenOCD:
>>   Error: Failed to read memory at 0xfffff000
>>   Error: can't add breakpoint: unknown reason
>>
>> This patch adds skipping over frames that are not suitable for
>> guarding with a breakpoint inspired by 'finish' command.
>> If no suitable frame is found, a momentary breakpoint is not set.
>>
>> v2: Comment fixes, bug reference.
>>
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28683
>> Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
>> ---
>>   gdb/breakpoint.c | 22 +++++++++++++++++-----
>>   1 file changed, 17 insertions(+), 5 deletions(-)
>>
>> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
>> index f6591d4..bb85342 100644
>> --- a/gdb/breakpoint.c
>> +++ b/gdb/breakpoint.c
>> @@ -10467,6 +10467,7 @@ enum async_reply_reason
>>   until_break_command (const char *arg, int from_tty, int anywhere)
>>   {
>>     frame_info_ptr frame;
>> +  frame_info_ptr caller_frame;
>>     struct gdbarch *frame_gdbarch;
>>     struct frame_id stack_frame_id;
>>     struct frame_id caller_frame_id;
>> @@ -10505,7 +10506,17 @@ enum async_reply_reason
>>     frame = get_selected_frame (NULL);
>>     frame_gdbarch = get_frame_arch (frame);
>>     stack_frame_id = get_stack_frame_id (frame);
>> -  caller_frame_id = frame_unwind_caller_id (frame);
>> +
>> +  caller_frame = get_prev_frame_always (frame);
>> +
>> +  while (caller_frame)
>> +    {
>> +      if (get_frame_type (caller_frame) != TAILCALL_FRAME
>> +      && gdbarch_code_of_frame_writable (get_frame_arch (caller_frame), caller_frame))
>> +    break;
>> +
>> +      caller_frame = get_prev_frame_always (caller_frame);
>> +    }
>>     /* Keep within the current frame, or in frames called by the current
>>        one.  */
>> @@ -10514,14 +10525,15 @@ enum async_reply_reason
>>     gdb::optional<delete_longjmp_breakpoint_cleanup> lj_deleter;
>> -  if (frame_id_p (caller_frame_id))
>> +  if (caller_frame)
>>       {
>>         struct symtab_and_line sal2;
>>         struct gdbarch *caller_gdbarch;
>> -      sal2 = find_pc_line (frame_unwind_caller_pc (frame), 0);
>> -      sal2.pc = frame_unwind_caller_pc (frame);
>> -      caller_gdbarch = frame_unwind_caller_arch (frame);
>> +      sal2 = find_pc_line (get_frame_pc (caller_frame), 0);
>> +      sal2.pc = get_frame_pc (caller_frame);
>> +      caller_gdbarch = get_frame_arch (caller_frame);
>> +      caller_frame_id = get_frame_id (caller_frame);
>>         breakpoint_up caller_breakpoint
>>       = set_momentary_breakpoint (caller_gdbarch, sal2,
> 
Would be nice to have some feedback on the generic parts of the above patch. It seems like a useful fix to have.
  
Tomas Vanek Dec. 21, 2022, 8:52 a.m. UTC | #7
On 08/12/2022 02:15, Luis Machado wrote:
> On 11/28/22 11:48, Tomas Vanek via Gdb-patches wrote:
>> On 21/10/2022 13:58, Tomas Vanek wrote:
>>> This patch partially depends on
>>> gdb/arm: Terminate frame unwinding in M-profile lockup state
>>> (without it lockup state is unwound as if it were a normal
>>> stack frame).
>>>
>>> The commands 'advance' and 'until' try to set a breakpoint
>>> on the bogus return address derived from Arm M-profile magic
>>> address (actually EXC_RETURN or a PC value indicating lockup).
>>>
>>> The offending breakpoint should be set at the return address in
>>> the caller. The magic value 0xffffffff in LR indicates
>>> there is no caller (return to this address would lock up the CPU).
>>>
>>> Similar behaviour of 'advance' and 'until' is observed in
>>> an exception handler routine. In this case LR contains e.g.
>>> 0xfffffff1 (EXC_RETURN) and GDB tries to set a breakpoint at
>>> 0xfffffff0. It should use a return value stacked by the exception
>>> instead.
>>>
>>> Testbench setup:
>>> STM32G474, a Cortex-M4 device. Any Cortex-M device can be used.
>>> A test application (an ordinary blink) with a standard startup
>>> is loaded to the device flash.
>>>
>>> Steps to reproduce the problem:
>>>
>>> start GDB server
>>>   $ openocd -f interface/cmsis-dap.cfg -f target/stm32g4x.cfg
>>>
>>> start GDB in second terminal
>>>   $ arm-none-eabi-gdb blink.elf
>>>
>>>   (gdb) target extended-remote localhost:3333
>>>
>>> Reset the device and halt it:
>>>   (gdb) monitor reset halt
>>>   target halted due to debug-request, current mode: Thread
>>>   xPSR: 0x01000000 pc: 0x08000e14 msp: 0x20020000
>>>
>>> Step by one instruction to re-read GDB register cache:
>>>   (gdb) stepi
>>>
>>> Check registers, LR should be 0xffffffff after reset:
>>>   (gdb) info registers
>>>   ...
>>>   sp             0x20020000          0x20020000
>>>   lr             0xffffffff          -1
>>>   pc             0x8000e16           0x8000e16
>>>   xPSR           0x1000000           16777216
>>>   ...
>>>
>>>   (gdb) set debug remote
>>>
>>> Issue 'advance' command:
>>>   (gdb) advance main
>>>   [remote] Sending packet: $mfffffffe,2#fa
>>>   [remote] Packet received: 0000
>>>   [remote] Sending packet: $mfffffffe,2#fa
>>>   [remote] Packet received: 0000
>>>   [remote] Sending packet: $m8000526,2#30
>>>   [remote] Packet received: 2046
>>>   [remote] Sending packet: $Z1,8000526,2#7a
>>>   [remote] Packet received: OK
>>>   [remote] packet_ok: Packet Z1 (hardware-breakpoint) is supported
>>>   [remote] Sending packet: $Z0,fffffffe,2#43
>>>   [remote] Packet received: E0E
>>>   [remote] packet_ok: Packet Z0 (software-breakpoint) is supported
>>>   Warning:
>>>   Cannot insert breakpoint 0.
>>>   Cannot access memory at address 0xfffffffe
>>>
>>>   Command aborted.
>>>   (gdb)
>>>
>>> Relevant messages from OpenOCD:
>>>   Error: Failed to read memory at 0xfffff000
>>>   Error: can't add breakpoint: unknown reason
>>>
>>> This patch adds skipping over frames that are not suitable for
>>> guarding with a breakpoint inspired by 'finish' command.
>>> If no suitable frame is found, a momentary breakpoint is not set.
>>>
>>> v2: Comment fixes, bug reference.
>>>
>>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28683
>>> Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
>>> ---
>>>   gdb/breakpoint.c | 22 +++++++++++++++++-----
>>>   1 file changed, 17 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
>>> index f6591d4..bb85342 100644
>>> --- a/gdb/breakpoint.c
>>> +++ b/gdb/breakpoint.c
>>> @@ -10467,6 +10467,7 @@ enum async_reply_reason
>>>   until_break_command (const char *arg, int from_tty, int anywhere)
>>>   {
>>>     frame_info_ptr frame;
>>> +  frame_info_ptr caller_frame;
>>>     struct gdbarch *frame_gdbarch;
>>>     struct frame_id stack_frame_id;
>>>     struct frame_id caller_frame_id;
>>> @@ -10505,7 +10506,17 @@ enum async_reply_reason
>>>     frame = get_selected_frame (NULL);
>>>     frame_gdbarch = get_frame_arch (frame);
>>>     stack_frame_id = get_stack_frame_id (frame);
>>> -  caller_frame_id = frame_unwind_caller_id (frame);
>>> +
>>> +  caller_frame = get_prev_frame_always (frame);
>>> +
>>> +  while (caller_frame)
>>> +    {
>>> +      if (get_frame_type (caller_frame) != TAILCALL_FRAME
>>> +      && gdbarch_code_of_frame_writable (get_frame_arch 
>>> (caller_frame), caller_frame))
>>> +    break;
>>> +
>>> +      caller_frame = get_prev_frame_always (caller_frame);
>>> +    }
>>>     /* Keep within the current frame, or in frames called by the 
>>> current
>>>        one.  */
>>> @@ -10514,14 +10525,15 @@ enum async_reply_reason
>>>     gdb::optional<delete_longjmp_breakpoint_cleanup> lj_deleter;
>>> -  if (frame_id_p (caller_frame_id))
>>> +  if (caller_frame)
>>>       {
>>>         struct symtab_and_line sal2;
>>>         struct gdbarch *caller_gdbarch;
>>> -      sal2 = find_pc_line (frame_unwind_caller_pc (frame), 0);
>>> -      sal2.pc = frame_unwind_caller_pc (frame);
>>> -      caller_gdbarch = frame_unwind_caller_arch (frame);
>>> +      sal2 = find_pc_line (get_frame_pc (caller_frame), 0);
>>> +      sal2.pc = get_frame_pc (caller_frame);
>>> +      caller_gdbarch = get_frame_arch (caller_frame);
>>> +      caller_frame_id = get_frame_id (caller_frame);
>>>         breakpoint_up caller_breakpoint
>>>       = set_momentary_breakpoint (caller_gdbarch, sal2,
>>
> Would be nice to have some feedback on the generic parts of the above 
> patch. It seems like a useful fix to have.
  
Tomas Vanek Jan. 10, 2023, 1:19 p.m. UTC | #8
On 08/12/2022 02:15, Luis Machado wrote:
> On 11/28/22 11:48, Tomas Vanek via Gdb-patches wrote:
>> On 21/10/2022 13:58, Tomas Vanek wrote:
>>>
>>> The commands 'advance' and 'until' try to set a breakpoint
>>> on the bogus return address derived from Arm M-profile magic
>>> address (actually EXC_RETURN or a PC value indicating lockup).
>>>
>>> The offending breakpoint should be set at the return address in
>>> the caller. The magic value 0xffffffff in LR indicates
>>> there is no caller (return to this address would lock up the CPU).
>>>
>>> Similar behaviour of 'advance' and 'until' is observed in
>>> an exception handler routine. In this case LR contains e.g.
>>> 0xfffffff1 (EXC_RETURN) and GDB tries to set a breakpoint at
>>> 0xfffffff0. It should use a return value stacked by the exception
>>> instead.
>>>
>>> Testbench setup:
>>> STM32G474, a Cortex-M4 device. Any Cortex-M device can be used.
>>> A test application (an ordinary blink) with a standard startup
>>> is loaded to the device flash.
>>>
>>> Steps to reproduce the problem:
>>>
>>> start GDB server
>>>   $ openocd -f interface/cmsis-dap.cfg -f target/stm32g4x.cfg
>>>
>>> start GDB in second terminal
>>>   $ arm-none-eabi-gdb blink.elf
>>>
>>>   (gdb) target extended-remote localhost:3333
>>>
>>> Reset the device and halt it:
>>>   (gdb) monitor reset halt
>>>   target halted due to debug-request, current mode: Thread
>>>   xPSR: 0x01000000 pc: 0x08000e14 msp: 0x20020000
>>>
>>> Step by one instruction to re-read GDB register cache:
>>>   (gdb) stepi
>>>
>>> Check registers, LR should be 0xffffffff after reset:
>>>   (gdb) info registers
>>>   ...
>>>   sp             0x20020000          0x20020000
>>>   lr             0xffffffff          -1
>>>   pc             0x8000e16           0x8000e16
>>>   xPSR           0x1000000           16777216
>>>   ...
>>>
>>>   (gdb) set debug remote
>>>
>>> Issue 'advance' command:
>>>   (gdb) advance main
>>>   [remote] Sending packet: $mfffffffe,2#fa
>>>   [remote] Packet received: 0000
>>>   [remote] Sending packet: $mfffffffe,2#fa
>>>   [remote] Packet received: 0000
>>>   [remote] Sending packet: $m8000526,2#30
>>>   [remote] Packet received: 2046
>>>   [remote] Sending packet: $Z1,8000526,2#7a
>>>   [remote] Packet received: OK
>>>   [remote] packet_ok: Packet Z1 (hardware-breakpoint) is supported
>>>   [remote] Sending packet: $Z0,fffffffe,2#43
>>>   [remote] Packet received: E0E
>>>   [remote] packet_ok: Packet Z0 (software-breakpoint) is supported
>>>   Warning:
>>>   Cannot insert breakpoint 0.
>>>   Cannot access memory at address 0xfffffffe
>>>
>>>   Command aborted.
>>>   (gdb)
>>>
>>> Relevant messages from OpenOCD:
>>>   Error: Failed to read memory at 0xfffff000
>>>   Error: can't add breakpoint: unknown reason
>>>
>>> This patch adds skipping over frames that are not suitable for
>>> guarding with a breakpoint inspired by 'finish' command.
>>> If no suitable frame is found, a momentary breakpoint is not set.
>>>
>>> v2: Comment fixes, bug reference.
>>>
>>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28683
>>> Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
>>> ---
>>>   gdb/breakpoint.c | 22 +++++++++++++++++-----
>>>   1 file changed, 17 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
>>> index f6591d4..bb85342 100644
>>> --- a/gdb/breakpoint.c
>>> +++ b/gdb/breakpoint.c
>>> @@ -10467,6 +10467,7 @@ enum async_reply_reason
>>>   until_break_command (const char *arg, int from_tty, int anywhere)
>>>   {
>>>     frame_info_ptr frame;
>>> +  frame_info_ptr caller_frame;
>>>     struct gdbarch *frame_gdbarch;
>>>     struct frame_id stack_frame_id;
>>>     struct frame_id caller_frame_id;
>>> @@ -10505,7 +10506,17 @@ enum async_reply_reason
>>>     frame = get_selected_frame (NULL);
>>>     frame_gdbarch = get_frame_arch (frame);
>>>     stack_frame_id = get_stack_frame_id (frame);
>>> -  caller_frame_id = frame_unwind_caller_id (frame);
>>> +
>>> +  caller_frame = get_prev_frame_always (frame);
>>> +
>>> +  while (caller_frame)
>>> +    {
>>> +      if (get_frame_type (caller_frame) != TAILCALL_FRAME
>>> +      && gdbarch_code_of_frame_writable (get_frame_arch 
>>> (caller_frame), caller_frame))
>>> +    break;
>>> +
>>> +      caller_frame = get_prev_frame_always (caller_frame);
>>> +    }
>>>     /* Keep within the current frame, or in frames called by the current
>>>        one.  */
>>> @@ -10514,14 +10525,15 @@ enum async_reply_reason
>>>     gdb::optional<delete_longjmp_breakpoint_cleanup> lj_deleter;
>>> -  if (frame_id_p (caller_frame_id))
>>> +  if (caller_frame)
>>>       {
>>>         struct symtab_and_line sal2;
>>>         struct gdbarch *caller_gdbarch;
>>> -      sal2 = find_pc_line (frame_unwind_caller_pc (frame), 0);
>>> -      sal2.pc = frame_unwind_caller_pc (frame);
>>> -      caller_gdbarch = frame_unwind_caller_arch (frame);
>>> +      sal2 = find_pc_line (get_frame_pc (caller_frame), 0);
>>> +      sal2.pc = get_frame_pc (caller_frame);
>>> +      caller_gdbarch = get_frame_arch (caller_frame);
>>> +      caller_frame_id = get_frame_id (caller_frame);
>>>         breakpoint_up caller_breakpoint
>>>       = set_momentary_breakpoint (caller_gdbarch, sal2,
>>
> Would be nice to have some feedback on the generic parts of the above 
> patch. It seems like a useful fix to have.
  
Simon Marchi Jan. 10, 2023, 3:31 p.m. UTC | #9
On 10/21/22 07:58, Tomas Vanek wrote:
> This patch partially depends on
> gdb/arm: Terminate frame unwinding in M-profile lockup state
> (without it lockup state is unwound as if it were a normal
> stack frame).
> 
> The commands 'advance' and 'until' try to set a breakpoint
> on the bogus return address derived from Arm M-profile magic
> address (actually EXC_RETURN or a PC value indicating lockup).
> 
> The offending breakpoint should be set at the return address in
> the caller. The magic value 0xffffffff in LR indicates
> there is no caller (return to this address would lock up the CPU).
> 
> Similar behaviour of 'advance' and 'until' is observed in
> an exception handler routine. In this case LR contains e.g.
> 0xfffffff1 (EXC_RETURN) and GDB tries to set a breakpoint at
> 0xfffffff0. It should use a return value stacked by the exception
> instead.
> 
> Testbench setup:
> STM32G474, a Cortex-M4 device. Any Cortex-M device can be used.
> A test application (an ordinary blink) with a standard startup
> is loaded to the device flash.
> 
> Steps to reproduce the problem:
> 
> start GDB server
>  $ openocd -f interface/cmsis-dap.cfg -f target/stm32g4x.cfg
> 
> start GDB in second terminal
>  $ arm-none-eabi-gdb blink.elf
> 
>  (gdb) target extended-remote localhost:3333
> 
> Reset the device and halt it:
>  (gdb) monitor reset halt
>  target halted due to debug-request, current mode: Thread
>  xPSR: 0x01000000 pc: 0x08000e14 msp: 0x20020000
> 
> Step by one instruction to re-read GDB register cache:
>  (gdb) stepi
> 
> Check registers, LR should be 0xffffffff after reset:
>  (gdb) info registers
>  ...
>  sp             0x20020000          0x20020000
>  lr             0xffffffff          -1
>  pc             0x8000e16           0x8000e16
>  xPSR           0x1000000           16777216
>  ...
> 
>  (gdb) set debug remote
> 
> Issue 'advance' command:
>  (gdb) advance main
>  [remote] Sending packet: $mfffffffe,2#fa
>  [remote] Packet received: 0000
>  [remote] Sending packet: $mfffffffe,2#fa
>  [remote] Packet received: 0000
>  [remote] Sending packet: $m8000526,2#30
>  [remote] Packet received: 2046
>  [remote] Sending packet: $Z1,8000526,2#7a
>  [remote] Packet received: OK
>  [remote] packet_ok: Packet Z1 (hardware-breakpoint) is supported
>  [remote] Sending packet: $Z0,fffffffe,2#43
>  [remote] Packet received: E0E
>  [remote] packet_ok: Packet Z0 (software-breakpoint) is supported
>  Warning:
>  Cannot insert breakpoint 0.
>  Cannot access memory at address 0xfffffffe
> 
>  Command aborted.
>  (gdb)
> 
> Relevant messages from OpenOCD:
>  Error: Failed to read memory at 0xfffff000
>  Error: can't add breakpoint: unknown reason
> 
> This patch adds skipping over frames that are not suitable for
> guarding with a breakpoint inspired by 'finish' command.
> If no suitable frame is found, a momentary breakpoint is not set.
> 
> v2: Comment fixes, bug reference.
> 
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28683
> Signed-off-by: Tomas Vanek <vanekt@fbl.cz>

Hi Tomas,

In order to better understand if this is the right fix, can you describe
what the different frames are in this situation, starting with the
current frame?  I'd like to see what the frame ids and frame types are.

Is there any chance this could be reproduced using the GDB simulator (or maybe
qemu), so I could tinker with it?

Simon
  
Tomas Vanek Jan. 10, 2023, 4:33 p.m. UTC | #10
On 10/01/2023 16:31, Simon Marchi wrote:
>
> On 10/21/22 07:58, Tomas Vanek wrote:
>> This patch partially depends on
>> gdb/arm: Terminate frame unwinding in M-profile lockup state
>> (without it lockup state is unwound as if it were a normal
>> stack frame).
>>
>> The commands 'advance' and 'until' try to set a breakpoint
>> on the bogus return address derived from Arm M-profile magic
>> address (actually EXC_RETURN or a PC value indicating lockup).
>>
>> The offending breakpoint should be set at the return address in
>> the caller. The magic value 0xffffffff in LR indicates
>> there is no caller (return to this address would lock up the CPU).
>>
>> Similar behaviour of 'advance' and 'until' is observed in
>> an exception handler routine. In this case LR contains e.g.
>> 0xfffffff1 (EXC_RETURN) and GDB tries to set a breakpoint at
>> 0xfffffff0. It should use a return value stacked by the exception
>> instead.
>>
>> Testbench setup:
>> STM32G474, a Cortex-M4 device. Any Cortex-M device can be used.
>> A test application (an ordinary blink) with a standard startup
>> is loaded to the device flash.
>>
>> Steps to reproduce the problem:
>>
>> start GDB server
>>   $ openocd -f interface/cmsis-dap.cfg -f target/stm32g4x.cfg
>>
>> start GDB in second terminal
>>   $ arm-none-eabi-gdb blink.elf
>>
>>   (gdb) target extended-remote localhost:3333
>>
>> Reset the device and halt it:
>>   (gdb) monitor reset halt
>>   target halted due to debug-request, current mode: Thread
>>   xPSR: 0x01000000 pc: 0x08000e14 msp: 0x20020000
>>
>> Step by one instruction to re-read GDB register cache:
>>   (gdb) stepi
>>
>> Check registers, LR should be 0xffffffff after reset:
>>   (gdb) info registers
>>   ...
>>   sp             0x20020000          0x20020000
>>   lr             0xffffffff          -1
>>   pc             0x8000e16           0x8000e16
>>   xPSR           0x1000000           16777216
>>   ...
>>
>>   (gdb) set debug remote
>>
>> Issue 'advance' command:
>>   (gdb) advance main
>>   [remote] Sending packet: $mfffffffe,2#fa
>>   [remote] Packet received: 0000
>>   [remote] Sending packet: $mfffffffe,2#fa
>>   [remote] Packet received: 0000
>>   [remote] Sending packet: $m8000526,2#30
>>   [remote] Packet received: 2046
>>   [remote] Sending packet: $Z1,8000526,2#7a
>>   [remote] Packet received: OK
>>   [remote] packet_ok: Packet Z1 (hardware-breakpoint) is supported
>>   [remote] Sending packet: $Z0,fffffffe,2#43
>>   [remote] Packet received: E0E
>>   [remote] packet_ok: Packet Z0 (software-breakpoint) is supported
>>   Warning:
>>   Cannot insert breakpoint 0.
>>   Cannot access memory at address 0xfffffffe
>>
>>   Command aborted.
>>   (gdb)
>>
>> Relevant messages from OpenOCD:
>>   Error: Failed to read memory at 0xfffff000
>>   Error: can't add breakpoint: unknown reason
>>
>> This patch adds skipping over frames that are not suitable for
>> guarding with a breakpoint inspired by 'finish' command.
>> If no suitable frame is found, a momentary breakpoint is not set.
>>
>> v2: Comment fixes, bug reference.
>>
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28683
>> Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
> Hi Tomas,
>
> In order to better understand if this is the right fix, can you describe
> what the different frames are in this situation, starting with the
> current frame?  I'd like to see what the frame ids and frame types are.
>
> Is there any chance this could be reproduced using the GDB simulator (or maybe
> qemu), so I could tinker with it?
>
> Simon

Hi Simon,

I'm not familiar with GDB sim nor qemu and have no idea if their 
Cortex-M profile
implementation is precise enough to get the same behaviour. I'll give it 
a try...

Please see gdb debug output, I marked compute_frame_id results. This is 
a real Cortex-M device (this time STM32H7A3) right after reset,
OpenOCD used as gdb server:

(gdb) maintenance flush register-cache
Register cache flushed.
(gdb) set debug frame 1
(gdb) bt -past-entry
[frame] get_prev_frame_always_1: enter
   [frame] get_prev_frame_always_1: this_frame=-1
   [frame] get_prev_frame_always_1:   -> 
{level=0,type=<unknown>,unwinder=<unknown>,pc=0x800609c,id=<not 
computed>,func=<unknown>} // cached
[frame] get_prev_frame_always_1: exit
[frame] get_prev_frame_always_1: enter
   [frame] get_prev_frame_always_1: this_frame=-1
   [frame] get_prev_frame_always_1:   -> 
{level=0,type=<unknown>,unwinder=<unknown>,pc=0x800609c,id=<not 
computed>,func=<unknown>} // cached
[frame] get_prev_frame_always_1: exit
[frame] compute_frame_id: enter
   [frame] compute_frame_id: fi=0
   [frame] frame_unwind_find_by_frame: enter
     [frame] frame_unwind_find_by_frame: this_frame=0
     [frame] frame_unwind_try_unwinder: trying unwinder "dummy"
     [frame] frame_unwind_try_unwinder: no
     [frame] frame_unwind_try_unwinder: trying unwinder "dwarf2 tailcall"
     [frame] frame_unwind_try_unwinder: no
     [frame] frame_unwind_try_unwinder: trying unwinder "inline"
     [frame] frame_unwind_try_unwinder: no
     [frame] frame_unwind_try_unwinder: trying unwinder "jit"
     [frame] frame_unwind_try_unwinder: no
     [frame] frame_unwind_try_unwinder: trying unwinder "arm m exception 
lockup sec_fnc"
     [frame] frame_unwind_try_unwinder: no
     [frame] frame_unwind_try_unwinder: trying unwinder "arm stub"
     [frame] frame_unwind_try_unwinder: no
     [frame] frame_unwind_try_unwinder: trying unwinder "dwarf2"
     [frame] frame_unwind_try_unwinder: no
     [frame] frame_unwind_try_unwinder: trying unwinder "dwarf2 signal"
     [frame] frame_unwind_try_unwinder: no
     [frame] frame_unwind_try_unwinder: trying unwinder "arm exidx"
     [frame] frame_unwind_try_unwinder: no
     [frame] frame_unwind_try_unwinder: trying unwinder "arm epilogue"
     [frame] frame_unwind_register_value: enter
       [frame] frame_unwind_register_value: frame=-1, regnum=25(xPSR)
       [frame] frame_unwind_register_value:   -> register=25 
bytes=[00000001]
     [frame] frame_unwind_register_value: exit
     [frame] frame_unwind_try_unwinder: no
     [frame] frame_unwind_try_unwinder: trying unwinder "arm prologue"
     [frame] frame_unwind_try_unwinder: yes
   [frame] frame_unwind_find_by_frame: exit
   [frame] frame_unwind_register_value: enter
     [frame] frame_unwind_register_value: frame=-1, regnum=13(sp)
     [frame] frame_unwind_register_value:   -> register=13 bytes=[00000220]
   [frame] frame_unwind_register_value: exit
   [frame] frame_unwind_register_value: enter
     [frame] frame_unwind_register_value: frame=-1, regnum=91(msp)
     [frame] frame_unwind_register_value:   -> register=91 bytes=[00000220]
   [frame] frame_unwind_register_value: exit
   [frame] frame_unwind_register_value: enter
     [frame] frame_unwind_register_value: frame=-1, regnum=92(psp)
     [frame] frame_unwind_register_value:   -> register=92 bytes=[00000000]
   [frame] frame_unwind_register_value: exit
   [frame] frame_unwind_register_value: enter
     [frame] frame_unwind_register_value: frame=-1, regnum=25(xPSR)
     [frame] frame_unwind_register_value:   -> register=25 bytes=[00000001]
   [frame] frame_unwind_register_value: exit
   [frame] frame_unwind_register_value: enter
     [frame] frame_unwind_register_value: frame=-1, regnum=13(sp)
     [frame] frame_unwind_register_value:   -> register=13 bytes=[00000220]
   [frame] frame_unwind_register_value: exit
   [frame] get_frame_func_if_available: this_frame=0 -> 0x800609c
   [frame] frame_id_p: 
l={stack=0x20020000,code=0x000000000800609c,!special} -> 1
   [frame] compute_frame_id:   -> 
{stack=0x20020000,code=0x000000000800609c,!special} <------- *current frame*
[frame] compute_frame_id: exit
#0  Reset_Handler ([frame] frame_id_p: l={!stack,!code,!special} -> 0
) at startup_stm32h7a3xx.s:62
[frame] operator==: 
l={stack=0x20020000,code=0x000000000800609c,!special}, 
r={!stack,!code,!special} -> 0
[frame] get_prev_frame: enter
   [frame] get_prev_frame_always_1: enter
     [frame] get_prev_frame_always_1: this_frame=0
     [frame] get_prev_frame_raw:   -> 
{level=1,type=<unknown>,unwinder=<unknown>,pc=<unknown>,id=<not 
computed>,func=<unknown>}
     [frame] compute_frame_id: enter
       [frame] compute_frame_id: fi=1
       [frame] frame_unwind_find_by_frame: enter
         [frame] frame_unwind_find_by_frame: this_frame=1
         [frame] frame_unwind_arch: next_frame=0 -> armv7e-m
         [frame] frame_unwind_try_unwinder: trying unwinder "dummy"
         [frame] frame_unwind_try_unwinder: no
         [frame] frame_unwind_try_unwinder: trying unwinder "dwarf2 
tailcall"
         [frame] frame_unwind_try_unwinder: no
         [frame] frame_unwind_try_unwinder: trying unwinder "inline"
         [frame] frame_unwind_register_value: enter
           [frame] frame_unwind_register_value: frame=0, regnum=15(pc)
           [frame] frame_unwind_register_value: enter
             [frame] frame_unwind_register_value: frame=0, regnum=14(lr)
             [frame] frame_id_p: 
l={stack=<sentinel>,!code,special=0x0000000000000000} -> 1
             [frame] frame_id_p: 
l={stack=<sentinel>,!code,special=0x0000000000000000} -> 1
             [frame] operator==: 
l={stack=<sentinel>,!code,special=0x0000000000000000}, 
r={stack=<sentinel>,!code,special=0x0000000000000000} -> 1
             [frame] frame_unwind_register_value: enter
               [frame] frame_unwind_register_value: frame=-1, regnum=14(lr)
               [frame] frame_unwind_register_value:   -> register=14 
bytes=[ffffffff]
             [frame] frame_unwind_register_value: exit
             [frame] frame_id_p: 
l={stack=<sentinel>,!code,special=0x0000000000000000} -> 1
             [frame] operator==: 
l={stack=<sentinel>,!code,special=0x0000000000000000}, 
r={stack=<sentinel>,!code,special=0x0000000000000000} -> 1
             [frame] get_prev_frame_always_1: enter
               [frame] get_prev_frame_always_1: this_frame=-1
               [frame] get_prev_frame_always_1:   -> 
{level=0,type=NORMAL_FRAME,unwinder="arm 
prologue",pc=0x800609c,id={stack=0x20020000,code=0x000000000800609c,!special},func=0x800609c} 
//
  cached
             [frame] get_prev_frame_always_1: exit
             [frame] value_fetch_lazy_register: (frame=0, regnum=14(lr), 
...) -> register=14 bytes=[ffffffff]
             [frame] frame_unwind_register_value:   -> register=14 
bytes=[ffffffff]
           [frame] frame_unwind_register_value: exit
           [frame] frame_unwind_register_value:   -> computed 
bytes=[ffffffff]
         [frame] frame_unwind_register_value: exit
         [frame] frame_unwind_pc: this_frame=0 -> 0xffffffff
         [frame] frame_unwind_try_unwinder: no
         [frame] frame_unwind_try_unwinder: trying unwinder "jit"
         [frame] frame_unwind_try_unwinder: no
         [frame] frame_unwind_try_unwinder: trying unwinder "arm m 
exception lockup sec_fnc"
         [frame] frame_unwind_try_unwinder: yes
       [frame] frame_unwind_find_by_frame: exit
       [frame] frame_unwind_register_value: enter
         [frame] frame_unwind_register_value: frame=0, regnum=13(sp)
         [frame] frame_unwind_register_value:   -> computed bytes=[00000220]
       [frame] frame_unwind_register_value: exit
       [frame] frame_unwind_register_value: enter
         [frame] frame_unwind_register_value: frame=0, regnum=91(msp)
         [frame] frame_unwind_register_value:   -> computed bytes=[00000220]
       [frame] frame_unwind_register_value: exit
       [frame] frame_unwind_register_value: enter
         [frame] frame_unwind_register_value: frame=0, regnum=92(psp)
         [frame] frame_unwind_register_value:   -> computed bytes=[00000000]
       [frame] frame_unwind_register_value: exit
       [frame] frame_id_p: 
l={stack=0x0,code=0x00000000ffffffff,!special} -> 1
       [frame] compute_frame_id:   -> 
{stack=0x0,code=0x00000000ffffffff,!special}        <----- *magic value 
0xffffffff in LR indicates there is no caller*
     [frame] compute_frame_id: exit
   [frame] get_prev_frame_always_1: exit
[frame] get_prev_frame: exit
#1  <signal handler called>
[frame] operator==: l={stack=0x0,code=0x00000000ffffffff,!special}, 
r={!stack,!code,!special} -> 0
[frame] get_prev_frame: enter
   [frame] get_prev_frame_always_1: enter
     [frame] get_prev_frame_always_1: this_frame=1
     [frame] get_prev_frame_always_1:   -> nullptr // UNWIND_OUTERMOST
   [frame] get_prev_frame_always_1: exit
[frame] get_prev_frame: exit
[frame] get_prev_frame_always_1: enter
   [frame] get_prev_frame_always_1: this_frame=1
   [frame] get_prev_frame_always_1:   -> nullptr // UNWIND_OUTERMOST // 
cached
[frame] get_prev_frame_always_1: exit
[frame] get_prev_frame_always_1: enter
   [frame] get_prev_frame_always_1: this_frame=-1
   [frame] get_prev_frame_always_1:   -> 
{level=0,type=NORMAL_FRAME,unwinder="arm 
prologue",pc=0x800609c,id={stack=0x20020000,code=0x000000000800609c,!special},func=0x800609c} 
// cached
[frame] get_prev_frame_always_1: exit


The situation when stopped in an ISR is similar, just the magic value 
differs e.g. 0xfffffff1
and other frames follows.
  
Simon Marchi Jan. 10, 2023, 5:48 p.m. UTC | #11
On 10/21/22 07:58, Tomas Vanek wrote:
> This patch partially depends on
> gdb/arm: Terminate frame unwinding in M-profile lockup state
> (without it lockup state is unwound as if it were a normal
> stack frame).
> 
> The commands 'advance' and 'until' try to set a breakpoint
> on the bogus return address derived from Arm M-profile magic
> address (actually EXC_RETURN or a PC value indicating lockup).
> 
> The offending breakpoint should be set at the return address in
> the caller. The magic value 0xffffffff in LR indicates
> there is no caller (return to this address would lock up the CPU).
> 
> Similar behaviour of 'advance' and 'until' is observed in
> an exception handler routine. In this case LR contains e.g.
> 0xfffffff1 (EXC_RETURN) and GDB tries to set a breakpoint at
> 0xfffffff0. It should use a return value stacked by the exception
> instead.
> 
> Testbench setup:
> STM32G474, a Cortex-M4 device. Any Cortex-M device can be used.
> A test application (an ordinary blink) with a standard startup
> is loaded to the device flash.
> 
> Steps to reproduce the problem:
> 
> start GDB server
>  $ openocd -f interface/cmsis-dap.cfg -f target/stm32g4x.cfg
> 
> start GDB in second terminal
>  $ arm-none-eabi-gdb blink.elf
> 
>  (gdb) target extended-remote localhost:3333
> 
> Reset the device and halt it:
>  (gdb) monitor reset halt
>  target halted due to debug-request, current mode: Thread
>  xPSR: 0x01000000 pc: 0x08000e14 msp: 0x20020000
> 
> Step by one instruction to re-read GDB register cache:
>  (gdb) stepi
> 
> Check registers, LR should be 0xffffffff after reset:
>  (gdb) info registers
>  ...
>  sp             0x20020000          0x20020000
>  lr             0xffffffff          -1
>  pc             0x8000e16           0x8000e16
>  xPSR           0x1000000           16777216
>  ...
> 
>  (gdb) set debug remote
> 
> Issue 'advance' command:
>  (gdb) advance main
>  [remote] Sending packet: $mfffffffe,2#fa
>  [remote] Packet received: 0000
>  [remote] Sending packet: $mfffffffe,2#fa
>  [remote] Packet received: 0000
>  [remote] Sending packet: $m8000526,2#30
>  [remote] Packet received: 2046
>  [remote] Sending packet: $Z1,8000526,2#7a
>  [remote] Packet received: OK
>  [remote] packet_ok: Packet Z1 (hardware-breakpoint) is supported
>  [remote] Sending packet: $Z0,fffffffe,2#43
>  [remote] Packet received: E0E
>  [remote] packet_ok: Packet Z0 (software-breakpoint) is supported
>  Warning:
>  Cannot insert breakpoint 0.
>  Cannot access memory at address 0xfffffffe
> 
>  Command aborted.
>  (gdb)
> 
> Relevant messages from OpenOCD:
>  Error: Failed to read memory at 0xfffff000
>  Error: can't add breakpoint: unknown reason
> 
> This patch adds skipping over frames that are not suitable for
> guarding with a breakpoint inspired by 'finish' command.
> If no suitable frame is found, a momentary breakpoint is not set.
> 
> v2: Comment fixes, bug reference.
> 
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28683
> Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
> ---
>  gdb/breakpoint.c | 22 +++++++++++++++++-----
>  1 file changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
> index f6591d4..bb85342 100644
> --- a/gdb/breakpoint.c
> +++ b/gdb/breakpoint.c
> @@ -10467,6 +10467,7 @@ enum async_reply_reason
>  until_break_command (const char *arg, int from_tty, int anywhere)
>  {
>    frame_info_ptr frame;
> +  frame_info_ptr caller_frame;
>    struct gdbarch *frame_gdbarch;
>    struct frame_id stack_frame_id;
>    struct frame_id caller_frame_id;
> @@ -10505,7 +10506,17 @@ enum async_reply_reason
>    frame = get_selected_frame (NULL);
>    frame_gdbarch = get_frame_arch (frame);
>    stack_frame_id = get_stack_frame_id (frame);
> -  caller_frame_id = frame_unwind_caller_id (frame);
> +
> +  caller_frame = get_prev_frame_always (frame);
> +
> +  while (caller_frame)
> +    {
> +      if (get_frame_type (caller_frame) != TAILCALL_FRAME
> +	  && gdbarch_code_of_frame_writable (get_frame_arch (caller_frame), caller_frame))
> +	break;
> +
> +      caller_frame = get_prev_frame_always (caller_frame);
> +    }

frame_unwind_caller_id does skip inline frames (through
skip_artificial_frames), whereas your version does not, is that correct?

I'm wondering if we should make frame_unwind_caller_id /
skip_artificial_frames handle it instead.  In other words, would other
callers of frame_unwind_caller_id / skip_artificial_frames need to skip
such frames, and benefit from it.

If we do that, frame_unwind_caller_id would return null_frame_id in your
case, and I think that until_break_command would not need to be
modified.

Looking at watch_command_1, for instance, there seems to be the same
situation, where we create a temporary breakpoint to detect when we get
out of scope:

  https://gitlab.com/gnutools/binutils-gdb/-/blob/8ec0b0b5df0ebe28c32900afc7ae8ff22b21f381/gdb/breakpoint.c#L10215

I guess if you try to create a watchpoint in that special frame, you'll
hit the same problem?

From what I understand, in the ARM case, the sigtramp frame does not
represent any code, and has no real PC.  When the normal frame #0
returns, control goes back directly to the frame above the sigtramp
frame.  It does not go execute instructions in the sigtramp frame.  Is
that right?

With Linux on amd64, after returning from the normal frame, control goes
to some instructions that will call the sigreturn syscall, so the
sigtramp frame does have a real pc.

If so, I think that's the distinction we are dealing with here.  We want
to know what's the first instruction that is going to be executed after
leaving the current frame.  And it may sometimes be the sigtramp frame,
or it may sometimes be the frame above that.

Simon
  
Tomas Vanek Jan. 10, 2023, 11:22 p.m. UTC | #12
On 10/01/2023 18:48, Simon Marchi wrote:
>
> On 10/21/22 07:58, Tomas Vanek wrote:
>> This patch partially depends on
>> gdb/arm: Terminate frame unwinding in M-profile lockup state
>> (without it lockup state is unwound as if it were a normal
>> stack frame).
>>
>> The commands 'advance' and 'until' try to set a breakpoint
>> on the bogus return address derived from Arm M-profile magic
>> address (actually EXC_RETURN or a PC value indicating lockup).
>>
>> The offending breakpoint should be set at the return address in
>> the caller. The magic value 0xffffffff in LR indicates
>> there is no caller (return to this address would lock up the CPU).
>>
>> Similar behaviour of 'advance' and 'until' is observed in
>> an exception handler routine. In this case LR contains e.g.
>> 0xfffffff1 (EXC_RETURN) and GDB tries to set a breakpoint at
>> 0xfffffff0. It should use a return value stacked by the exception
>> instead.
>>
>> Testbench setup:
>> STM32G474, a Cortex-M4 device. Any Cortex-M device can be used.
>> A test application (an ordinary blink) with a standard startup
>> is loaded to the device flash.
>>
>> Steps to reproduce the problem:
>>
>> start GDB server
>>   $ openocd -f interface/cmsis-dap.cfg -f target/stm32g4x.cfg
>>
>> start GDB in second terminal
>>   $ arm-none-eabi-gdb blink.elf
>>
>>   (gdb) target extended-remote localhost:3333
>>
>> Reset the device and halt it:
>>   (gdb) monitor reset halt
>>   target halted due to debug-request, current mode: Thread
>>   xPSR: 0x01000000 pc: 0x08000e14 msp: 0x20020000
>>
>> Step by one instruction to re-read GDB register cache:
>>   (gdb) stepi
>>
>> Check registers, LR should be 0xffffffff after reset:
>>   (gdb) info registers
>>   ...
>>   sp             0x20020000          0x20020000
>>   lr             0xffffffff          -1
>>   pc             0x8000e16           0x8000e16
>>   xPSR           0x1000000           16777216
>>   ...
>>
>>   (gdb) set debug remote
>>
>> Issue 'advance' command:
>>   (gdb) advance main
>>   [remote] Sending packet: $mfffffffe,2#fa
>>   [remote] Packet received: 0000
>>   [remote] Sending packet: $mfffffffe,2#fa
>>   [remote] Packet received: 0000
>>   [remote] Sending packet: $m8000526,2#30
>>   [remote] Packet received: 2046
>>   [remote] Sending packet: $Z1,8000526,2#7a
>>   [remote] Packet received: OK
>>   [remote] packet_ok: Packet Z1 (hardware-breakpoint) is supported
>>   [remote] Sending packet: $Z0,fffffffe,2#43
>>   [remote] Packet received: E0E
>>   [remote] packet_ok: Packet Z0 (software-breakpoint) is supported
>>   Warning:
>>   Cannot insert breakpoint 0.
>>   Cannot access memory at address 0xfffffffe
>>
>>   Command aborted.
>>   (gdb)
>>
>> Relevant messages from OpenOCD:
>>   Error: Failed to read memory at 0xfffff000
>>   Error: can't add breakpoint: unknown reason
>>
>> This patch adds skipping over frames that are not suitable for
>> guarding with a breakpoint inspired by 'finish' command.
>> If no suitable frame is found, a momentary breakpoint is not set.
>>
>> v2: Comment fixes, bug reference.
>>
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28683
>> Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
>> ---
>>   gdb/breakpoint.c | 22 +++++++++++++++++-----
>>   1 file changed, 17 insertions(+), 5 deletions(-)
>>
>> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
>> index f6591d4..bb85342 100644
>> --- a/gdb/breakpoint.c
>> +++ b/gdb/breakpoint.c
>> @@ -10467,6 +10467,7 @@ enum async_reply_reason
>>   until_break_command (const char *arg, int from_tty, int anywhere)
>>   {
>>     frame_info_ptr frame;
>> +  frame_info_ptr caller_frame;
>>     struct gdbarch *frame_gdbarch;
>>     struct frame_id stack_frame_id;
>>     struct frame_id caller_frame_id;
>> @@ -10505,7 +10506,17 @@ enum async_reply_reason
>>     frame = get_selected_frame (NULL);
>>     frame_gdbarch = get_frame_arch (frame);
>>     stack_frame_id = get_stack_frame_id (frame);
>> -  caller_frame_id = frame_unwind_caller_id (frame);
>> +
>> +  caller_frame = get_prev_frame_always (frame);
>> +
>> +  while (caller_frame)
>> +    {
>> +      if (get_frame_type (caller_frame) != TAILCALL_FRAME
>> +	  && gdbarch_code_of_frame_writable (get_frame_arch (caller_frame), caller_frame))
>> +	break;
>> +
>> +      caller_frame = get_prev_frame_always (caller_frame);
>> +    }
> frame_unwind_caller_id does skip inline frames (through
> skip_artificial_frames), whereas your version does not, is that correct?
TBH not sure.
I just reused the code in 'finish' command
https://gitlab.com/gnutools/binutils-gdb/-/blob/8ec0b0b5df0ebe28c32900afc7ae8ff22b21f381/gdb/infcmd.c#L1790-1808
because it does not suffer from breakpoints at magic addresses and 
changed it to prevent stopping at main()
as explained in
https://sourceware.org/pipermail/gdb-patches/2022-October/193223.html

>
> I'm wondering if we should make frame_unwind_caller_id /
> skip_artificial_frames handle it instead.  In other words, would other
> callers of frame_unwind_caller_id / skip_artificial_frames need to skip
> such frames, and benefit from it.
>
> If we do that, frame_unwind_caller_id would return null_frame_id in your
> case, and I think that until_break_command would not need to be
> modified.
>
> Looking at watch_command_1, for instance, there seems to be the same
> situation, where we create a temporary breakpoint to detect when we get
> out of scope:
>
>    https://gitlab.com/gnutools/binutils-gdb/-/blob/8ec0b0b5df0ebe28c32900afc7ae8ff22b21f381/gdb/breakpoint.c#L10215
>
> I guess if you try to create a watchpoint in that special frame, you'll
> hit the same problem?
Indeed.
I uploaded a slightly modified example
https://github.com/libopencm3/libopencm3-examples/blob/master/examples/stm32/f4/stm32f4-discovery/tick_blink/tick_blink.c
to the original bug ticket 
https://sourceware.org/bugzilla/show_bug.cgi?id=28683
It runs in QEMU and has a local variable in the ISR routine.
Setting 'watch loc'  while stopped at tick_blink.c:39 and 'continue' 
results in
  [remote] Sending packet: $Z0,fffffff9,2#17
as you expected.
>
>  From what I understand, in the ARM case, the sigtramp frame does not
> represent any code, and has no real PC.  When the normal frame #0
> returns, control goes back directly to the frame above the sigtramp
> frame.  It does not go execute instructions in the sigtramp frame.  Is
> that right?
Yes, exactly.
>
> With Linux on amd64, after returning from the normal frame, control goes
> to some instructions that will call the sigreturn syscall, so the
> sigtramp frame does have a real pc.
Didn't know that.
At least the current 'finish' command implementation skips that part.
Isn't it intentional?
>
> If so, I think that's the distinction we are dealing with here.  We want
> to know what's the first instruction that is going to be executed after
> leaving the current frame.  And it may sometimes be the sigtramp frame,
> or it may sometimes be the frame above that.
No idea how to handle that.

Thanks for your comments.

Tomas
  
Simon Marchi Jan. 11, 2023, 1:38 a.m. UTC | #13
On 1/10/23 18:22, Tomas Vanek via Gdb-patches wrote:
> On 10/01/2023 18:48, Simon Marchi wrote:
>>
>> On 10/21/22 07:58, Tomas Vanek wrote:
>>> This patch partially depends on
>>> gdb/arm: Terminate frame unwinding in M-profile lockup state
>>> (without it lockup state is unwound as if it were a normal
>>> stack frame).
>>>
>>> The commands 'advance' and 'until' try to set a breakpoint
>>> on the bogus return address derived from Arm M-profile magic
>>> address (actually EXC_RETURN or a PC value indicating lockup).
>>>
>>> The offending breakpoint should be set at the return address in
>>> the caller. The magic value 0xffffffff in LR indicates
>>> there is no caller (return to this address would lock up the CPU).
>>>
>>> Similar behaviour of 'advance' and 'until' is observed in
>>> an exception handler routine. In this case LR contains e.g.
>>> 0xfffffff1 (EXC_RETURN) and GDB tries to set a breakpoint at
>>> 0xfffffff0. It should use a return value stacked by the exception
>>> instead.
>>>
>>> Testbench setup:
>>> STM32G474, a Cortex-M4 device. Any Cortex-M device can be used.
>>> A test application (an ordinary blink) with a standard startup
>>> is loaded to the device flash.
>>>
>>> Steps to reproduce the problem:
>>>
>>> start GDB server
>>>   $ openocd -f interface/cmsis-dap.cfg -f target/stm32g4x.cfg
>>>
>>> start GDB in second terminal
>>>   $ arm-none-eabi-gdb blink.elf
>>>
>>>   (gdb) target extended-remote localhost:3333
>>>
>>> Reset the device and halt it:
>>>   (gdb) monitor reset halt
>>>   target halted due to debug-request, current mode: Thread
>>>   xPSR: 0x01000000 pc: 0x08000e14 msp: 0x20020000
>>>
>>> Step by one instruction to re-read GDB register cache:
>>>   (gdb) stepi
>>>
>>> Check registers, LR should be 0xffffffff after reset:
>>>   (gdb) info registers
>>>   ...
>>>   sp             0x20020000          0x20020000
>>>   lr             0xffffffff          -1
>>>   pc             0x8000e16           0x8000e16
>>>   xPSR           0x1000000           16777216
>>>   ...
>>>
>>>   (gdb) set debug remote
>>>
>>> Issue 'advance' command:
>>>   (gdb) advance main
>>>   [remote] Sending packet: $mfffffffe,2#fa
>>>   [remote] Packet received: 0000
>>>   [remote] Sending packet: $mfffffffe,2#fa
>>>   [remote] Packet received: 0000
>>>   [remote] Sending packet: $m8000526,2#30
>>>   [remote] Packet received: 2046
>>>   [remote] Sending packet: $Z1,8000526,2#7a
>>>   [remote] Packet received: OK
>>>   [remote] packet_ok: Packet Z1 (hardware-breakpoint) is supported
>>>   [remote] Sending packet: $Z0,fffffffe,2#43
>>>   [remote] Packet received: E0E
>>>   [remote] packet_ok: Packet Z0 (software-breakpoint) is supported
>>>   Warning:
>>>   Cannot insert breakpoint 0.
>>>   Cannot access memory at address 0xfffffffe
>>>
>>>   Command aborted.
>>>   (gdb)
>>>
>>> Relevant messages from OpenOCD:
>>>   Error: Failed to read memory at 0xfffff000
>>>   Error: can't add breakpoint: unknown reason
>>>
>>> This patch adds skipping over frames that are not suitable for
>>> guarding with a breakpoint inspired by 'finish' command.
>>> If no suitable frame is found, a momentary breakpoint is not set.
>>>
>>> v2: Comment fixes, bug reference.
>>>
>>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28683
>>> Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
>>> ---
>>>   gdb/breakpoint.c | 22 +++++++++++++++++-----
>>>   1 file changed, 17 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
>>> index f6591d4..bb85342 100644
>>> --- a/gdb/breakpoint.c
>>> +++ b/gdb/breakpoint.c
>>> @@ -10467,6 +10467,7 @@ enum async_reply_reason
>>>   until_break_command (const char *arg, int from_tty, int anywhere)
>>>   {
>>>     frame_info_ptr frame;
>>> +  frame_info_ptr caller_frame;
>>>     struct gdbarch *frame_gdbarch;
>>>     struct frame_id stack_frame_id;
>>>     struct frame_id caller_frame_id;
>>> @@ -10505,7 +10506,17 @@ enum async_reply_reason
>>>     frame = get_selected_frame (NULL);
>>>     frame_gdbarch = get_frame_arch (frame);
>>>     stack_frame_id = get_stack_frame_id (frame);
>>> -  caller_frame_id = frame_unwind_caller_id (frame);
>>> +
>>> +  caller_frame = get_prev_frame_always (frame);
>>> +
>>> +  while (caller_frame)
>>> +    {
>>> +      if (get_frame_type (caller_frame) != TAILCALL_FRAME
>>> +      && gdbarch_code_of_frame_writable (get_frame_arch (caller_frame), caller_frame))
>>> +    break;
>>> +
>>> +      caller_frame = get_prev_frame_always (caller_frame);
>>> +    }
>> frame_unwind_caller_id does skip inline frames (through
>> skip_artificial_frames), whereas your version does not, is that correct?
> TBH not sure.
> I just reused the code in 'finish' command
> https://gitlab.com/gnutools/binutils-gdb/-/blob/8ec0b0b5df0ebe28c32900afc7ae8ff22b21f381/gdb/infcmd.c#L1790-1808
> because it does not suffer from breakpoints at magic addresses and changed it to prevent stopping at main()
> as explained in
> https://sourceware.org/pipermail/gdb-patches/2022-October/193223.htmlo

For finish, we do not want to skip inline frames, because that's the
semantic of the command.  finish for inlined frames is implemented by
single-stepping in the current function until the current frame changes.
It is taken care specially here:

  https://gitlab.com/gnutools/binutils-gdb/-/blob/8ec0b0b5df0ebe28c32900afc7ae8ff22b21f381/gdb/infcmd.c#L1851

Note that the pc for the inlined frame is the same as the pc for the
frame it is inlined in.  So if you were in an inlined frame and simply
took the next frame's pc and put a breakpoint there, you would end up
putting a breakpoint at the current pc, so I think you would stop
immediately.  Not what you want.  Hence the need to skip inline frames
for those.

If we had common code between the finish command and this other stuff, I
think it would be harmless for the finish command if the common code
skipped inline frames, since the finish command would have taken care of
the inline frame case earlier, and returned early.

>> I'm wondering if we should make frame_unwind_caller_id /
>> skip_artificial_frames handle it instead.  In other words, would other
>> callers of frame_unwind_caller_id / skip_artificial_frames need to skip
>> such frames, and benefit from it.
>>
>> If we do that, frame_unwind_caller_id would return null_frame_id in your
>> case, and I think that until_break_command would not need to be
>> modified.
>>
>> Looking at watch_command_1, for instance, there seems to be the same
>> situation, where we create a temporary breakpoint to detect when we get
>> out of scope:
>>
>>    https://gitlab.com/gnutools/binutils-gdb/-/blob/8ec0b0b5df0ebe28c32900afc7ae8ff22b21f381/gdb/breakpoint.c#L10215
>>
>> I guess if you try to create a watchpoint in that special frame, you'll
>> hit the same problem?
> Indeed.
> I uploaded a slightly modified example
> https://github.com/libopencm3/libopencm3-examples/blob/master/examples/stm32/f4/stm32f4-discovery/tick_blink/tick_blink.c
> to the original bug ticket https://sourceware.org/bugzilla/show_bug.cgi?id=28683
> It runs in QEMU and has a local variable in the ISR routine.
> Setting 'watch loc'  while stopped at tick_blink.c:39 and 'continue' results in
>  [remote] Sending packet: $Z0,fffffff9,2#17
> as you expected.

Cool, thanks.

>>  From what I understand, in the ARM case, the sigtramp frame does not
>> represent any code, and has no real PC.  When the normal frame #0
>> returns, control goes back directly to the frame above the sigtramp
>> frame.  It does not go execute instructions in the sigtramp frame.  Is
>> that right?
> Yes, exactly.
>>
>> With Linux on amd64, after returning from the normal frame, control goes
>> to some instructions that will call the sigreturn syscall, so the
>> sigtramp frame does have a real pc.
> Didn't know that.
> At least the current 'finish' command implementation skips that part.
> Isn't it intentional?

Using this program:

  #include <signal.h>

  int c;

  void handler(int a)
  {
    c++;
  }

  int main(int argc, char ** argv)
  {
    signal(SIGSEGV, handler);
    raise(SIGSEGV);
  }

GDB lets me debug the sigtramp frame just like a normal frame, including
finish:

    $ ./gdb -q -nx --data-directory=data-directory a.out -iex "set debuginfod enabled on"
    Reading symbols from a.out...
    (gdb) b handler
    Breakpoint 1 at 0x1150: file test.c, line 7.
    (gdb) run
    Starting program: /home/simark/build/binutils-gdb/gdb/a.out
    Downloading separate debug info for system-supplied DSO at 0x7ffff7fc8000
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/usr/lib/../lib/libthread_db.so.1".

    Program received signal SIGSEGV, Segmentation fault.
    __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=11, no_tid=no_tid@entry=0) at pthread_kill.c:44
    44            return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
    (gdb) c
    Continuing.

    Breakpoint 1, handler (a=11) at test.c:7
    7         c++;
    (gdb) bt
    #0  handler (a=11) at test.c:7
    #1  <signal handler called>
    #2  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=11, no_tid=no_tid@entry=0) at pthread_kill.c:44
    #3  0x00007ffff7e2c6b3 in __pthread_kill_internal (signo=11, threadid=<optimized out>) at pthread_kill.c:78
    #4  0x00007ffff7ddc958 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
    #5  0x000055555555518f in main (argc=1, argv=0x7fffffffd758) at test.c:13
    (gdb) fin
    Run till exit from #0  handler (a=11) at test.c:7
    <signal handler called>
    (gdb) disas
    Dump of assembler code for function __restore_rt:
    => 0x00007ffff7ddca00 <+0>:     mov    $0xf,%rax
       0x00007ffff7ddca07 <+7>:     syscall
       0x00007ffff7ddca09 <+9>:     nopl   0x0(%rax)
    End of assembler dump.
    (gdb) bt
    #0  <signal handler called>
    #1  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=11, no_tid=no_tid@entry=0) at pthread_kill.c:44
    #2  0x00007ffff7e2c6b3 in __pthread_kill_internal (signo=11, threadid=<optimized out>) at pthread_kill.c:78
    #3  0x00007ffff7ddc958 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
    #4  0x000055555555518f in main (argc=1, argv=0x7fffffffd758) at test.c:13
    (gdb) si
    <signal handler called>

I'm honestly surprised that GDB lets you debug the sigtramp frame.  Not
that there's anything wrong with it, but the user typically doesn't
care, unless they are debugging the delivery of signals of their OS.

>> If so, I think that's the distinction we are dealing with here.  We want
>> to know what's the first instruction that is going to be executed after
>> leaving the current frame.  And it may sometimes be the sigtramp frame,
>> or it may sometimes be the frame above that.
> No idea how to handle that.

I think you are on the right track.  We need a function that has that
semantic: what is the closest place after returning from the given
frame, where it's possible to put a breakpoint.  I think this function
could be used by the various commands that have this similar need.  The
until/advance commands, probably finish, and maybe the watch command, as
you have found earlier.

Simon
  

Patch

diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
index f6591d4..bb85342 100644
--- a/gdb/breakpoint.c
+++ b/gdb/breakpoint.c
@@ -10467,6 +10467,7 @@  enum async_reply_reason
 until_break_command (const char *arg, int from_tty, int anywhere)
 {
   frame_info_ptr frame;
+  frame_info_ptr caller_frame;
   struct gdbarch *frame_gdbarch;
   struct frame_id stack_frame_id;
   struct frame_id caller_frame_id;
@@ -10505,7 +10506,17 @@  enum async_reply_reason
   frame = get_selected_frame (NULL);
   frame_gdbarch = get_frame_arch (frame);
   stack_frame_id = get_stack_frame_id (frame);
-  caller_frame_id = frame_unwind_caller_id (frame);
+
+  caller_frame = get_prev_frame_always (frame);
+
+  while (caller_frame)
+    {
+      if (get_frame_type (caller_frame) != TAILCALL_FRAME
+	  && gdbarch_code_of_frame_writable (get_frame_arch (caller_frame), caller_frame))
+	break;
+
+      caller_frame = get_prev_frame_always (caller_frame);
+    }
 
   /* Keep within the current frame, or in frames called by the current
      one.  */
@@ -10514,14 +10525,15 @@  enum async_reply_reason
 
   gdb::optional<delete_longjmp_breakpoint_cleanup> lj_deleter;
 
-  if (frame_id_p (caller_frame_id))
+  if (caller_frame)
     {
       struct symtab_and_line sal2;
       struct gdbarch *caller_gdbarch;
 
-      sal2 = find_pc_line (frame_unwind_caller_pc (frame), 0);
-      sal2.pc = frame_unwind_caller_pc (frame);
-      caller_gdbarch = frame_unwind_caller_arch (frame);
+      sal2 = find_pc_line (get_frame_pc (caller_frame), 0);
+      sal2.pc = get_frame_pc (caller_frame);
+      caller_gdbarch = get_frame_arch (caller_frame);
+      caller_frame_id = get_frame_id (caller_frame);
 
       breakpoint_up caller_breakpoint
 	= set_momentary_breakpoint (caller_gdbarch, sal2,