[v2] gdb/arm: Terminate frame unwinding in M-profile lockup state

Message ID f561200e-a646-199d-97ad-a32552b389a4@fbl.cz
State New
Series [v2] gdb/arm: Terminate frame unwinding in M-profile lockup state

Commit Message

Tomas Vanek Oct. 16, 2022, 2:13 p.m. UTC
In the lockup state the PC value of the outer frame is irreversibly
lost. The other registers are intact, so LR likely contains the
PC of some frame next to the outer one, but we cannot analyze
the nearest outer frame without knowing its PC;
therefore we do not know the SP fixup for this frame.

The frame unwinder may go astray because of the wrong SP value.
To prevent problems, terminate unwinding if PC contains the magic
value of the lockup state.
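
The termination rule can be illustrated outside of GDB. The following is a minimal sketch, not the patch itself: `core_addr` stands in for GDB's CORE_ADDR, and `is_lockup_pc` and `frames_to_report` are hypothetical names used only here. It models a backtrace as a list of candidate PCs, innermost first, and stops at the first lockup magic value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for GDB's CORE_ADDR; an assumption made for this sketch only.
using core_addr = std::uint64_t;

// True if ADDR is one of the lockup magic PC values checked by the patch.
constexpr bool
is_lockup_pc (core_addr addr)
{
  switch (addr)
    {
    case 0xeffffffe:
    case 0xfffffffe:
    case 0xffffffff:
      return true;
    default:
      return false;
    }
}

// Hypothetical model of the unwinder: PCS holds candidate frame PCs,
// innermost first.  The frame holding a lockup PC is still reported,
// but no frame outer to it is.
std::size_t
frames_to_report (const std::vector<core_addr> &pcs)
{
  for (std::size_t i = 0; i < pcs.size (); ++i)
    if (is_lockup_pc (pcs[i]))
      return i + 1;
  return pcs.size ();
}

// Small self-check using the PCs from the example session.
inline void
self_check ()
{
  assert (frames_to_report ({0xeffffffe, 0x0c000a9c, 0x2002ffd8}) == 1);
  assert (frames_to_report ({0x0c000a9c, 0x2002ffd8}) == 2);
}
```

With the PCs from the session below, only the innermost frame would be reported, which matches the single-frame backtrace shown for the patched gdb.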

Example session without this change,
Cortex-M33 CPU in lockup, gdb 13.0.50.20221016-git:
----------------
   (gdb) c
   Continuing.

   Program received signal SIGINT, Interrupt.
   0xeffffffe in ?? ()
   (gdb) bt
   #0  0xeffffffe in ?? ()
   #1  0x0c000a9c in HardFault_Handler ()
       at C:/dvl/stm32l5trustzone/GPIO_IOToggle_TrustZone/Secure/Src/stm32l5xx_it.c:99
   #2  0x2002ffd8 in ?? ()
   Backtrace stopped: previous frame identical to this frame (corrupt stack?)
   (gdb)
----------------
Frame #1 is at the correct PC taken from LR; frame #2 is total nonsense.

With the change:
----------------
   (gdb) c
   Continuing.

   Program received signal SIGINT, Interrupt.
   warning: ARM M in lockup state, stack unwinding terminated.
   <signal handler called>
   (gdb) bt
   #0  <signal handler called>
   (gdb)
----------------

There is a visible drawback of emitting a warning in a cache building routine,
as introduced in Torbjörn SVENSSON's
[PATCH v4] gdb/arm: Stop unwinding on error, but do not assert:
the warning is printed just once and not repeated on each backtrace command.

V2 update: warning suppressed for frames other than the innermost one.

Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
---
  gdb/arm-tdep.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++++---
  1 file changed, 52 insertions(+), 3 deletions(-)

     Table B1-5 Exception return behavior
@@ -769,6 +790,9 @@ class target_arm_instruction_reader : public arm_instruction_reader
  static int
  arm_m_addr_is_magic (struct gdbarch *gdbarch, CORE_ADDR addr)
  {
+  if (arm_m_addr_is_lockup (addr))
+    return 1;
+
    arm_gdbarch_tdep *tdep = gdbarch_tdep<arm_gdbarch_tdep> (gdbarch);
    if (tdep->have_sec_ext)
      {
@@ -3355,6 +3379,30 @@ struct frame_unwind arm_stub_unwind = {
       describes which bits in LR that define which stack was used prior
       to the exception and if FPU is used (causing extended stack frame).  */

+  /* In the lockup state PC contains a lockup magic value.
+     The PC value of the next outer frame is irreversibly
+     lost. The other registers are intact so LR likely contains
+     PC of some frame next to the outer one, but we cannot analyze
+     the next outer frame without knowing its PC
+     therefore we do not know SP fixup for this frame.
+     Some heuristics to resynchronize SP might be possible (TODO?)
+     For simplicity just terminate unwinding to prevent the unwinder
+     going mad.  */
+  CORE_ADDR pc = get_frame_pc (this_frame);
+  if (arm_m_addr_is_lockup (pc))
+    {
+      /* The lockup can be real just in the innermost frame
+     as the CPU is stopped and cannot create more frames.
+     If we hit lockup magic PC in the other frame, it is
+     just a sentinel at the top of stack: do not warn then.  */
+      if (frame_relative_level (this_frame) == 0)
+    warning (_("ARM M in lockup state, stack unwinding terminated."));
+
+      /* Terminate any further stack unwinding.  */
+      arm_cache_set_active_sp_value (cache, tdep, 0);
+      return cache;
+    }
+
    CORE_ADDR lr = get_frame_register_unsigned (this_frame, ARM_LR_REGNUM);

    /* ARMv7-M Architecture Reference "A2.3.1 Arm core registers"
@@ -3824,11 +3872,12 @@ struct frame_unwind arm_stub_unwind = {
    return arm_m_addr_is_magic (gdbarch, this_pc);
  }

-/* Frame unwinder for M-profile exceptions.  */
+/* Frame unwinder for M-profile exceptions, lockup
+   and secure/nonsecure interstate fnc calls.  */

  struct frame_unwind arm_m_exception_unwind =
  {
-  "arm m exception",
+  "arm m exception lockup sec_fnc",
    SIGTRAMP_FRAME,
    arm_m_exception_frame_unwind_stop_reason,
    arm_m_exception_this_id,
  

Comments

Luis Machado Oct. 17, 2022, 11:06 a.m. UTC | #1
Hi,

Just some general comments on the formatting.

On 10/16/22 15:13, Tomas Vanek wrote:
> In the lockup state the PC value of the outer frame is irreversibly
> lost. The other registers are intact so LR likely contains
> PC of some frame next to the outer one, but we cannot analyze
> the nearest outer frame without knowing its PC
> therefore we do not know SP fixup for this frame.
> 
> The frame unwinder possibly gets mad due to the wrong SP value.
> To prevent problems terminate unwinding if PC contains the magic
> value of the lockup state.
> 
> Example session without this change,
> Cortex-M33 CPU in lockup, gdb 13.0.50.20221016-git:
> ----------------
>    (gdb) c
>    Continuing.
> 
>    Program received signal SIGINT, Interrupt.
>    0xeffffffe in ?? ()
>    (gdb) bt
>    #0  0xeffffffe in ?? ()
>    #1  0x0c000a9c in HardFault_Handler ()
>        at C:/dvl/stm32l5trustzone/GPIO_IOToggle_TrustZone/Secure/Src/stm32l5xx_it.c:99
>    #2  0x2002ffd8 in ?? ()
>    Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>    (gdb)
> ----------------
> The frame #1 is at correct PC taken from LR, #2 is a total nonsense.
> 
> With the change:
> ----------------
>    (gdb) c
>    Continuing.
> 
>    Program received signal SIGINT, Interrupt.
>    warning: ARM M in lockup state, stack unwinding terminated.
>    <signal handler called>
>    (gdb) bt
>    #0  <signal handler called>
>    (gdb)
> ----------------
> 
> There is a visible drawback of emitting a warning in a cache building routine
> as introduced in Torbjörn SVENSSON's

True. We might need to move it somewhere else. Though it is not something we will have to repeat over and over.

There is also the issue of communicating these warnings in a better way through the MI to IDEs.

> [PATCH v4] gdb/arm: Stop unwinding on error, but do not assert
> The warning is printed just once and not repeated on each backtrace command.
> 
> V2 update: warning suppressed for frames other than the innermost one.
> 
> Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
> ---
>   gdb/arm-tdep.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++++---
>   1 file changed, 52 insertions(+), 3 deletions(-)
> 
> diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
> index b5facae..3fd4f0c 100644
> --- a/gdb/arm-tdep.c
> +++ b/gdb/arm-tdep.c
> @@ -724,9 +724,30 @@ class target_arm_instruction_reader : public arm_instruction_reader
>     return 0;
>   }
> 
> +static inline bool
> +arm_m_addr_is_lockup (CORE_ADDR addr)
> +{
> +  switch (addr)
> +    {
> +      /* Values for Lockup state.
> +     For more details see "B1.5.15 Unrecoverable exception cases" in
> +     both ARMv6-M and ARMv7-M Architecture Reference Manuals, or
> +     see "B4.32 Lockup" in ARMv8-M Architecture Reference Manual. */
> +      case 0xeffffffe:
> +      case 0xfffffffe:
> +      case 0xffffffff:
> +    return 1;

1 -> true.
> +
> +      default:
> +    /* Address is not magic.  */
> +    return 0;

0 -> false.

> +    }
> +}
> +
>   /* Determine if the address specified equals any of these magic return
>      values, called EXC_RETURN, defined by the ARM v6-M, v7-M and v8-M
> -   architectures.
> +   architectures. Also include lockup magic PC value.
> +   Check also for FNC_RETURN if we have v8-M security extension.
> 
>      From ARMv6-M Reference Manual B1.5.8
>      Table B1-5 Exception return behavior
> @@ -769,6 +790,9 @@ class target_arm_instruction_reader : public arm_instruction_reader
>   static int
>   arm_m_addr_is_magic (struct gdbarch *gdbarch, CORE_ADDR addr)
>   {
> +  if (arm_m_addr_is_lockup (addr))
> +    return 1;
> +
>     arm_gdbarch_tdep *tdep = gdbarch_tdep<arm_gdbarch_tdep> (gdbarch);
>     if (tdep->have_sec_ext)
>       {
> @@ -3355,6 +3379,30 @@ struct frame_unwind arm_stub_unwind = {
>        describes which bits in LR that define which stack was used prior
>        to the exception and if FPU is used (causing extended stack frame).  */
> 
> +  /* In the lockup state PC contains a lockup magic value.
> +     The PC value of the next outer frame is irreversibly
> +     lost. The other registers are intact so LR likely contains
> +     PC of some frame next to the outer one, but we cannot analyze
> +     the next outer frame without knowing its PC
> +     therefore we do not know SP fixup for this frame.
> +     Some heuristics to resynchronize SP might be possible (TODO?)

Stating that a heuristic is possible should be enough. We avoid adding TODOs.

> +     For simplicity just terminate unwinding to prevent the unwinder
> +     going mad.  */
> +  CORE_ADDR pc = get_frame_pc (this_frame);
> +  if (arm_m_addr_is_lockup (pc))
> +    {
> +      /* The lockup can be real just in the innermost frame
> +     as the CPU is stopped and cannot create more frames.
> +     If we hit lockup magic PC in the other frame, it is
> +     just a sentinel at the top of stack: do not warn then.  */
> +      if (frame_relative_level (this_frame) == 0)
> +    warning (_("ARM M in lockup state, stack unwinding terminated."));
> +
> +      /* Terminate any further stack unwinding.  */
> +      arm_cache_set_active_sp_value (cache, tdep, 0);
> +      return cache;
> +    }
> +
>     CORE_ADDR lr = get_frame_register_unsigned (this_frame, ARM_LR_REGNUM);
> 
>     /* ARMv7-M Architecture Reference "A2.3.1 Arm core registers"
> @@ -3824,11 +3872,12 @@ struct frame_unwind arm_stub_unwind = {
>     return arm_m_addr_is_magic (gdbarch, this_pc);
>   }
> 
> -/* Frame unwinder for M-profile exceptions.  */
> +/* Frame unwinder for M-profile exceptions, lockup
> +   and secure/nonsecure interstate fnc calls.  */

fnc -> function?

> 
>   struct frame_unwind arm_m_exception_unwind =
>   {
> -  "arm m exception",
> +  "arm m exception lockup sec_fnc",
>     SIGTRAMP_FRAME,
>     arm_m_exception_frame_unwind_stop_reason,
>     arm_m_exception_this_id,
  
Tomas Vanek Oct. 17, 2022, 8:20 p.m. UTC | #2
Hi Luis,

On 17/10/2022 13:06, Luis Machado wrote:
>
>> With the change:
>> ----------------
>>    (gdb) c
>>    Continuing.
>>
>>    Program received signal SIGINT, Interrupt.
>>    warning: ARM M in lockup state, stack unwinding terminated.
>>    <signal handler called>
>>    (gdb) bt
>>    #0  <signal handler called>
>>    (gdb)
>> ----------------
>>
>> There is a visible drawback of emitting a warning in a cache building
>> routine
>> as introduced in Torbjörn SVENSSON's
>
> True. We might need to move it somewhere else. Though it is not 
> something we will have to repeat over and over.

Just an idea: what about a text message (describing something odd during
unwind) which an unwinder can associate with the frame and which gets
printed later by the 'backtrace' command?
It could also replace the not very informative <signal handler called> for
SIGTRAMP_FRAMEs.

Anyway the message about the CPU in lockup is useful even just after 
stopping the running target.
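
To make the per-frame message idea a bit more concrete, here is a minimal standalone sketch. It is not GDB code: `frame_entry`, its `note` field and `format_backtrace` are hypothetical names invented for illustration, and the output layout is only one possible choice. The unwinder would fill in the note while building its cache, and the printing code would append it each time the frame is listed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-frame record; none of these names exist in GDB.
struct frame_entry
{
  std::string description;  // e.g. "<signal handler called>"
  std::string note;         // set by the unwinder; empty means no note
};

// Format one line per frame, appending the unwinder's note if present.
std::vector<std::string>
format_backtrace (const std::vector<frame_entry> &frames)
{
  std::vector<std::string> lines;
  for (std::size_t level = 0; level < frames.size (); ++level)
    {
      std::string line
        = "#" + std::to_string (level) + "  " + frames[level].description;
      if (!frames[level].note.empty ())
        line += " (" + frames[level].note + ")";
      lines.push_back (line);
    }
  return lines;
}

// Small self-check: a frame without a note prints as before.
inline void
self_check_notes ()
{
  std::vector<frame_entry> frames = { { "main ()", "" } };
  assert (format_backtrace (frames)[0] == "#0  main ()");
}
```

Because the note lives with the frame rather than being emitted during cache construction, it would be shown on every 'backtrace', not just the first one.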

>
> There is also the issue of communicating these warnings in a better 
> way through the MI to IDE's.
>
>> [PATCH v4] gdb/arm: Stop unwinding on error, but do not assert
>> The warning is printed just once and not repeated on each backtrace 
>> command.
>>

Tomas
  

Patch

diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
index b5facae..3fd4f0c 100644
--- a/gdb/arm-tdep.c
+++ b/gdb/arm-tdep.c
@@ -724,9 +724,30 @@  class target_arm_instruction_reader : public arm_instruction_reader
    return 0;
  }

+static inline bool
+arm_m_addr_is_lockup (CORE_ADDR addr)
+{
+  switch (addr)
+    {
+      /* Values for Lockup state.
+     For more details see "B1.5.15 Unrecoverable exception cases" in
+     both ARMv6-M and ARMv7-M Architecture Reference Manuals, or
+     see "B4.32 Lockup" in ARMv8-M Architecture Reference Manual. */
+      case 0xeffffffe:
+      case 0xfffffffe:
+      case 0xffffffff:
+    return 1;
+
+      default:
+    /* Address is not magic.  */
+    return 0;
+    }
+}
+
  /* Determine if the address specified equals any of these magic return
     values, called EXC_RETURN, defined by the ARM v6-M, v7-M and v8-M
-   architectures.
+   architectures. Also include lockup magic PC value.
+   Check also for FNC_RETURN if we have v8-M security extension.

     From ARMv6-M Reference Manual B1.5.8