gdb/arm: Terminate frame unwinding in M-profile lockup state

Message ID 07d7fb29-4da7-bd22-8056-6b209c4973bf@fbl.cz
State Superseded
Headers
Series gdb/arm: Terminate frame unwinding in M-profile lockup state |

Commit Message

Tomas Vanek Oct. 16, 2022, 9:23 a.m. UTC
  In the lockup state the PC value of the the outer frame is irreversibly
lost. The other registers are intact so LR likely contains
PC of some frame next to the outer one, but we cannot analyze
the nearest outer frame without knowing its PC
therefore we do not know SP fixup for this frame.

The frame unwinder possibly gets mad due to the wrong SP value.
To prevent problems terminate unwinding if PC contains the magic
value of the lockup state.

Example session wihtout this change,
Cortex-M33 CPU in lockup, gdb 13.0.50.20221016-git:
----------------
   (gdb) c
   Continuing.

   Program received signal SIGINT, Interrupt.
   0xeffffffe in ?? ()
   (gdb) bt
   #0  0xeffffffe in ?? ()
   #1  0x0c000a9c in HardFault_Handler ()
       at 
C:/dvl/stm32l5trustzone/GPIO_IOToggle_TrustZone/Secure/Src/stm32l5xx_it.c:99
   #2  0x2002ffd8 in ?? ()
   Backtrace stopped: previous frame identical to this frame (corrupt 
stack?)
   (gdb)
----------------
The frame #1 is at correct PC taken from LR, #2 is a total nonsense.

With the change:
----------------
   (gdb) c
   Continuing.

   Program received signal SIGINT, Interrupt.
   warning: ARM M in lockup state, stack unwinding terminated.
   <signal handler called>
   (gdb) bt
   #0  <signal handler called>
   (gdb)
----------------

There is a visible drawback of emitting a warning in a cache buildnig 
routine
as introduced in Torbjörn SVENSSON's
[PATCH v4] gdb/arm: Stop unwinding on error, but do not assert
The warning is printed just once and not repeated on each backtrace command.

Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
---
  gdb/arm-tdep.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++---
  1 file changed, 46 insertions(+), 3 deletions(-)

     Table B1-5 Exception return behavior
@@ -769,6 +790,9 @@ class target_arm_instruction_reader : public 
arm_instruction_reader
  static int
  arm_m_addr_is_magic (struct gdbarch *gdbarch, CORE_ADDR addr)
  {
+  if (arm_m_addr_is_lockup (addr))
+    return 1;
+
    arm_gdbarch_tdep *tdep = gdbarch_tdep<arm_gdbarch_tdep> (gdbarch);
    if (tdep->have_sec_ext)
      {
@@ -3355,6 +3379,24 @@ struct frame_unwind arm_stub_unwind = {
       describes which bits in LR that define which stack was used prior
       to the exception and if FPU is used (causing extended stack 
frame).  */

+  /* In the lockup state PC contains a lockup magic value.
+     The PC value of the the next outer frame is irreversibly
+     lost. The other registers are intact so LR likely contains
+     PC of some frame next to the outer one, but we cannot analyze
+     the next outer frame without knowing its PC
+     therefore we do not know SP fixup for this frame.
+     Some heuristics to resynchronize SP might be possible (TODO?)
+     For simplicity just terminate unwinding to prevent the unwinder
+     going mad.  */
+  CORE_ADDR pc = get_frame_pc (this_frame);
+  if (arm_m_addr_is_lockup (pc))
+    {
+      warning (_("ARM M in lockup state, stack unwinding terminated."));
+      /* Terminate any further stack unwinding.  */
+      arm_cache_set_active_sp_value (cache, tdep, 0);
+      return cache;
+    }
+
    CORE_ADDR lr = get_frame_register_unsigned (this_frame, ARM_LR_REGNUM);

    /* ARMv7-M Architecture Reference "A2.3.1 Arm core registers"
@@ -3824,11 +3866,12 @@ struct frame_unwind arm_stub_unwind = {
    return arm_m_addr_is_magic (gdbarch, this_pc);
  }

-/* Frame unwinder for M-profile exceptions.  */
+/* Frame unwinder for M-profile exceptions, lockup
+   and secure/nonsecure interstate fnc calls.  */

  struct frame_unwind arm_m_exception_unwind =
  {
-  "arm m exception",
+  "arm m exception lockup sec_fnc",
    SIGTRAMP_FRAME,
    arm_m_exception_frame_unwind_stop_reason,
    arm_m_exception_this_id,
  

Comments

Torbjorn SVENSSON Oct. 16, 2022, 10:13 a.m. UTC | #1
Hi Tomas,

Thanks for sharing the patch.

On 2022-10-16 11:23, Tomas Vanek wrote:
> In the lockup state the PC value of the the outer frame is irreversibly
> lost. The other registers are intact so LR likely contains
> PC of some frame next to the outer one, but we cannot analyze
> the nearest outer frame without knowing its PC
> therefore we do not know SP fixup for this frame.
> 
> The frame unwinder possibly gets mad due to the wrong SP value.
> To prevent problems terminate unwinding if PC contains the magic
> value of the lockup state.
> 
> Example session wihtout this change,
> Cortex-M33 CPU in lockup, gdb 13.0.50.20221016-git:
> ----------------
>    (gdb) c
>    Continuing.
> 
>    Program received signal SIGINT, Interrupt.
>    0xeffffffe in ?? ()
>    (gdb) bt
>    #0  0xeffffffe in ?? ()
>    #1  0x0c000a9c in HardFault_Handler ()
>        at 
> C:/dvl/stm32l5trustzone/GPIO_IOToggle_TrustZone/Secure/Src/stm32l5xx_it.c:99
>    #2  0x2002ffd8 in ?? ()
>    Backtrace stopped: previous frame identical to this frame (corrupt 
> stack?)
>    (gdb)
> ----------------
> The frame #1 is at correct PC taken from LR, #2 is a total nonsense.

I think your assumption here is wrong as 0xeffffffe is a valid value for 
Cortex-M33 with TrustZone (FNC_RETURN with S=0). See more on the comment 
below.

> 
> With the change:
> ----------------
>    (gdb) c
>    Continuing.
> 
>    Program received signal SIGINT, Interrupt.
>    warning: ARM M in lockup state, stack unwinding terminated.
>    <signal handler called>
>    (gdb) bt
>    #0  <signal handler called>
>    (gdb)
> ----------------
> 
> There is a visible drawback of emitting a warning in a cache buildnig 
> routine
> as introduced in Torbjörn SVENSSON's
> [PATCH v4] gdb/arm: Stop unwinding on error, but do not assert
> The warning is printed just once and not repeated on each backtrace 
> command.
> 
> Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
> ---
>   gdb/arm-tdep.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++---
>   1 file changed, 46 insertions(+), 3 deletions(-)
> 
> diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
> index b5facae..5bd0336 100644
> --- a/gdb/arm-tdep.c
> +++ b/gdb/arm-tdep.c
> @@ -724,9 +724,30 @@ class target_arm_instruction_reader : public 
> arm_instruction_reader
>     return 0;
>   }
> 
> +static inline bool
> +arm_m_addr_is_lockup (CORE_ADDR addr)
> +{
> +  switch (addr)
> +    {
> +      /* Values for Lockup state.
> +     For more details see "B1.5.15 Unrecoverable exception cases" in
> +     both ARMv6-M and ARMv7-M Architecture Reference Manuals, or
> +     see "B4.32 Lockup" in ARMv8-M Architecture Reference Manual. */
> +      case 0xeffffffe:

Reading B3.33 (revision B.m) in the ARMv8-M documentation about lockup 
state, it's written that DHCSR.S_LOCKUP=1 AND PC=0xeffffffe. My 
interpretations that the PC value alone is not enough.

> +      case 0xfffffffe:
> +      case 0xffffffff:
> +    return 1;
> +
> +      default:
> +    /* Address is not magic.  */
> +    return 0;
> +    }
> +}
> +
>   /* Determine if the address specified equals any of these magic return
>      values, called EXC_RETURN, defined by the ARM v6-M, v7-M and v8-M
> -   architectures.
> +   architectures. Also include lockup magic PC value.
> +   Check also for FNC_RETURN if we have v8-M security extension.
> 
>      From ARMv6-M Reference Manual B1.5.8
>      Table B1-5 Exception return behavior
> @@ -769,6 +790,9 @@ class target_arm_instruction_reader : public 
> arm_instruction_reader
>   static int
>   arm_m_addr_is_magic (struct gdbarch *gdbarch, CORE_ADDR addr)
>   {
> +  if (arm_m_addr_is_lockup (addr))
> +    return 1;
> +
>     arm_gdbarch_tdep *tdep = gdbarch_tdep<arm_gdbarch_tdep> (gdbarch);
>     if (tdep->have_sec_ext)
>       {
> @@ -3355,6 +3379,24 @@ struct frame_unwind arm_stub_unwind = {
>        describes which bits in LR that define which stack was used prior
>        to the exception and if FPU is used (causing extended stack 
> frame).  */
> 
> +  /* In the lockup state PC contains a lockup magic value.
> +     The PC value of the the next outer frame is irreversibly
> +     lost. The other registers are intact so LR likely contains
> +     PC of some frame next to the outer one, but we cannot analyze
> +     the next outer frame without knowing its PC
> +     therefore we do not know SP fixup for this frame.
> +     Some heuristics to resynchronize SP might be possible (TODO?)
> +     For simplicity just terminate unwinding to prevent the unwinder
> +     going mad.  */
> +  CORE_ADDR pc = get_frame_pc (this_frame);
> +  if (arm_m_addr_is_lockup (pc))
> +    {
> +      warning (_("ARM M in lockup state, stack unwinding terminated."));
> +      /* Terminate any further stack unwinding.  */
> +      arm_cache_set_active_sp_value (cache, tdep, 0);
> +      return cache;
> +    }
> +
>     CORE_ADDR lr = get_frame_register_unsigned (this_frame, ARM_LR_REGNUM);
> 
>     /* ARMv7-M Architecture Reference "A2.3.1 Arm core registers"
> @@ -3824,11 +3866,12 @@ struct frame_unwind arm_stub_unwind = {
>     return arm_m_addr_is_magic (gdbarch, this_pc);
>   }
> 
> -/* Frame unwinder for M-profile exceptions.  */
> +/* Frame unwinder for M-profile exceptions, lockup
> +   and secure/nonsecure interstate fnc calls.  */
> 
>   struct frame_unwind arm_m_exception_unwind =
>   {
> -  "arm m exception",
> +  "arm m exception lockup sec_fnc",
>     SIGTRAMP_FRAME,
>     arm_m_exception_frame_unwind_stop_reason,
>     arm_m_exception_this_id,


Kind regards,
Torbjörn
  

Patch

diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
index b5facae..5bd0336 100644
--- a/gdb/arm-tdep.c
+++ b/gdb/arm-tdep.c
@@ -724,9 +724,30 @@  class target_arm_instruction_reader : public 
arm_instruction_reader
    return 0;
  }

+static inline bool
+arm_m_addr_is_lockup (CORE_ADDR addr)
+{
+  switch (addr)
+    {
+      /* Values for Lockup state.
+     For more details see "B1.5.15 Unrecoverable exception cases" in
+     both ARMv6-M and ARMv7-M Architecture Reference Manuals, or
+     see "B4.32 Lockup" in ARMv8-M Architecture Reference Manual. */
+      case 0xeffffffe:
+      case 0xfffffffe:
+      case 0xffffffff:
+    return 1;
+
+      default:
+    /* Address is not magic.  */
+    return 0;
+    }
+}
+
  /* Determine if the address specified equals any of these magic return
     values, called EXC_RETURN, defined by the ARM v6-M, v7-M and v8-M
-   architectures.
+   architectures. Also include lockup magic PC value.
+   Check also for FNC_RETURN if we have v8-M security extension.

     From ARMv6-M Reference Manual B1.5.8