[v2] gdb/arm: Stop unwinding on error, but do not assert

Message ID 20221013091740.645783-1-torbjorn.svensson@foss.st.com
State Superseded
Headers
Series [v2] gdb/arm: Stop unwinding on error, but do not assert |

Commit Message

Torbjorn SVENSSON Oct. 13, 2022, 9:17 a.m. UTC
  When it's impossible to read the FPCCR and XPSR, the unwinding is
unpredictable as the it's not possible to determine the correct
frame size or padding.
The only sane thing to do in this condition is to stop the unwinding.

Without this patch, gdb would assert if this errornous state was
detected.

Signed-off-by: Torbj?rn SVENSSON  <torbjorn.svensson@foss.st.com>
---
 gdb/arm-tdep.c | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)
  

Comments

Luis Machado Oct. 13, 2022, 9:46 a.m. UTC | #1
Hi,

On 10/13/22 10:17, Torbj?rn SVENSSON wrote:
> When it's impossible to read the FPCCR and XPSR, the unwinding is
> unpredictable as the it's not possible to determine the correct
> frame size or padding.
> The only sane thing to do in this condition is to stop the unwinding.
> 
> Without this patch, gdb would assert if this errornous state was
> detected.
> 

Could you please attach an example of the change in the commit message? What it does before
and after the change?

> Signed-off-by: Torbj?rn SVENSSON  <torbjorn.svensson@foss.st.com>
> ---
>   gdb/arm-tdep.c | 35 +++++++++++++++++++++++++++++------
>   1 file changed, 29 insertions(+), 6 deletions(-)
> 
> diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
> index 041e6afefed..afcbce478c2 100644
> --- a/gdb/arm-tdep.c
> +++ b/gdb/arm-tdep.c
> @@ -3591,9 +3591,13 @@ arm_m_exception_cache (frame_info_ptr this_frame)
>   	  ULONGEST fpcar;
>   
>   	  /* Read FPCCR register.  */
> -	  gdb_assert (safe_read_memory_unsigned_integer (FPCCR,
> -							 ARM_INT_REGISTER_SIZE,
> -							 byte_order, &fpccr));
> +	 if (!safe_read_memory_unsigned_integer (FPCCR, ARM_INT_REGISTER_SIZE,
> +						 byte_order, &fpccr))
> +	   {
> +	     warning (_("Could not fetch required FPCCR content.  Further "
> +			"unwind is impossible."));
> +	     return NULL;


NULL -> nullptr everywhere.

> +	   }
>   
>   	  /* Read FPCAR register.  */
>   	  if (!safe_read_memory_unsigned_integer (FPCAR, ARM_INT_REGISTER_SIZE,
> @@ -3669,9 +3673,15 @@ arm_m_exception_cache (frame_info_ptr this_frame)
>   	 aligner between the top of the 32-byte stack frame and the
>   	 previous context's stack pointer.  */
>         ULONGEST xpsr;
> -      gdb_assert (safe_read_memory_unsigned_integer (cache->saved_regs[
> -						     ARM_PS_REGNUM].addr (), 4,
> -						     byte_order, &xpsr));
> +      if (!safe_read_memory_unsigned_integer (cache->saved_regs[ARM_PS_REGNUM]
> +					      .addr (), ARM_INT_REGISTER_SIZE,
> +					      byte_order, &xpsr))
> +	{
> +	  warning (_("Could not fetch required XPSR content.  Further unwind "
> +		     "is impossible."));
> +	  return NULL;
> +	}
> +
>         if (bit (xpsr, 9) != 0)
>   	{
>   	  CORE_ADDR new_sp = arm_cache_get_prev_sp_value (cache, tdep) + 4;
> @@ -3703,6 +3713,14 @@ arm_m_exception_this_id (frame_info_ptr this_frame,
>       *this_cache = arm_m_exception_cache (this_frame);
>     cache = (struct arm_prologue_cache *) *this_cache;
>   
> +  /* Unwind of this frame is not possible.  Return outer_frame_id to stop the
> +     unwinding.  */
> +  if (cache == NULL)
> +    {
> +      *this_id = outer_frame_id;
> +      return;
> +    }
> +
>     /* Our frame ID for a stub frame is the current SP and LR.  */
>     arm_gdbarch_tdep *tdep
>       = gdbarch_tdep<arm_gdbarch_tdep> (get_frame_arch (this_frame));
> @@ -3725,6 +3743,11 @@ arm_m_exception_prev_register (frame_info_ptr this_frame,
>       *this_cache = arm_m_exception_cache (this_frame);
>     cache = (struct arm_prologue_cache *) *this_cache;
>   
> +  /* It's not allowed to call prev_register when this_id has returned the
> +     outer_frame_id.  The arm_m_exception_cache function will return NULL when
> +     the frame cannot be properly unwinded.  */
> +  gdb_assert (cache != NULL);
> +

It does seem safe to assume this function won't be called if there is no frame_id. So I agree this would be
a GDB bug and needs an assertion here.

>     /* The value was already reconstructed into PREV_SP.  */
>     arm_gdbarch_tdep *tdep
>       = gdbarch_tdep<arm_gdbarch_tdep> (get_frame_arch (this_frame));
  
Pedro Alves Oct. 13, 2022, 11:21 a.m. UTC | #2
On 2022-10-13 10:17 a.m., Torbj?rn SVENSSON via Gdb-patches wrote:

> +  /* Unwind of this frame is not possible.  Return outer_frame_id to stop the
> +     unwinding.  */
> +  if (cache == NULL)
> +    {
> +      *this_id = outer_frame_id;
> +      return;
> +    }

Please let's not add more uses of outer_frame_id if we can avoid it.  We're getting
close to eliminating it.  Can a cache object still be returned, and then a frame id
be successfully computed?  

You can stop the unwinding in some other way.  For example, arm_m_exception_cache has a few
of these:

	      /* Terminate any further stack unwinding by referring to self.  */
	      arm_cache_set_active_sp_value (cache, tdep, sp);
	      return cache;

(Not sure exactly how that works.)

Alternatively, you can implement a frame_unwind::stop_reason callback and return
UNWIND_OUTERMOST, which is already done in arm-tdep.c in other scenarios too.
  
Torbjorn SVENSSON Oct. 13, 2022, 12:24 p.m. UTC | #3
Hi Pedro,

On 2022-10-13 13:21, Pedro Alves wrote:
> On 2022-10-13 10:17 a.m., Torbj?rn SVENSSON via Gdb-patches wrote:
> 
>> +  /* Unwind of this frame is not possible.  Return outer_frame_id to stop the
>> +     unwinding.  */
>> +  if (cache == NULL)
>> +    {
>> +      *this_id = outer_frame_id;
>> +      return;
>> +    }
> 
> Please let's not add more uses of outer_frame_id if we can avoid it.  We're getting
> close to eliminating it.  Can a cache object still be returned, and then a frame id
> be successfully computed?

The problem is that it's not always possible to know what registers that 
was written on the stack or if there was padding between 2 frames on the 
stack.
If a cache object is returned, wouldn't that imply that the content of 
this frame is supposed to be valid?

> You can stop the unwinding in some other way.  For example, arm_m_exception_cache has a few
> of these:
> 
> 	      /* Terminate any further stack unwinding by referring to self.  */
> 	      arm_cache_set_active_sp_value (cache, tdep, sp);
> 	      return cache;
> 
> (Not sure exactly how that works.)

I'm behind the some of those statements in arm-tdep.c, but the construct 
was copied from some other place in the GDB sources. I think there is 
some code in GDB that checks if 2 frames have the same SP value and in 
that case, stops the unwinding.

> Alternatively, you can implement a frame_unwind::stop_reason callback and return
> UNWIND_OUTERMOST, which is already done in arm-tdep.c in other scenarios too.

Is it guaranteed that the prev_register method won't be called for a 
cache object that have the UNWIND_OUTERMOST stop reason? If so, I 
suppose the struct arm_prologue_cache could be extended with another 
member that indicates if the frame was successfully unwinded or if there 
were some problem and the UNWIND_OUTERMOST should be returned. Would 
this be okay?

Kind regards,
Torbj?rn
  
Luis Machado Oct. 13, 2022, 1:11 p.m. UTC | #4
On 10/13/22 12:21, Pedro Alves wrote:
> On 2022-10-13 10:17 a.m., Torbj?rn SVENSSON via Gdb-patches wrote:
> 
>> +  /* Unwind of this frame is not possible.  Return outer_frame_id to stop the
>> +     unwinding.  */
>> +  if (cache == NULL)
>> +    {
>> +      *this_id = outer_frame_id;
>> +      return;
>> +    }
> 
> Please let's not add more uses of outer_frame_id if we can avoid it.  We're getting
> close to eliminating it.  Can a cache object still be returned, and then a frame id
> be successfully computed?

Sorry, is that deprecation of outer_frame_id documented somewhere? I haven't seen any warnings or
comments stating it is not supposed to be used.

It was even made more explicit with this commit:

commit 84154d166a1a4592c70e2a8296d5df0ad7f89be9
Author: Simon Marchi <simon.marchi@efficios.com>
Date:   Mon Aug 31 13:23:12 2020 -0400
List-Id: gdb-patches mailing list <gdb-patches.sourceware.org>

     gdb: introduce explicit outer frame id kind


So this is a bit confusing.

> 
> You can stop the unwinding in some other way.  For example, arm_m_exception_cache has a few
> of these:
> 
> 	      /* Terminate any further stack unwinding by referring to self.  */
> 	      arm_cache_set_active_sp_value (cache, tdep, sp);
> 	      return cache;
> 
> (Not sure exactly how that works.)

It probably works by creating a cycle that will be detected by the unwinding machinery.

> 
> Alternatively, you can implement a frame_unwind::stop_reason callback and return
> UNWIND_OUTERMOST, which is already done in arm-tdep.c in other scenarios too.

Maybe we should clarify these uses. It is nice to be able to stop unwinding gracefully, as opposed
to issuing a warning like "previous frame inner to this frame (corrupt stack?)".

It seems UNWIND_OUTERMOST might be able to do that.
  
Pedro Alves Oct. 13, 2022, 1:25 p.m. UTC | #5
On 2022-10-13 1:24 p.m., Torbjorn SVENSSON wrote:
> Hi Pedro,
> 
> On 2022-10-13 13:21, Pedro Alves wrote:
>> On 2022-10-13 10:17 a.m., Torbj?rn SVENSSON via Gdb-patches wrote:
>>
>>> +? /* Unwind of this frame is not possible.? Return outer_frame_id to stop the
>>> +???? unwinding.? */
>>> +? if (cache == NULL)
>>> +??? {
>>> +????? *this_id = outer_frame_id;
>>> +????? return;
>>> +??? }
>>
>> Please let's not add more uses of outer_frame_id if we can avoid it.? We're getting
>> close to eliminating it.? Can a cache object still be returned, and then a frame id
>> be successfully computed?
> 
> The problem is that it's not always possible to know what registers that was written on the stack or if there was padding between 2 frames on the stack.
> If a cache object is returned, wouldn't that imply that the content of this frame is supposed to be valid?

But that info is useful to unwind the prev frame from this_frame.  The contents of registers for this_frame are
unwound from this_frame->next.  The info needed to compute a frame id should be available.

On 2022-10-13 1:24 p.m., Torbjorn SVENSSON wrote:

>> Alternatively, you can implement a frame_unwind::stop_reason callback and return
>> UNWIND_OUTERMOST, which is already done in arm-tdep.c in other scenarios too.
> 
> Is it guaranteed that the prev_register method won't be called for a cache object that have the UNWIND_OUTERMOST stop reason?

That's how outer_frame_id manages to stop unwinding, because of the default implementation, here:

 /* The default frame unwinder stop_reason callback.  */
 
 enum unwind_stop_reason
 default_frame_unwind_stop_reason (frame_info_ptr this_frame,
 				  void **this_cache)
 {
   struct frame_id this_id = get_frame_id (this_frame);
 
   if (this_id == outer_frame_id)
     return UNWIND_OUTERMOST;
   else
     return UNWIND_NO_REASON;
 }

You hit this in frame.c:

  /* Check that this frame is unwindable.  If it isn't, don't try to
     unwind to the prev frame.  */
  this_frame->stop_reason
    = this_frame->unwind->stop_reason (this_frame,
				       &this_frame->prologue_cache);

  if (this_frame->stop_reason != UNWIND_NO_REASON)
    {
      frame_debug_printf
	("  -> nullptr // %s",
	 frame_stop_reason_symbol_string (this_frame->stop_reason));
      return NULL;
    }

> If so, I suppose the struct arm_prologue_cache could be extended with another member that indicates if the frame was
> successfully unwinded or if there were some problem and the UNWIND_OUTERMOST should be returned. Would this be okay?

Yes.
  
Pedro Alves Oct. 13, 2022, 1:41 p.m. UTC | #6
On 2022-10-13 2:11 p.m., Luis Machado wrote:
> On 10/13/22 12:21, Pedro Alves wrote:
>> On 2022-10-13 10:17 a.m., Torbj?rn SVENSSON via Gdb-patches wrote:
>>
>>> +? /* Unwind of this frame is not possible.? Return outer_frame_id to stop the
>>> +???? unwinding.? */
>>> +? if (cache == NULL)
>>> +??? {
>>> +????? *this_id = outer_frame_id;
>>> +????? return;
>>> +??? }
>>
>> Please let's not add more uses of outer_frame_id if we can avoid it.? We're getting
>> close to eliminating it.? Can a cache object still be returned, and then a frame id
>> be successfully computed?
> 
> Sorry, is that deprecation of outer_frame_id documented somewhere? I haven't seen any warnings or
> comments stating it is not supposed to be used.

It's not strictly deprecated, but it's better to avoid abusing it for cases it isn't meant for,
and fill in the frame id with the best info possible.

The frame in question is not an outer frame, it has a frame address.  We just happened
to fail to unwind from it.

> 
> It was even made more explicit with this commit:
> 
> commit 84154d166a1a4592c70e2a8296d5df0ad7f89be9
> Author: Simon Marchi <simon.marchi@efficios.com>
> Date:?? Mon Aug 31 13:23:12 2020 -0400
> 
> ??? gdb: introduce explicit outer frame id kind
> 

That actually removed explicit uses of outer_frame_id.  The explicit FID_STACK_OUTER
id kind is only used for the debug prints, as mentioned in the commit log.

The following patch removed even more explicit checks for outer_frame_id:

commit 22b9b4b05bfad806eb3d16535dcd4dbedb028c40
Author:     Scott Linder <scott@scottlinder.com>
AuthorDate: Mon Aug 31 13:24:20 2020 -0400
Commit:     Simon Marchi <simon.marchi@efficios.com>
CommitDate: Mon Aug 31 13:25:05 2020 -0400

    gdb: support frames inlined into the outer frame

> 
> So this is a bit confusing.
>
  

Patch

diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
index 041e6afefed..afcbce478c2 100644
--- a/gdb/arm-tdep.c
+++ b/gdb/arm-tdep.c
@@ -3591,9 +3591,13 @@  arm_m_exception_cache (frame_info_ptr this_frame)
 	  ULONGEST fpcar;
 
 	  /* Read FPCCR register.  */
-	  gdb_assert (safe_read_memory_unsigned_integer (FPCCR,
-							 ARM_INT_REGISTER_SIZE,
-							 byte_order, &fpccr));
+	 if (!safe_read_memory_unsigned_integer (FPCCR, ARM_INT_REGISTER_SIZE,
+						 byte_order, &fpccr))
+	   {
+	     warning (_("Could not fetch required FPCCR content.  Further "
+			"unwind is impossible."));
+	     return NULL;
+	   }
 
 	  /* Read FPCAR register.  */
 	  if (!safe_read_memory_unsigned_integer (FPCAR, ARM_INT_REGISTER_SIZE,
@@ -3669,9 +3673,15 @@  arm_m_exception_cache (frame_info_ptr this_frame)
 	 aligner between the top of the 32-byte stack frame and the
 	 previous context's stack pointer.  */
       ULONGEST xpsr;
-      gdb_assert (safe_read_memory_unsigned_integer (cache->saved_regs[
-						     ARM_PS_REGNUM].addr (), 4,
-						     byte_order, &xpsr));
+      if (!safe_read_memory_unsigned_integer (cache->saved_regs[ARM_PS_REGNUM]
+					      .addr (), ARM_INT_REGISTER_SIZE,
+					      byte_order, &xpsr))
+	{
+	  warning (_("Could not fetch required XPSR content.  Further unwind "
+		     "is impossible."));
+	  return NULL;
+	}
+
       if (bit (xpsr, 9) != 0)
 	{
 	  CORE_ADDR new_sp = arm_cache_get_prev_sp_value (cache, tdep) + 4;
@@ -3703,6 +3713,14 @@  arm_m_exception_this_id (frame_info_ptr this_frame,
     *this_cache = arm_m_exception_cache (this_frame);
   cache = (struct arm_prologue_cache *) *this_cache;
 
+  /* Unwind of this frame is not possible.  Return outer_frame_id to stop the
+     unwinding.  */
+  if (cache == NULL)
+    {
+      *this_id = outer_frame_id;
+      return;
+    }
+
   /* Our frame ID for a stub frame is the current SP and LR.  */
   arm_gdbarch_tdep *tdep
     = gdbarch_tdep<arm_gdbarch_tdep> (get_frame_arch (this_frame));
@@ -3725,6 +3743,11 @@  arm_m_exception_prev_register (frame_info_ptr this_frame,
     *this_cache = arm_m_exception_cache (this_frame);
   cache = (struct arm_prologue_cache *) *this_cache;
 
+  /* It's not allowed to call prev_register when this_id has returned the
+     outer_frame_id.  The arm_m_exception_cache function will return NULL when
+     the frame cannot be properly unwinded.  */
+  gdb_assert (cache != NULL);
+
   /* The value was already reconstructed into PREV_SP.  */
   arm_gdbarch_tdep *tdep
     = gdbarch_tdep<arm_gdbarch_tdep> (get_frame_arch (this_frame));