aarch64: Fix an infinite loop on bt when the core dump has an SVE section but the target does not support it.

Message ID 20230319205529.75469-1-dmytro.medvied@gmail.com
State New
Headers
Series aarch64: Fix an infinite loop on bt when the core dump has an SVE section but the target does not support it. |

Commit Message

Дмитро Медвєдь March 19, 2023, 8:55 p.m. UTC
  When read_aarch64_ctx() returns AARCH64_SVE_MAGIC and the target does not support SVE gdb goes in an infinite loop because the section was not incremented by the size of the read context.
To fix this behavior, the section needs to be incremented by the size of the read context after the check for target supporting of SVE has returned false.
---
 gdb/aarch64-linux-tdep.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
  

Comments

Luis Machado March 27, 2023, 9:42 a.m. UTC | #1
Hi,

On 3/19/23 20:55, Dmytro Medvied via Gdb-patches wrote:
> When read_aarch64_ctx() returns AARCH64_SVE_MAGIC and the target does not support SVE gdb goes in an infinite loop because the section was not incremented by the size of the read context.
> To fix this behavior, the section needs to be incremented by the size of the read context after the check for target supporting of SVE has returned false.
> ---
>   gdb/aarch64-linux-tdep.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/gdb/aarch64-linux-tdep.c b/gdb/aarch64-linux-tdep.c
> index b183a3c9a38..439760fffe2 100644
> --- a/gdb/aarch64-linux-tdep.c
> +++ b/gdb/aarch64-linux-tdep.c
> @@ -337,7 +337,10 @@ aarch64_linux_sigframe_init (const struct tramp_frame *self,
>   	    uint16_t vq;
>   
>   	    if (!tdep->has_sve ())
> -	      break;
> +	      {
> +		section += size;
> +		break;
> +	      }
>   
>   	    if (target_read_memory (section + AARCH64_SVE_CONTEXT_VL_OFFSET,
>   				    buf, 2) != 0)

That's interesting. If the target doesn't support SVE, why is the SVE_MAGIC context being generated in the first place?

It should've been handled by the default case, where we increment the section by the size.

If gdb knows about SVE, it should be able to handle it, unless the target says the vector length is 0.

Could you please clarify what target you observed this on?
  
Дмитро Медвєдь March 29, 2023, 12:42 p.m. UTC | #2
Hi Luis,

On Mon, Mar 27, 2023 at 12:43 PM Luis Machado <luis.machado@arm.com> wrote:
> Hi,
>
> That's interesting. If the target doesn't support SVE, why is the SVE_MAGIC context being generated in the first place?
>
> It should've been handled by the default case, where we increment the section by the size.
>
> If gdb knows about SVE, it should be able to handle it, unless the target says the vector length is 0.
>
> Could you please clarify what target you observed this on?

The core dump was generated on the Amazon EC2 instance that is powered
by Arm-based AWS Graviton3 processor. It was not generated by gdb. To
make the core dump a modification of a third party tool
https://github.com/Percona-Lab/coredumper was used.  I assume that
this coredumper does not write all the info that is needed for gdb to
resolve the SVE section correctly.

I will prepare a minimal working example which reproduces the issue
with modified Percona-Lab coredumper and will provide the source and
the core dump, but I still think that infinite loop should be fixed in
gdb as well and maybe a warning should be added in this case.
  
Luis Machado March 29, 2023, 1:06 p.m. UTC | #3
Hi,

On 3/29/23 13:42, Дмитро Медвєдь wrote:
> Hi Luis,
> 
> On Mon, Mar 27, 2023 at 12:43 PM Luis Machado <luis.machado@arm.com> wrote:
>> Hi,
>>
>> That's interesting. If the target doesn't support SVE, why is the SVE_MAGIC context being generated in the first place?
>>
>> It should've been handled by the default case, where we increment the section by the size.
>>
>> If gdb knows about SVE, it should be able to handle it, unless the target says the vector length is 0.
>>
>> Could you please clarify what target you observed this on?
> 
> The core dump was generated on the Amazon EC2 instance that is powered
> by Arm-based AWS Graviton3 processor. It was not generated by gdb. To
> make the core dump a modification of a third party tool
> https://github.com/Percona-Lab/coredumper was used.  I assume that
> this coredumper does not write all the info that is needed for gdb to
> resolve the SVE section correctly.

Thanks for the additional information. That's very useful to give the bug
some context.

> 
> I will prepare a minimal working example which reproduces the issue
> with modified Percona-Lab coredumper and will provide the source and

I don't think that's going to be needed.

> the core dump, but I still think that infinite loop should be fixed in
> gdb as well and maybe a warning should be added in this case.

I agree. Even if it is a case of inconsistent state/data, gdb should still be able
to cope with it gracefully. A warning would be appropriate, stating SVE state has
been found, but it is being skipped because the target hasn't reported SVE support.

Do you have a copyright assignment with the FSF so we can take your contribution? The patch
may be trivial, but I thought I'd check anyway.
  
Luis Machado June 8, 2023, 1:47 p.m. UTC | #4
On 6/8/23 14:28, Dmytro Medvied wrote:
> Hi Luis,
> 
> On Wed, Mar 29, 2023 at 4:07 PM Luis Machado <luis.machado@arm.com> wrote:
>>
>> Do you have a copyright assignment with the FSF so we can take your contribution? The patch
>> may be trivial, but I thought I'd check anyway.
> 
> Yes, finally I have a signed copyright assignment with the FSF. I've
> attached it to this email.
> 
> Could you guide me what I should do next?

Great! Could you please re-send the patch to this list (refreshed to the latest gdb head), also mentioning the setup you ran into
issues with (SVE hardware, then modifications to the core file etc)? Just so we can record what the issue was.

I should be able to apply your patch then.

Thanks!
  

Patch

diff --git a/gdb/aarch64-linux-tdep.c b/gdb/aarch64-linux-tdep.c
index b183a3c9a38..439760fffe2 100644
--- a/gdb/aarch64-linux-tdep.c
+++ b/gdb/aarch64-linux-tdep.c
@@ -337,7 +337,10 @@  aarch64_linux_sigframe_init (const struct tramp_frame *self,
 	    uint16_t vq;
 
 	    if (!tdep->has_sve ())
-	      break;
+	      {
+		section += size;
+		break;
+	      }
 
 	    if (target_read_memory (section + AARCH64_SVE_CONTEXT_VL_OFFSET,
 				    buf, 2) != 0)