[v5,12/16,gdb/generic] corefile/bug: Use thread-specific gdbarch when dumping register state to core files

Message ID 20230907152018.1031257-13-luis.machado@arm.com
State New
Series SME support for AArch64 gdb/gdbserver on Linux

Checks

Context                                         Check    Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64  success  Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm      success  Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64  success  Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm      success  Testing passed

Commit Message

Luis Machado Sept. 7, 2023, 3:20 p.m. UTC
When we have a core file generated by gdb (via the gcore command), gdb dumps
the target description to a note.  During loading of that core file, gdb will
first try to load that saved target description.

This works fine for almost all architectures. But AArch64 has a few
dynamically-generated target descriptions/gdbarch depending on the vector
length that was in use at the time the core file was generated.

The target description gdb dumps to the core file note is the one generated
at the time of attachment/startup.  If, for example, the SVE vector length
changed during execution, this would not be reflected in the core file, as
gdb would still dump the initial target description.

Another issue is that the inferior's gdbarch potentially doesn't match the
thread's real gdbarch, and so things like the register cache may have
different formats and sizes.

To address this, fetch the thread's architecture before dumping its register
state.  That way we will always use the correct target description/gdbarch.
---
 gdb/linux-tdep.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)
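
To illustrate the problem described above, here is a minimal, self-contained
sketch.  It is not gdb code: toy_gdbarch, thread_architecture and note_sizes
are hypothetical stand-ins, and the only property modelled is the SVE vector
length.  It shows how sizing each thread's register dump from that thread's
own architecture, rather than from the inferior-wide one, gives different
results once threads use different vector lengths.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Toy stand-in for gdb's struct gdbarch (an assumption for illustration):
// only the SVE vector length matters here.
struct toy_gdbarch
{
  std::size_t vl_bytes;   // SVE vector length in bytes
};

// Total size of the 32 SVE Z registers under a given architecture.
std::size_t
z_regset_size (const toy_gdbarch &arch)
{
  return 32 * arch.vl_bytes;
}

// Hypothetical counterpart of a per-thread architecture lookup: return the
// architecture recorded for TID, falling back to the inferior-wide one.
const toy_gdbarch &
thread_architecture (const std::map<int, toy_gdbarch> &per_thread,
                     const toy_gdbarch &inferior_arch, int tid)
{
  auto it = per_thread.find (tid);
  return it != per_thread.end () ? it->second : inferior_arch;
}

// Compute the Z register dump size for each thread, using the thread's own
// architecture instead of assuming the inferior-wide one for all of them.
std::vector<std::size_t>
note_sizes (const std::map<int, toy_gdbarch> &per_thread,
            const toy_gdbarch &inferior_arch, const std::vector<int> &tids)
{
  std::vector<std::size_t> sizes;
  for (int tid : tids)
    {
      const toy_gdbarch &arch
        = thread_architecture (per_thread, inferior_arch, tid);
      sizes.push_back (z_regset_size (arch));
    }
  return sizes;
}
```

With an inferior-wide vector length of 16 bytes and one thread that switched
to 32 bytes, the two threads need 512 and 1024 bytes respectively; using the
inferior-wide value for both would undersize the second dump.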
  

Comments

Luis Machado Sept. 8, 2023, 11:09 a.m. UTC | #1
Could a global maintainer please go through this change and let me know if it is OK? It touches a generic part of gdb, though I don't think it should change the behavior of non-AArch64 targets.

  
Simon Marchi Sept. 8, 2023, 3:58 p.m. UTC | #2
Makes sense to me:

Approved-By: Simon Marchi <simon.marchi@efficios.com>

I think the linux_corefile_thread_data structure is not useful nowadays.
It was probably used through some callback's void pointer before.  But
now linux_corefile_thread could be changed to accept individual
arguments instead, which would make things simpler.  Would you mind doing
this change as a cleanup on top of this series?  Or you can do it before
if you prefer.
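
A rough sketch of the shape of that cleanup, using toy stand-in types rather
than gdb's real ones (toy_gdbarch, toy_thread, toy_thread_data and both
function names are hypothetical, and the real structure carries different
fields):

```cpp
#include <string>
#include <vector>

// Toy stand-ins; none of these are gdb's real types.
struct toy_gdbarch { std::string name; };
struct toy_thread { int tid; toy_gdbarch arch; };

// Before: arguments bundled in a struct, which the caller has to update
// for every thread before each call.
struct toy_thread_data
{
  const toy_gdbarch *gdbarch;
  std::vector<std::string> *notes;
  int stop_signal;
};

void
corefile_thread_with_struct (const toy_thread &thr, toy_thread_data *args)
{
  args->notes->push_back (args->gdbarch->name + ":"
                          + std::to_string (thr.tid) + ":"
                          + std::to_string (args->stop_signal));
}

// After: the same work with individual arguments, so the per-thread
// architecture is passed explicitly at each call site.
void
corefile_thread (const toy_thread &thr, const toy_gdbarch &gdbarch,
                 std::vector<std::string> &notes, int stop_signal)
{
  notes.push_back (gdbarch.name + ":" + std::to_string (thr.tid) + ":"
                   + std::to_string (stop_signal));
}
```

Both variants record the same note; the second simply removes the need to
mutate shared state between calls.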

Please remind me, does an AArch64 core file contain one target
description per thread, to account for the fact that different threads
could have different register layouts?  Or right now we just hope that
all threads use the same target description (which might be different
from what the inferior started with)?

Simon
  
Simon Marchi Sept. 8, 2023, 4:02 p.m. UTC | #3
> Please remind me, does an AArch64 core file contain one target
> description per thread, to account for the fact that different threads
> could have different register layouts?  Or right now we just hope that
> all threads use the same target description (which might be different
> from what the inferior started with)?

I think that the commit message on the following patch answers my
question: there isn't a full target desc for each thread, but using the
process-wide target desc plus some bits read from the SVE (and
eventually SME) state, you can derive one target desc per thread.  Does
that sound right?
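
As a hedged illustration of that derivation, the sketch below computes the
SVE-dependent part of a register layout from a single per-thread value, the
vector length in bytes.  The names toy_sve_layout and layout_from_vl are
hypothetical, and where the vector length is read from (for example, the
thread's SVE register note) is an assumption, not something this thread
confirms.

```cpp
#include <cstddef>
#include <stdexcept>

// Toy description of the SVE-dependent part of a register layout.
struct toy_sve_layout
{
  std::size_t vq;           // vector length in 128-bit quadwords
  std::size_t z_reg_bytes;  // size of each Z (vector) register
  std::size_t p_reg_bytes;  // size of each P (predicate) register
};

// Derive the layout from a thread's vector length in bytes.
toy_sve_layout
layout_from_vl (std::size_t vl_bytes)
{
  // SVE vector lengths are multiples of 128 bits (16 bytes), up to
  // 2048 bits (256 bytes).
  if (vl_bytes == 0 || vl_bytes % 16 != 0 || vl_bytes > 256)
    throw std::invalid_argument ("invalid SVE vector length");

  // Each Z register is one vector long; each predicate register holds one
  // bit per vector byte, so it is one eighth of that size.
  return toy_sve_layout { vl_bytes / 16, vl_bytes, vl_bytes / 8 };
}
```

Two threads with vector lengths of 16 and 32 bytes would therefore end up
with different layouts even though they share one process-wide description.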

Simon
  
Luis Machado Sept. 8, 2023, 4:05 p.m. UTC | #4
On 9/8/23 16:58, Simon Marchi wrote:
> 
> Makes sense to me:
> 
> Approved-By: Simon Marchi <simon.marchi@efficios.com>
> 
> I think the linux_corefile_thread_data structure is not useful nowadays.
> It was probably used through some callback's void pointer before.  But
> now linux_corefile_thread could be changed to accept individual
> arguments instead, it would make things simpler.  Would you mind doing
> this change as a cleanup on top of this series?  Or you can do it before
> if you prefer.
> 
> Please remind me, does an AArch64 core file contain one target
> description per thread, to account for the fact that different threads
> could have different register layouts?  Or right now we just hope that
> all threads use the same target description (which might be different
> from what the inferior started with)?

Right now the answer is no, so this is somewhat broken.  We just have the
one gdb XML note in the core file instead of per-thread notes.

We mostly get away with it because in practice it is uncommon for threads
to use different vector lengths.

So we either output per-thread gdb XML notes or rely on each thread's
register notes to figure out what the proper target description is for
that thread.

> 
> Simon
  

Patch

diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index b5eee5e108c..7d0976932c6 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -2099,12 +2099,28 @@  linux_make_corefile_notes (struct gdbarch *gdbarch, bfd *obfd, int *note_size)
 					  stop_signal);
 
   if (signalled_thr != nullptr)
-    linux_corefile_thread (signalled_thr, &thread_args);
+    {
+      /* On some architectures, like AArch64, each thread can have a distinct
+	 gdbarch (due to scalable extensions), and using the inferior gdbarch
+	 is incorrect.
+
+	 Fetch each thread's gdbarch and pass it down to the lower layers so
+	 we can dump the right set of registers.  */
+      thread_args.gdbarch = target_thread_architecture (signalled_thr->ptid);
+      linux_corefile_thread (signalled_thr, &thread_args);
+    }
   for (thread_info *thr : current_inferior ()->non_exited_threads ())
     {
       if (thr == signalled_thr)
 	continue;
 
+      /* On some architectures, like AArch64, each thread can have a distinct
+	 gdbarch (due to scalable extensions), and using the inferior gdbarch
+	 is incorrect.
+
+	 Fetch each thread's gdbarch and pass it down to the lower layers so
+	 we can dump the right set of registers.  */
+      thread_args.gdbarch = target_thread_architecture (thr->ptid);
       linux_corefile_thread (thr, &thread_args);
     }