[aarch64] Fix gdbserver crashes on SVE/SME-enabled systems
Checks
Context |
Check |
Description |
linaro-tcwg-bot/tcwg_gdb_build--master-arm |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_gdb_check--master-arm |
success
|
Test passed
|
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 |
success
|
Test passed
|
Commit Message
Commit 51e6b8cfd649013ae16a3d00f1451b2531ba6bc9 fixed a
regression for SVE/SME registers on gdb's side by using a <= comparison for
regcache's raw_compare assertion check. We seem to have failed to do the same
for gdbserver's raw_compare counterpart.
With the code as it is, I'm seeing a lot of crashes for gdbserver on a machine
with SVE enabled. For instance, with the following invocation:
make check-gdb RUNTESTFLAGS="--target_board=native-gdbserver" TESTS=gdb.base/break.exp
Running /work/builds/binutils-gdb/gdb/testsuite/../../../../repos/binutils-gdb/gdb/testsuite/gdb.base/break.exp ...
FAIL: gdb.base/break.exp: test_break: run until function breakpoint
FAIL: gdb.base/break.exp: test_break: run until breakpoint set at a line number (the program is no longer running)
FAIL: gdb.base/break.exp: test_break: run until file:function(6) breakpoint (the program is no longer running)
FAIL: gdb.base/break.exp: test_break: run until file:function(5) breakpoint (the program is no longer running)
FAIL: gdb.base/break.exp: test_break: run until file:function(4) breakpoint (the program is no longer running)
FAIL: gdb.base/break.exp: test_break: run until file:function(3) breakpoint (the program is no longer running)
FAIL: gdb.base/break.exp: test_break: run until file:function(2) breakpoint (the program is no longer running)
FAIL: gdb.base/break.exp: test_break: run until file:function(1) breakpoint (the program is no longer running)
FAIL: gdb.base/break.exp: test_break: run until quoted breakpoint (the program is no longer running)
FAIL: gdb.base/break.exp: test_break: run until file:linenum breakpoint (the program is no longer running)
FAIL: gdb.base/break.exp: test_break: breakpoint offset +1
FAIL: gdb.base/break.exp: test_break: step onto breakpoint (the program is no longer running)
FAIL: gdb.base/break.exp: test_break: setting breakpoint at }
FAIL: gdb.base/break.exp: test_break: continue to breakpoint at } (the program is no longer running)
FAIL: gdb.base/break.exp: test_no_break_on_catchpoint: runto: run to main
FAIL: gdb.base/break.exp: test_break_nonexistent_line: runto: run to main
FAIL: gdb.base/break.exp: test_break_default: runto: run to main
FAIL: gdb.base/break.exp: test_break_silent_and_more: runto: run to main
FAIL: gdb.base/break.exp: test_break_line_convenience_var: runto: run to main
FAIL: gdb.base/break.exp: test_break_user_call: runto: run to main
FAIL: gdb.base/break.exp: test_finish_arguments: runto: run to main
FAIL: gdb.base/break.exp: test_next_with_recursion: kill program
FAIL: gdb.base/break.exp: test_next_with_recursion: run to factorial(6)
FAIL: gdb.base/break.exp: test_next_with_recursion: continue to factorial(5) (the program is no longer running)
FAIL: gdb.base/break.exp: test_next_with_recursion: backtrace from factorial(5)
FAIL: gdb.base/break.exp: test_next_with_recursion: next to recursive call (the program is no longer running)
FAIL: gdb.base/break.exp: test_next_with_recursion: next over recursive call (the program is no longer running)
FAIL: gdb.base/break.exp: test_next_with_recursion: backtrace from factorial(5.1)
FAIL: gdb.base/break.exp: test_next_with_recursion: continue until exit at recursive next test (the program is no longer running)
FAIL: gdb.base/break.exp: test_break_optimized_prologue: run until function breakpoint, optimized file
FAIL: gdb.base/break.exp: test_break_optimized_prologue: run until breakpoint set at small function, optimized file (the program is no longer running)
FAIL: gdb.base/break.exp: test_rbreak_shlib: rbreak junk
Adjusting the regcache raw_compare assertion check to use <= fixes
the problem on aarch64-linux on a SVE-capable system.
I plan to backport this fix to GDB 16 as well, if that's OK.
---
gdbserver/regcache.cc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
On 2/28/25 6:05 AM, Luis Machado wrote:
> Commit 51e6b8cfd649013ae16a3d00f1451b2531ba6bc9 fixed a
> regression for SVE/SME registers on gdb's side by using a <= comparison for
> regcache's raw_compare assertion check. We seem to have failed to do the same
> for gdbserver's raw_compare counterpart.
>
> With the code as it is, I'm seeing a lot of crashes for gdbserver on a machine
> with SVE enabled. For instance, with the following invocation:
>
> make check-gdb RUNTESTFLAGS="--target_board=native-gdbserver" TESTS=gdb.base/break.exp
>
> Running /work/builds/binutils-gdb/gdb/testsuite/../../../../repos/binutils-gdb/gdb/testsuite/gdb.base/break.exp ...
> FAIL: gdb.base/break.exp: test_break: run until function breakpoint
> FAIL: gdb.base/break.exp: test_break: run until breakpoint set at a line number (the program is no longer running)
> FAIL: gdb.base/break.exp: test_break: run until file:function(6) breakpoint (the program is no longer running)
> FAIL: gdb.base/break.exp: test_break: run until file:function(5) breakpoint (the program is no longer running)
> FAIL: gdb.base/break.exp: test_break: run until file:function(4) breakpoint (the program is no longer running)
> FAIL: gdb.base/break.exp: test_break: run until file:function(3) breakpoint (the program is no longer running)
> FAIL: gdb.base/break.exp: test_break: run until file:function(2) breakpoint (the program is no longer running)
> FAIL: gdb.base/break.exp: test_break: run until file:function(1) breakpoint (the program is no longer running)
> FAIL: gdb.base/break.exp: test_break: run until quoted breakpoint (the program is no longer running)
> FAIL: gdb.base/break.exp: test_break: run until file:linenum breakpoint (the program is no longer running)
> FAIL: gdb.base/break.exp: test_break: breakpoint offset +1
> FAIL: gdb.base/break.exp: test_break: step onto breakpoint (the program is no longer running)
> FAIL: gdb.base/break.exp: test_break: setting breakpoint at }
> FAIL: gdb.base/break.exp: test_break: continue to breakpoint at } (the program is no longer running)
> FAIL: gdb.base/break.exp: test_no_break_on_catchpoint: runto: run to main
> FAIL: gdb.base/break.exp: test_break_nonexistent_line: runto: run to main
> FAIL: gdb.base/break.exp: test_break_default: runto: run to main
> FAIL: gdb.base/break.exp: test_break_silent_and_more: runto: run to main
> FAIL: gdb.base/break.exp: test_break_line_convenience_var: runto: run to main
> FAIL: gdb.base/break.exp: test_break_user_call: runto: run to main
> FAIL: gdb.base/break.exp: test_finish_arguments: runto: run to main
> FAIL: gdb.base/break.exp: test_next_with_recursion: kill program
> FAIL: gdb.base/break.exp: test_next_with_recursion: run to factorial(6)
> FAIL: gdb.base/break.exp: test_next_with_recursion: continue to factorial(5) (the program is no longer running)
> FAIL: gdb.base/break.exp: test_next_with_recursion: backtrace from factorial(5)
> FAIL: gdb.base/break.exp: test_next_with_recursion: next to recursive call (the program is no longer running)
> FAIL: gdb.base/break.exp: test_next_with_recursion: next over recursive call (the program is no longer running)
> FAIL: gdb.base/break.exp: test_next_with_recursion: backtrace from factorial(5.1)
> FAIL: gdb.base/break.exp: test_next_with_recursion: continue until exit at recursive next test (the program is no longer running)
> FAIL: gdb.base/break.exp: test_break_optimized_prologue: run until function breakpoint, optimized file
> FAIL: gdb.base/break.exp: test_break_optimized_prologue: run until breakpoint set at small function, optimized file (the program is no longer running)
> FAIL: gdb.base/break.exp: test_rbreak_shlib: rbreak junk
>
> Adjusting the regcache raw_compare assertion check to use <= fixes
> the problem on aarch64-linux on a SVE-capable system.
>
> I plan to backport this fix to GDB 16 as well, if that's OK.
> ---
> gdbserver/regcache.cc | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gdbserver/regcache.cc b/gdbserver/regcache.cc
> index ee0c1b363b0..c08c9ae06c1 100644
> --- a/gdbserver/regcache.cc
> +++ b/gdbserver/regcache.cc
> @@ -503,7 +503,7 @@ regcache::raw_compare (int regnum, const void *buf, int offset) const
> gdb_assert (buf != NULL);
>
> gdb::array_view<const gdb_byte> regbuf = register_data (this, regnum);
> - gdb_assert (offset < regbuf.size ());
> + gdb_assert (offset <= regbuf.size ());
> regbuf = regbuf.slice (offset);
>
> return memcmp (buf, regbuf.data (), regbuf.size ()) == 0;
> --
> 2.25.1
>
In my commit message in 51e6b8cfd649013ae16a3d00f1451b2531ba6bc9, I seem
pretty convinced that comparing 0 bytes was needed for SVE, but I have
no idea why it was needed. But if we did it for GDB, it makes sense to
do it for GDBserver.
I didn't look into it, but can you check if it would be easy to add a
GDBserver selftest to check this, like I added in GDB? I don't want you
to spend an eternity on it, but if it's easy it would be nice.
Simon
Hi,
On 3/3/25 15:16, Simon Marchi wrote:
> On 2/28/25 6:05 AM, Luis Machado wrote:
>> Commit 51e6b8cfd649013ae16a3d00f1451b2531ba6bc9 fixed a
>> regression for SVE/SME registers on gdb's side by using a <= comparison for
>> regcache's raw_compare assertion check. We seem to have failed to do the same
>> for gdbserver's raw_compare counterpart.
>>
>> With the code as it is, I'm seeing a lot of crashes for gdbserver on a machine
>> with SVE enabled. For instance, with the following invocation:
>>
>> make check-gdb RUNTESTFLAGS="--target_board=native-gdbserver" TESTS=gdb.base/break.exp
>>
>> Running /work/builds/binutils-gdb/gdb/testsuite/../../../../repos/binutils-gdb/gdb/testsuite/gdb.base/break.exp ...
>> FAIL: gdb.base/break.exp: test_break: run until function breakpoint
>> FAIL: gdb.base/break.exp: test_break: run until breakpoint set at a line number (the program is no longer running)
>> FAIL: gdb.base/break.exp: test_break: run until file:function(6) breakpoint (the program is no longer running)
>> FAIL: gdb.base/break.exp: test_break: run until file:function(5) breakpoint (the program is no longer running)
>> FAIL: gdb.base/break.exp: test_break: run until file:function(4) breakpoint (the program is no longer running)
>> FAIL: gdb.base/break.exp: test_break: run until file:function(3) breakpoint (the program is no longer running)
>> FAIL: gdb.base/break.exp: test_break: run until file:function(2) breakpoint (the program is no longer running)
>> FAIL: gdb.base/break.exp: test_break: run until file:function(1) breakpoint (the program is no longer running)
>> FAIL: gdb.base/break.exp: test_break: run until quoted breakpoint (the program is no longer running)
>> FAIL: gdb.base/break.exp: test_break: run until file:linenum breakpoint (the program is no longer running)
>> FAIL: gdb.base/break.exp: test_break: breakpoint offset +1
>> FAIL: gdb.base/break.exp: test_break: step onto breakpoint (the program is no longer running)
>> FAIL: gdb.base/break.exp: test_break: setting breakpoint at }
>> FAIL: gdb.base/break.exp: test_break: continue to breakpoint at } (the program is no longer running)
>> FAIL: gdb.base/break.exp: test_no_break_on_catchpoint: runto: run to main
>> FAIL: gdb.base/break.exp: test_break_nonexistent_line: runto: run to main
>> FAIL: gdb.base/break.exp: test_break_default: runto: run to main
>> FAIL: gdb.base/break.exp: test_break_silent_and_more: runto: run to main
>> FAIL: gdb.base/break.exp: test_break_line_convenience_var: runto: run to main
>> FAIL: gdb.base/break.exp: test_break_user_call: runto: run to main
>> FAIL: gdb.base/break.exp: test_finish_arguments: runto: run to main
>> FAIL: gdb.base/break.exp: test_next_with_recursion: kill program
>> FAIL: gdb.base/break.exp: test_next_with_recursion: run to factorial(6)
>> FAIL: gdb.base/break.exp: test_next_with_recursion: continue to factorial(5) (the program is no longer running)
>> FAIL: gdb.base/break.exp: test_next_with_recursion: backtrace from factorial(5)
>> FAIL: gdb.base/break.exp: test_next_with_recursion: next to recursive call (the program is no longer running)
>> FAIL: gdb.base/break.exp: test_next_with_recursion: next over recursive call (the program is no longer running)
>> FAIL: gdb.base/break.exp: test_next_with_recursion: backtrace from factorial(5.1)
>> FAIL: gdb.base/break.exp: test_next_with_recursion: continue until exit at recursive next test (the program is no longer running)
>> FAIL: gdb.base/break.exp: test_break_optimized_prologue: run until function breakpoint, optimized file
>> FAIL: gdb.base/break.exp: test_break_optimized_prologue: run until breakpoint set at small function, optimized file (the program is no longer running)
>> FAIL: gdb.base/break.exp: test_rbreak_shlib: rbreak junk
>>
>> Adjusting the regcache raw_compare assertion check to use <= fixes
>> the problem on aarch64-linux on a SVE-capable system.
>>
>> I plan to backport this fix to GDB 16 as well, if that's OK.
>> ---
>> gdbserver/regcache.cc | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/gdbserver/regcache.cc b/gdbserver/regcache.cc
>> index ee0c1b363b0..c08c9ae06c1 100644
>> --- a/gdbserver/regcache.cc
>> +++ b/gdbserver/regcache.cc
>> @@ -503,7 +503,7 @@ regcache::raw_compare (int regnum, const void *buf, int offset) const
>> gdb_assert (buf != NULL);
>>
>> gdb::array_view<const gdb_byte> regbuf = register_data (this, regnum);
>> - gdb_assert (offset < regbuf.size ());
>> + gdb_assert (offset <= regbuf.size ());
>> regbuf = regbuf.slice (offset);
>>
>> return memcmp (buf, regbuf.data (), regbuf.size ()) == 0;
>> --
>> 2.25.1
>>
>
> In my commit message in 51e6b8cfd649013ae16a3d00f1451b2531ba6bc9, I seem
> pretty convinced that comparing 0 bytes was needed for SVE, but I have
> no idea why it was needed. But if we did it for GDB, it makes sense to
> do it for GDBserver.
>
> I didn't look into it, but can you check if it would be easy to add a
> GDBserver selftest to check this, like I added in GDB? I don't want you
> to spend an eternity on it, but if it's easy it would be nice.
>
> Simon
Sure, I can do that. I tried running the selftest hoping it would transfer
things over RSP and, as a consequence, test gdbserver. But that did not work
obviously. Let me come up with something.
Luis
On 3/3/25 10:19 AM, Luis Machado wrote:
> Sure, I can do that. I tried running the selftest hoping it would transfer
> things over RSP and, as a consequence, test gdbserver. But that did not work
> obviously. Let me come up with something.
No, the selftests are really just unit tests instantiating objects and
poking at them. The difficulty is when some object has a dependency.
Sometimes, it is easy to mock the dependency, sometimes not.
Simon
Hi,
On 3/3/25 15:19, Luis Machado wrote:
> Hi,
>
> On 3/3/25 15:16, Simon Marchi wrote:
>> On 2/28/25 6:05 AM, Luis Machado wrote:
>>> Commit 51e6b8cfd649013ae16a3d00f1451b2531ba6bc9 fixed a
>>> regression for SVE/SME registers on gdb's side by using a <= comparison for
>>> regcache's raw_compare assertion check. We seem to have failed to do the same
>>> for gdbserver's raw_compare counterpart.
>>>
>>> With the code as it is, I'm seeing a lot of crashes for gdbserver on a machine
>>> with SVE enabled. For instance, with the following invocation:
>>>
>>> make check-gdb RUNTESTFLAGS="--target_board=native-gdbserver" TESTS=gdb.base/break.exp
>>>
>>> Running /work/builds/binutils-gdb/gdb/testsuite/../../../../repos/binutils-gdb/gdb/testsuite/gdb.base/break.exp ...
>>> FAIL: gdb.base/break.exp: test_break: run until function breakpoint
>>> FAIL: gdb.base/break.exp: test_break: run until breakpoint set at a line number (the program is no longer running)
>>> FAIL: gdb.base/break.exp: test_break: run until file:function(6) breakpoint (the program is no longer running)
>>> FAIL: gdb.base/break.exp: test_break: run until file:function(5) breakpoint (the program is no longer running)
>>> FAIL: gdb.base/break.exp: test_break: run until file:function(4) breakpoint (the program is no longer running)
>>> FAIL: gdb.base/break.exp: test_break: run until file:function(3) breakpoint (the program is no longer running)
>>> FAIL: gdb.base/break.exp: test_break: run until file:function(2) breakpoint (the program is no longer running)
>>> FAIL: gdb.base/break.exp: test_break: run until file:function(1) breakpoint (the program is no longer running)
>>> FAIL: gdb.base/break.exp: test_break: run until quoted breakpoint (the program is no longer running)
>>> FAIL: gdb.base/break.exp: test_break: run until file:linenum breakpoint (the program is no longer running)
>>> FAIL: gdb.base/break.exp: test_break: breakpoint offset +1
>>> FAIL: gdb.base/break.exp: test_break: step onto breakpoint (the program is no longer running)
>>> FAIL: gdb.base/break.exp: test_break: setting breakpoint at }
>>> FAIL: gdb.base/break.exp: test_break: continue to breakpoint at } (the program is no longer running)
>>> FAIL: gdb.base/break.exp: test_no_break_on_catchpoint: runto: run to main
>>> FAIL: gdb.base/break.exp: test_break_nonexistent_line: runto: run to main
>>> FAIL: gdb.base/break.exp: test_break_default: runto: run to main
>>> FAIL: gdb.base/break.exp: test_break_silent_and_more: runto: run to main
>>> FAIL: gdb.base/break.exp: test_break_line_convenience_var: runto: run to main
>>> FAIL: gdb.base/break.exp: test_break_user_call: runto: run to main
>>> FAIL: gdb.base/break.exp: test_finish_arguments: runto: run to main
>>> FAIL: gdb.base/break.exp: test_next_with_recursion: kill program
>>> FAIL: gdb.base/break.exp: test_next_with_recursion: run to factorial(6)
>>> FAIL: gdb.base/break.exp: test_next_with_recursion: continue to factorial(5) (the program is no longer running)
>>> FAIL: gdb.base/break.exp: test_next_with_recursion: backtrace from factorial(5)
>>> FAIL: gdb.base/break.exp: test_next_with_recursion: next to recursive call (the program is no longer running)
>>> FAIL: gdb.base/break.exp: test_next_with_recursion: next over recursive call (the program is no longer running)
>>> FAIL: gdb.base/break.exp: test_next_with_recursion: backtrace from factorial(5.1)
>>> FAIL: gdb.base/break.exp: test_next_with_recursion: continue until exit at recursive next test (the program is no longer running)
>>> FAIL: gdb.base/break.exp: test_break_optimized_prologue: run until function breakpoint, optimized file
>>> FAIL: gdb.base/break.exp: test_break_optimized_prologue: run until breakpoint set at small function, optimized file (the program is no longer running)
>>> FAIL: gdb.base/break.exp: test_rbreak_shlib: rbreak junk
>>>
>>> Adjusting the regcache raw_compare assertion check to use <= fixes
>>> the problem on aarch64-linux on a SVE-capable system.
>>>
>>> I plan to backport this fix to GDB 16 as well, if that's OK.
>>> ---
>>> gdbserver/regcache.cc | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/gdbserver/regcache.cc b/gdbserver/regcache.cc
>>> index ee0c1b363b0..c08c9ae06c1 100644
>>> --- a/gdbserver/regcache.cc
>>> +++ b/gdbserver/regcache.cc
>>> @@ -503,7 +503,7 @@ regcache::raw_compare (int regnum, const void *buf, int offset) const
>>> gdb_assert (buf != NULL);
>>>
>>> gdb::array_view<const gdb_byte> regbuf = register_data (this, regnum);
>>> - gdb_assert (offset < regbuf.size ());
>>> + gdb_assert (offset <= regbuf.size ());
>>> regbuf = regbuf.slice (offset);
>>>
>>> return memcmp (buf, regbuf.data (), regbuf.size ()) == 0;
>>> --
>>> 2.25.1
>>>
>>
>> In my commit message in 51e6b8cfd649013ae16a3d00f1451b2531ba6bc9, I seem
>> pretty convinced that comparing 0 bytes was needed for SVE, but I have
>> no idea why it was needed. But if we did it for GDB, it makes sense to
>> do it for GDBserver.
>>
>> I didn't look into it, but can you check if it would be easy to add a
>> GDBserver selftest to check this, like I added in GDB? I don't want you
>> to spend an eternity on it, but if it's easy it would be nice.
>>
>> Simon
>
> Sure, I can do that. I tried running the selftest hoping it would transfer
> things over RSP and, as a consequence, test gdbserver. But that did not work
> obviously. Let me come up with something.
>
> Luis
Before venturing into creating a more substantial check for this, I took a step back to
understand what was going on.
I think there is an oversight in the code that handles SVE comparisons. It is a corner case
and only happens when we have a vector length of 128-bit, meaning the Z registers and the V
registers have the same size. So it is a complete overlap.
That isn't handled correctly in the code and we pass that down to raw_compare, which has to deal
with an empty comparison. It shouldn't, in my opinion.
What we should do instead is to check if the Z and V registers have the same size and skip this
comparison. Otherwise, if their sizes differ, we should go ahead and compare the contents of the
part that doesn't overlap with the V register.
If we fix the AArch64 code, there isn't a need to consider the 0-byte comparison case in raw_compare,
which means we can use a < comparison as opposed to <= for the offset.
I'll run a few more checks and will send an updated patch.
On 3/5/25 10:56 AM, Luis Machado wrote:
> Before venturing into creating a more substantial check for this, I took a step back to
> understand what was going on.
>
> I think there is an oversight in the code that handles SVE comparisons. It is a corner case
> and only happens when we have a vector length of 128-bit, meaning the Z registers and the V
> registers have the same size. So it is a complete overlap.
>
> That isn't handled correctly in the code and we pass that down to raw_compare, which has to deal
> with an empty comparison. It shouldn't, in my opinion.
>
> What we should do instead is to check if the Z and V registers have the same size and skip this
> comparison. Otherwise, if their sizes differ, we should go ahead and compare the contents of the
> part that doesn't overlap with the V register.
>
> If we fix the AArch64 code, there isn't a need to consider the 0-byte comparison case in raw_compare,
> which means we can use a < comparison as opposed to <= for the offset.
Thanks for digging into this.
I think that both approaches have their merits. If you allow a 0-byte
comparison, it can make caller code simpler. For example, in the SVE
case, you can avoid that "are registers the same size" check. On the
other hand, disallowing 0-byte comparison could find some bugs where we
messed up our offsets and we don't really want to compare 0 bytes.
I don't have a strong preference, as long as it's consistent between GDB
and GDBserver and well documented.
Simon
On 3/5/25 16:15, Simon Marchi wrote:
> On 3/5/25 10:56 AM, Luis Machado wrote:
>> Before venturing into creating a more substantial check for this, I took a step back to
>> understand what was going on.
>>
>> I think there is an oversight in the code that handles SVE comparisons. It is a corner case
>> and only happens when we have a vector length of 128-bit, meaning the Z registers and the V
>> registers have the same size. So it is a complete overlap.
>>
>> That isn't handled correctly in the code and we pass that down to raw_compare, which has to deal
>> with an empty comparison. It shouldn't, in my opinion.
>>
>> What we should do instead is to check if the Z and V registers have the same size and skip this
>> comparison. Otherwise, if their sizes differ, we should go ahead and compare the contents of the
>> part that doesn't overlap with the V register.
>>
>> If we fix the AArch64 code, there isn't a need to consider the 0-byte comparison case in raw_compare,
>> which means we can use a < comparison as opposed to <= for the offset.
>
> Thanks for digging into this.
>
> I think that both approaches have their merits. If you allow a 0-byte
> comparison, it can make caller code simpler. For example, in the SVE
> case, you can avoid that "are registers the same size" check. On the
> other hand, disallowing 0-byte comparison could find some bugs where we
> messed up our offsets and we don't really want to compare 0 bytes.
>
> I don't have a strong preference, as long as it's consistent between GDB
> and GDBserver and well documented.
Alright. That's a reasonable assessment. The standard says memcmp is supposed to return 0 for a 0-length
comparison. So that's one more in favor of keeping that behavior.
Let me think about this a bit more.
Hi Luis,
On Wed, Mar 05, 2025 at 04:59:28PM +0000, Luis Machado wrote:
> On 3/5/25 16:15, Simon Marchi wrote:
> > On 3/5/25 10:56 AM, Luis Machado wrote:
> >> Before venturing into creating a more substantial check for this, I took a step back to
> >> understand what was going on.
> >>
> >> I think there is an oversight in the code that handles SVE comparisons. It is a corner case
> >> and only happens when we have a vector length of 128-bit, meaning the Z registers and the V
> >> registers have the same size. So it is a complete overlap.
> >>
> >> That isn't handled correctly in the code and we pass that down to raw_compare, which has to deal
> >> with an empty comparison. It shouldn't, in my opinion.
> >>
> >> What we should do instead is to check if the Z and V registers have the same size and skip this
> >> comparison. Otherwise, if their sizes differ, we should go ahead and compare the contents of the
> >> part that doesn't overlap with the V register.
> >>
> >> If we fix the AArch64 code, there isn't a need to consider the 0-byte comparison case in raw_compare,
> >> which means we can use a < comparison as opposed to <= for the offset.
> >
> > Thanks for digging into this.
> >
> > I think that both approaches have their merits. If you allow a 0-byte
> > comparison, it can make caller code simpler. For example, in the SVE
> > case, you can avoid that "are registers the same size" check. On the
> > other hand, disallowing 0-byte comparison could find some bugs where we
> > messed up our offsets and we don't really want to compare 0 bytes.
> >
> > I don't have a strong preference, as long as it's consistent between GDB
> > and GDBserver and well documented.
>
> Alright. That's a reasonable assessment. The standard says memcmp is supposed to return 0 for a 0-length
> comparison. So that's one more in favor of keeping that behavior.
>
> Let me think about this a bit more.
Just checking in on this patch, as this one is in the list of potential
fixes for the 16.3 release. Have you made a decision?
Thank you!
On 3/22/25 12:43, Joel Brobecker wrote:
> Hi Luis,
>
> On Wed, Mar 05, 2025 at 04:59:28PM +0000, Luis Machado wrote:
>> On 3/5/25 16:15, Simon Marchi wrote:
>>> On 3/5/25 10:56 AM, Luis Machado wrote:
>>>> Before venturing into creating a more substantial check for this, I took a step back to
>>>> understand what was going on.
>>>>
>>>> I think there is an oversight in the code that handles SVE comparisons. It is a corner case
>>>> and only happens when we have a vector length of 128-bit, meaning the Z registers and the V
>>>> registers have the same size. So it is a complete overlap.
>>>>
>>>> That isn't handled correctly in the code and we pass that down to raw_compare, which has to deal
>>>> with an empty comparison. It shouldn't, in my opinion.
>>>>
>>>> What we should do instead is to check if the Z and V registers have the same size and skip this
>>>> comparison. Otherwise, if their sizes differ, we should go ahead and compare the contents of the
>>>> part that doesn't overlap with the V register.
>>>>
>>>> If we fix the AArch64 code, there isn't a need to consider the 0-byte comparison case in raw_compare,
>>>> which means we can use a < comparison as opposed to <= for the offset.
>>>
>>> Thanks for digging into this.
>>>
>>> I think that both approaches have their merits. If you allow a 0-byte
>>> comparison, it can make caller code simpler. For example, in the SVE
>>> case, you can avoid that "are registers the same size" check. On the
>>> other hand, disallowing 0-byte comparison could find some bugs where we
>>> messed up our offsets and we don't really want to compare 0 bytes.
>>>
>>> I don't have a strong preference, as long as it's consistent between GDB
>>> and GDBserver and well documented.
>>
>> Alright. That's a reasonable assessment. The standard says memcmp is supposed to return 0 for a 0-length
>> comparison. So that's one more in favor of keeping that behavior.
>>
>> Let me think about this a bit more.
>
> Just checking in on this patch, as this one is in the list of potential
> fixes for the 16.3 release. Have you made a decision?
>
> Thank you!
Yes, I want to try to have a gdbserver-side testcase for this, which I'm still working on.
I'm planning on having something for next week if that's fine for the release schedule.
> Yes, I want to try to have a gdbserver-side testcase for this, which
> I'm still working on.
>
> I'm planning on having something for next week if that's fine for the
> release schedule.
Thanks for the update, Luis. I confirm this fits with the release
schedule (which I try to keep flexible anyway, owing to the voluntary
nature of our project).
@@ -503,7 +503,7 @@ regcache::raw_compare (int regnum, const void *buf, int offset) const
gdb_assert (buf != NULL);
gdb::array_view<const gdb_byte> regbuf = register_data (this, regnum);
- gdb_assert (offset < regbuf.size ());
+ gdb_assert (offset <= regbuf.size ());
regbuf = regbuf.slice (offset);
return memcmp (buf, regbuf.data (), regbuf.size ()) == 0;