[v2] Refrain from asking debug stubs to read invalid memory
Checks
Context |
Check |
Description |
linaro-tcwg-bot/tcwg_gdb_build--master-arm |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 |
success
|
Test passed
|
Commit Message
Some stubs take exception to this.
For example we observe RTEMS's libdebugger freezing when asked to examine
address zero on aarch64/xilinx_zynqmp_lp64_qemu. As of 2024-02-02 "gdb,
types: Resolve pointer types dynamically" (f18fc7e56fb) this happens as
early as 'target remote'. Ordinarily we would be greeted with…
_User_extensions_Thread_switch (executing=0x0, heir=<optimized out>)
at […]/cpukit/include/rtems/score/userextimpl.h:382
… but now, as language_defn::read_var_value calls resolve_dynamic_type with
a "dummy" address and value, resolve_dynamic_type_internal receives a
similarly "dummy" addr_stack, and attempts to read memory address zero:
guard against that.
Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
---
New in v2: applied Thiago's suggestion to add an 'else' branch.
gdb/gdbtypes.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
Comments
Kévin Le Gouguec <legouguec@adacore.com> writes:
> Some stubs take exception to this.
>
> For example we observe RTEMS's libdebugger freezing when asked to examine
> address zero on aarch64/xilinx_zynqmp_lp64_qemu. As of 2024-02-02 "gdb,
> types: Resolve pointer types dynamically" (f18fc7e56fb) this happens as
> early as 'target remote'. Ordinarily we would be greeted with…
>
> _User_extensions_Thread_switch (executing=0x0, heir=<optimized out>)
> at […]/cpukit/include/rtems/score/userextimpl.h:382
>
> … but now, as language_defn::read_var_value calls resolve_dynamic_type with
> a "dummy" address and value, resolve_dynamic_type_internal receives a
> similarly "dummy" addr_stack, and attempts to read memory address zero:
> guard against that.
The problem with this approach is that there are targets out there that
can read from address 0, so detecting 0 and handling it differently as
you propose is not the right approach.
But *sigh* I see we already have code that's special casing 0 in this
function, so maybe adding yet more is just something I'll need to
accept.
Ideally, if we wanted to support not reading memory then we should carry
around a std::optional<CORE_ADDR> instead of just a CORE_ADDR.
As for the target freezing when trying to read from 0 ... that's 100% a
target bug. If the actual h/w doesn't support reading 0 then the
controller (e.g. OpenOCD) should take care of filtering out the access
requests. Or in your case (qemu) then the emulator should just be fixed
to return an error on these accesses. The other option is for the
remote to provide GDB with a memory map[1].
Anyway, I do have a more substantial question, I guess when you talk
about the resolve_dynamic_type calls in ::read_var_value, you are
talking about the lines like this:
type = resolve_dynamic_type (type, view, /* Unused address. */ 0);
The problem you see is that these end up calling:
read_memory_typed_address (addr_stack->addr, type);
in resolve_dynamic_type_internal. Now as far as I can tell this
function will throw an exception if the read fails. So, surely, if
we're sending through '0' as an address with the expectation that we'll
not access the address ... but now we are ... then surely this would
trigger an exception on a target where 0 is not readable, but the target
doesn't hang, right? For example, x86-64 GNU/Linux.
Which makes me think: can't we write a test case for this bug?
Thanks,
Andrew
[1] https://sourceware.org/gdb/current/onlinedocs/gdb.html/Memory-Map-Format.html#Memory-Map-Format
>
> Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
> ---
> New in v2: applied Thiago's suggestion to add an 'else' branch.
>
> gdb/gdbtypes.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/gdb/gdbtypes.c b/gdb/gdbtypes.c
> index f39fe3de6a4..856eff4d166 100644
> --- a/gdb/gdbtypes.c
> +++ b/gdb/gdbtypes.c
> @@ -2804,8 +2804,10 @@ resolve_dynamic_type_internal (struct type *type,
> if (addr_stack->valaddr.data () != NULL)
> pinfo.addr = extract_typed_address (addr_stack->valaddr.data (),
> type);
> - else
> + else if (addr_stack->addr != 0)
> pinfo.addr = read_memory_typed_address (addr_stack->addr, type);
> + else
> + pinfo.addr = 0;
> pinfo.next = addr_stack;
>
> /* Special case a NULL pointer here -- we don't want to
> --
> 2.34.1
Andrew Burgess <aburgess@redhat.com> writes:
> But *sigh* I see we already have code that's special casing 0 in this
> function, so maybe adding yet more is just something I'll need to
> accept.
>
FWIW, it's not clear-cut to me that this needs to go in, as-is or
otherwise. See below.
> Or in your case (qemu) then the emulator should just be fixed
> to return an error on these accesses.
(Clarification: the remote stub under test was not QEMU's, but the one
implemented by RTEMS's libdebugger, linked against the application,
_running inside the QEMU guest_.
So, reason #1 for not going with this approach: it is not clear to me
how widespread this problem is; we only observed it for that one target;
could not find the time to test a newer libdebugger, let alone take the
plunge and try to fix the stub)
> Anyway, I do have a more substantial question, I guess when you talk
> about the resolve_dynamic_type calls in ::read_var_value, you are
> talking about the lines like this:
>
> type = resolve_dynamic_type (type, view, /* Unused address. */ 0);
>
> The problem you see is that these end up calling:
>
> read_memory_typed_address (addr_stack->addr, type);
>
> in resolve_dynamic_type_internal. Now as far as I can tell this
> function will throw an exception if the read fails.
Right; to expand on the commit message, the symptoms were that (a) while
usually 'target remote' would greet us with…
(gdb) target remote localhost:7100
Remote debugging using localhost:7100
_User_extensions_Thread_switch (executing=0x0, heir=<optimized out>)
… it would instead tell us…
(gdb) target remote localhost:7100
Remote debugging using localhost:7100
_User_extensions_Thread_switch (executing=0x0,
Ignoring packet error, continuing...
heir=<error reading variable: Invalid hex digit 116>)
… and (b) subsequent commands would timeout.
(Fingers crossed I am not getting details wrong; going with what I put
on record in our internal tracker 6 months ago.
Which brings me to reason #2 for dropping this: for internal reasons,
testing new changes on this target configuration is no longer a
priority; finding time to research a better fix and/or a test case will
thus prove challenging)
> So, surely, if
> we're sending through '0' as an address with the expectation that we'll
> not access the address ... but now we are ... then surely this would
> trigger an exception on a target where 0 is not readable, but the target
> doesn't hang, right? For example, x86-64 GNU/Linux.
>
> Which makes me think: can't we write a test case for this bug?
A test case would be ideal, but not sure how to produce one. The test
harness for that target config is a handful (adapting that internal test
framework into DejaGNU "boards" is on my pipe-dream list); for other
targets (e.g. x86-64 GNU/Linux) I have not managed to tickle GDB into
producing a similar error.
tl;dr Unless we can produce a test case, maybe we should drop this
patch? If we eventually hit this failure mode internally for target
configs that are easier to capture into a test case, I would pick it
back up and investigate, but until then it is not obvious to me this
needs to go in.
Lest I forget: thank you very much for the feedback; I was indeed
wondering about "targets out there that can read from address 0" when
working on this, but lacked ideas to do much about it.
@@ -2804,8 +2804,10 @@ resolve_dynamic_type_internal (struct type *type,
if (addr_stack->valaddr.data () != NULL)
pinfo.addr = extract_typed_address (addr_stack->valaddr.data (),
type);
- else
+ else if (addr_stack->addr != 0)
pinfo.addr = read_memory_typed_address (addr_stack->addr, type);
+ else
+ pinfo.addr = 0;
pinfo.next = addr_stack;
/* Special case a NULL pointer here -- we don't want to