[v2,1/3,gdb/tdep] Add syscall number cache
Checks
Context |
Check |
Description |
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 |
success
|
Testing passed
|
linaro-tcwg-bot/tcwg_gdb_build--master-arm |
success
|
Testing passed
|
linaro-tcwg-bot/tcwg_gdb_check--master-arm |
success
|
Testing passed
|
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 |
success
|
Testing passed
|
Commit Message
When running test-case gdb.base/catch-syscall.exp on powerpc64le-linux, we run
into an xfail:
...
(gdb) catch syscall execve^M
Catchpoint 18 (syscall 'execve' [11])^M
(gdb) PASS: gdb.base/catch-syscall.exp: execve: \
catch syscall with arguments (execve)
...
continue^M
Continuing.^M
^M
Catchpoint 18 (call to syscall execve), 0x00007ffff7d7f18c in execve () from \
/lib64/libc.so.6^M
(gdb) PASS: gdb.base/catch-syscall.exp: execve: program has called execve
continue^M
Continuing.^M
process 60484 is executing new program: catch-syscall^M
^M
Breakpoint 17, main (argc=1, argv=0x7fffffffe618) at catch-syscall.c:54^M
54 char buf1[2] = "a";^M
(gdb) XFAIL: gdb.base/catch-syscall.exp: execve: syscall execve has returned
...
The problem is that the catchpoint "(return from syscall execve)" doesn't
trigger.
This is caused by ppc_linux_get_syscall_number returning 0 at execve
syscall-exit-stop, while it should return 11.
This is a problem that was fixed in linux kernel version v5.19, by commit
ec6d0dde71d7 ("powerpc: Enable execve syscall exit tracepoint"), but the
machine I'm running the tests on has v4.18.0.
An approach was discussed in the PR where ppc_linux_get_syscall_number would
try to detect an execve syscall-exit-stop based on the register state, but
that was considered too fragile.
Fix this by caching the syscall number at syscall-enter-stop, and reusing it
at syscall-exit-stop.
This is sufficient to stop triggering the xfail, so remove it.
It's good to point out that this doesn't always eliminate the need to get the
syscall number at a syscall-exit-stop.
The test-case has an example called mid-vfork, where we do:
- catch vfork
- continue
- catch syscall
- continue.
The following things happen:
- the "catch vfork" specifies that we capture the PTRACE_EVENT_VFORK event.
- the first continue runs into the event
- the "catch syscall" specifies that we capture syscall-enter-stop and
syscall-exit-stop events.
- the second continue runs into the syscall-exit-stop. At that point there's
no syscall number value cached, because no corresponding syscall-enter-stop
was observed.
We can address this issue somewhat by translating events into syscalls. A
followup patch in this series use this approach (though not for vfork).
PR tdep/28623
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28623
---
gdb/linux-nat.c | 53 ++++++++++++++++++++++--
gdb/linux-nat.h | 3 ++
gdb/testsuite/gdb.base/catch-syscall.exp | 8 +---
3 files changed, 53 insertions(+), 11 deletions(-)
base-commit: f1b8ee6f2b4381bc46a0ad4c233b6eddc1e135b5
Comments
On 11/27/23 12:20 PM, Tom de Vries wrote:
> When running test-case gdb.base/catch-syscall.exp on powerpc64le-linux, we run
> into an xfail:
> ...
> (gdb) catch syscall execve^M
> Catchpoint 18 (syscall 'execve' [11])^M
> (gdb) PASS: gdb.base/catch-syscall.exp: execve: \
> catch syscall with arguments (execve)
> ...
> continue^M
> Continuing.^M
> ^M
> Catchpoint 18 (call to syscall execve), 0x00007ffff7d7f18c in execve () from \
> /lib64/libc.so.6^M
> (gdb) PASS: gdb.base/catch-syscall.exp: execve: program has called execve
> continue^M
> Continuing.^M
> process 60484 is executing new program: catch-syscall^M
> ^M
> Breakpoint 17, main (argc=1, argv=0x7fffffffe618) at catch-syscall.c:54^M
> 54 char buf1[2] = "a";^M
> (gdb) XFAIL: gdb.base/catch-syscall.exp: execve: syscall execve has returned
> ...
>
> The problem is that the catchpoint "(return from syscall execve)" doesn't
> trigger.
>
> This is caused by ppc_linux_get_syscall_number returning 0 at execve
> syscall-exit-stop, while it should return 11.
>
> This is a problem that was fixed in linux kernel version v5.19, by commit
> ec6d0dde71d7 ("powerpc: Enable execve syscall exit tracepoint"), but the
> machine I'm running the tests on has v4.18.0.
>
> An approach was discussed in the PR where ppc_linux_get_syscall_number would
> try to detect an execve syscall-exit-stop based on the register state, but
> that was considered too fragile.
>
> Fix this by caching the syscall number at syscall-enter-stop, and reusing it
> at syscall-exit-stop.
>
> This is sufficient to stop triggering the xfail, so remove it.
>
> It's good to point out that this doesn't always eliminate the need to get the
> syscall number at a syscall-exit-stop.
>
> The test-case has an example called mid-vfork, where we do:
> - catch vfork
> - continue
> - catch syscall
> - continue.
>
> The following things happen:
> - the "catch vfork" specifies that we capture the PTRACE_EVENT_VFORK event.
> - the first continue runs into the event
> - the "catch syscall" specifies that we capture syscall-enter-stop and
> syscall-exit-stop events.
> - the second continue runs into the syscall-exit-stop. At that point there's
> no syscall number value cached, because no corresponding syscall-enter-stop
> was observed.
>
> We can address this issue somewhat by translating events into syscalls. A
> followup patch in this series use this approach (though not for vfork).
>
> PR tdep/28623
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28623
All 3 of these LGTM modulo one nit below.
Reviewed-by: John Baldwin <jhb@FreeBSD.org>
> ---
> gdb/linux-nat.c | 53 ++++++++++++++++++++++--
> gdb/linux-nat.h | 3 ++
> gdb/testsuite/gdb.base/catch-syscall.exp | 8 +---
> 3 files changed, 53 insertions(+), 11 deletions(-)
>
> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> index 7b0562cf89b..89c4622160a 100644
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -1508,6 +1508,17 @@ linux_resume_one_lwp_throw (struct lwp_info *lp, int step,
> else
> lp->stop_pc = 0;
>
> + if (catch_syscall_enabled () > 0)
> + {
> + /* Function inf_ptrace_target::resume uses PT_SYSCALL. */
> + }
ICYMI, Simon just changed catch_syscall_enabled to return a bool a couple of
hours before your mail.
@@ -1508,6 +1508,17 @@ linux_resume_one_lwp_throw (struct lwp_info *lp, int step,
else
lp->stop_pc = 0;
+ if (catch_syscall_enabled () > 0)
+ {
+ /* Function inf_ptrace_target::resume uses PT_SYSCALL. */
+ }
+ else
+ {
+ /* Function inf_ptrace_target::resume uses PT_CONTINUE.
+ Invalidate syscall_number cache. */
+ lp->syscall_number = -1;
+ }
+
linux_target->low_prepare_to_resume (lp);
linux_target->low_resume (lp->ptid, step, signo);
@@ -1762,7 +1773,31 @@ linux_handle_syscall_trap (struct lwp_info *lp, int stopping)
struct target_waitstatus *ourstatus = &lp->waitstatus;
struct gdbarch *gdbarch = target_thread_architecture (lp->ptid);
thread_info *thread = linux_target->find_thread (lp->ptid);
- int syscall_number = (int) gdbarch_get_syscall_number (gdbarch, thread);
+
+ enum target_waitkind new_syscall_state
+ = (lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY
+ ? TARGET_WAITKIND_SYSCALL_RETURN
+ : TARGET_WAITKIND_SYSCALL_ENTRY);
+
+ int syscall_number;
+ if (new_syscall_state == TARGET_WAITKIND_SYSCALL_RETURN
+ && lp->syscall_number != -1)
+ {
+ /* Calling gdbarch_get_syscall_number for TARGET_WAITKIND_SYSCALL_RETURN
+ is unreliable on some targets for some syscalls, use the syscall
+ detected at TARGET_WAITKIND_SYSCALL_ENTRY instead. */
+ syscall_number = lp->syscall_number;
+ linux_nat_debug_printf
+ (_("Using syscall number %d supplied by syscall_number cache"),
+ syscall_number);
+ }
+ else
+ {
+ syscall_number = (int) gdbarch_get_syscall_number (gdbarch, thread);
+ linux_nat_debug_printf
+ (_("Using syscall number %d supplied by architecture hook"),
+ syscall_number);
+ }
if (stopping)
{
@@ -1791,6 +1826,7 @@ linux_handle_syscall_trap (struct lwp_info *lp, int stopping)
"PTRACE_CONT for SIGSTOP", syscall_number, lp->ptid.lwp ());
lp->syscall_state = TARGET_WAITKIND_IGNORE;
+ lp->syscall_number = -1;
ptrace (PTRACE_CONT, lp->ptid.lwp (), 0, 0);
lp->stopped = 0;
return 1;
@@ -1801,9 +1837,18 @@ linux_handle_syscall_trap (struct lwp_info *lp, int stopping)
the user could install a new catchpoint for this syscall
between syscall enter/return, and we'll need to know to
report a syscall return if that happens. */
- lp->syscall_state = (lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY
- ? TARGET_WAITKIND_SYSCALL_RETURN
- : TARGET_WAITKIND_SYSCALL_ENTRY);
+ lp->syscall_state = new_syscall_state;
+
+ if (lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY)
+ {
+ /* Save to use in TARGET_WAITKIND_SYSCALL_RETURN. */
+ lp->syscall_number = syscall_number;
+ }
+ else
+ {
+ /* Reset to prevent stale values. */
+ lp->syscall_number = -1;
+ }
if (catch_syscall_enabled ())
{
@@ -277,6 +277,9 @@ struct lwp_info : intrusive_list_node<lwp_info>
- TARGET_WAITKIND_SYSCALL_RETURN */
enum target_waitkind syscall_state;
+ /* Syscall number corresponding to syscall_state. */
+ int syscall_number = -1;
+
/* The processor core this LWP was last seen on. */
int core = -1;
@@ -134,13 +134,7 @@ proc check_return_from_syscall { syscall { pattern "" } } {
return 1
}
-re -wrap ".*Breakpoint $decimal, main .*" {
- # On Powerpc the kernel does not report the returned from
- # syscall as expected by the test. GDB bugzilla 28623.
- if { [istarget "powerpc64*-linux*"] } {
- xfail $thistest
- } else {
- fail $thistest
- }
+ fail $thistest
return 0
}
}