[RFC] tests/run-stack-live-test.sh: prototype 'live' eu-stack testing
Commit Message
Missing a few pieces, but worth sharing as an RFC. My idea is to
ensure better test coverage for eu-stack and then
eu-stacktrace+libdwfl_stacktrace by running against a live process
with known content, stopped at a known location, and aggressively
scrubbing output that's known to vary from testrun to testrun.
This is a very basic preview of how that might look. If the approach
is sound, I hope to make it more sophisticated/reliable.
Unanswered questions:
- Scrub more data (e.g. libc symvers), and test against a program whose
  content we control. Scrub stack frame numbers to account for cases
  where extra frames appear / are missing at the bottom of the stack?
- Something better than sed for the scrubbing?
- An equivalent eu-stacktrace test will require privileged perf_events
  access for profiling data and is therefore likely to be skipped by
  default. How feasible is it to enable it on the buildbots, though?
* tests/run-stack-live-test.sh: New test with wild and fuzzy
testrun_compare variant that scrubs inherently unpredictable parts of
the data. Needs to scrub even more.
* tests/Makefile.am (TESTS): Add run-stack-live-test.sh.
---
tests/Makefile.am | 3 +-
tests/run-stack-live-test.sh | 64 ++++++++++++++++++++++++++++++++++++
2 files changed, 66 insertions(+), 1 deletion(-)
create mode 100755 tests/run-stack-live-test.sh
Comments
Hi Serhei,
On Thu, 2025-05-08 at 18:28 -0400, Serhei Makarov wrote:
> Missing a few pieces, but worth sharing as an RFC. My idea is to
> ensure better test coverage for eu-stack and then
> eu-stacktrace+libdwfl_stacktrace by running against a live process
> with known content, stopped at a known location, and aggressively
> scrubbing output that's known to vary from testrun to testrun.
>
> This is a very basic preview of how that might look. If the approach
> is sound, I hope to make it more sophisticated/reliable.
I like the testrun_compare_fuzzy name.
> Unanswered questions:
> - Scrub more data (e.g. libc symvers) from a more known program.
> Scrub stack frame numbers to account for a case where extra frames
> appear / are missing at the bottom of the stack?
Yes, just remove everything after @... (the symver).
Maybe look at gcc optimization suffixes like .isra, .constprop, .clone,
.ipa (I don't know if there is a definitive list).
Try to stick to plain C, otherwise think about demangling (or not?).
Scrub everything after you have seen main (or in multi-threaded
programs clone), so things like __libc_start_call_main, _start?
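A rough sketch of the scrubbing suggested above, for illustration only. The helper name scrub_symbols is hypothetical, the suffix list is a guess rather than a definitive one (.part and .cold are added here as further suffixes gcc is known to emit), and the symver pattern assumes versions consist of letters, digits, '_' and '.':

```shell
# Hypothetical helper: reads eu-stack output on stdin.
scrub_symbols()
{
  sed -E '
    # Drop the symbol version: "sym@VER" or "sym@@VER" becomes "sym".
    s/@@?[A-Za-z0-9_.]+//
    # Drop gcc optimization suffixes such as ".isra.0" (list is a guess).
    s/\.(isra|constprop|part|cold|clone|ipa)(\.[0-9]+)?//g
    # Print main (or clone) and stop, so libc startup frames are ignored.
    / (main|clone)$/q
  '
}
```

Because the last command quits at the first main or clone frame, this only covers single-threaded output; a multi-threaded test would need to reset at each TID line instead.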
> - Something better than sed for the scrubbing?
It works, so...
> - An equivalent eu-stacktrace test will require privileged perf_events
> access for profiling data and therefore likely to be skipped by
> default. How feasible is it to be enabled on the buildbots, though?
Most probably not generically. Especially not the container builders.
But you might make some deal with a specific direct hardware worker
admin?
Cheers,
Mark
On Thu, May 8, 2025 at 6:28 PM Serhei Makarov <serhei@serhei.io> wrote:
>
> Missing a few pieces, but worth sharing as an RFC. My idea is to
> ensure better test coverage for eu-stack and then
> eu-stacktrace+libdwfl_stacktrace by running against a live process
> with known content, stopped at a known location, and aggressively
> scrubbing output that's known to vary from testrun to testrun.
>
> This is a very basic preview of how that might look. If the approach
> is sound, I hope to make it more sophisticated/reliable.
>
> Unanswered questions:
> - Scrub more data (e.g. libc symvers) from a more known program.
> Scrub stack frame numbers to account for a case where extra frames
> appear / are missing at the bottom of the stack?
> - Something better than sed for the scrubbing?
Sed is fine, we already use it (along with cut) in other test scripts
for scrubbing components of output that might differ between runs or
across environments. run-dwfl-core-noncontig.sh and run-sysroot.sh are
two examples.
> - An equivalent eu-stacktrace test will require privileged perf_events
> access for profiling data and therefore likely to be skipped by
> default. How feasible is it to be enabled on the buildbots, though?
>
> * tests/run-stack-live-test.sh: New test with wild and fuzzy
> testrun_compare variant that scrubs inherently unpredictable parts of
> the data. Needs to scrub even more.
> * tests/Makefile.am (TESTS): Add run-stack-live-test.sh.
> ---
> tests/Makefile.am | 3 +-
> tests/run-stack-live-test.sh | 64 ++++++++++++++++++++++++++++++++++++
> 2 files changed, 66 insertions(+), 1 deletion(-)
> create mode 100755 tests/run-stack-live-test.sh
>
> diff --git a/tests/Makefile.am b/tests/Makefile.am
> index 00ba754d..ecd514c7 100644
> --- a/tests/Makefile.am
> +++ b/tests/Makefile.am
> @@ -191,7 +191,8 @@ TESTS = run-arextract.sh run-arsymtest.sh run-ar.sh newfile test-nlist \
> run-backtrace-core-s390x.sh run-backtrace-core-s390.sh \
> run-backtrace-core-aarch64.sh run-backtrace-core-sparc.sh \
> run-backtrace-demangle.sh run-stack-d-test.sh run-stack-i-test.sh \
> - run-stack-demangled-test.sh run-readelf-zx.sh run-readelf-zp.sh \
> + run-stack-demangled-test.sh run-stack-live-test.sh \
> + run-readelf-zx.sh run-readelf-zp.sh \
> run-readelf-arm-flags.sh \
> run-readelf-addr.sh run-readelf-str.sh \
> run-readelf-multi-noline.sh \
> diff --git a/tests/run-stack-live-test.sh b/tests/run-stack-live-test.sh
> new file mode 100755
> index 00000000..808421bb
> --- /dev/null
> +++ b/tests/run-stack-live-test.sh
> @@ -0,0 +1,64 @@
> +#! /bin/sh
> +# Copyright (C) 2025 Red Hat, Inc.
> +# This file is part of elfutils.
> +#
> +# This file is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# elfutils is distributed in the hope that it will be useful, but
> +# WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program. If not, see <http://www.gnu.org/licenses/>.
> +
> +. $srcdir/test-subr.sh
> +
> +# Depending on whether we are running make check or make installcheck
> +# the actual binary name under test might be different. It is used in
> +# the error message, which we also try to match.
> +if test "$elfutils_testrun" = "installed"; then
> +STACKCMD=${bindir}/`program_transform stack`
> +else
> +STACKCMD=${abs_top_builddir}/src/stack
> +fi
> +
> +# TODO(REVIEW): Can we make the data-scrubbing generic enough
> +# (across multiple eu-stack/eu-stacktrace test cases) to move
> +# to test-subr.sh?
If it's not quite generic enough for test-subr.sh we can always
include this in a stacktrace-subr.sh file, for instance.
> +#
> +# TODO(REVIEW): Better shell-isms for comparing file and regex?
> +# \(\s\e\d\)\+\i\s\a\d\d\i\c\t\e\d\\t\o\b\a\ck\s\l\a\s\h\e\s
I think the use of sed here is fine. If you wanted to improve
readability you could perform the scrubbing across multiple sed
commands each with a comment explaining the step.
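For instance, the single substitution could be split into commented steps along these lines. This is only a sketch: scrub_ids is a hypothetical name, it filters stdin instead of editing the file in place, and it collapses the padding before the address to one space, which the original expression preserves:

```shell
# Hypothetical helper: reads eu-stack output on stdin.
scrub_ids()
{
  sed -E '
    # "PID 169385 - process" / "TID 169385:" -> "PID nn - process" / "TID nn:"
    s/^(PID|TID) +[0-9]+/\1 nn/
    # "#0  0x00007f04a98adbd7 sym" -> "#0 nn sym"
    s/^(#[0-9]+) +0x[0-9a-f]+/\1 nn/
  '
}
```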
> +testrun_compare_fuzzy()
> +{
> + outfile="${1##*/}.out"
> + testrun_out $outfile "$@"
> + sed -i 's/\(PID\|TID\|#[0-9]\+\)\( \+\)\(\(0x\)\?[0-9a-f]\+\)/\1\2nn/g' $outfile
> + diff -u $outfile -
> +}
> +
> +# TODO: Need to scrub more data (e.g. GLIBC_ bits),
> +# and use a program whose inner content we control:
> +sleep 10 &
> +PID=$!
> +testrun_compare_fuzzy $STACKCMD -p $PID <<EOF
> +PID nn - process
> +TID nn:
> +#0 nn clock_nanosleep@GLIBC_2.2.5
> +#1 nn __nanosleep
> +#2 nn main
> +#3 nn __libc_start_call_main
> +#4 nn __libc_start_main@@GLIBC_2.34
> +#5 nn _start
> +EOF
> +# PID 169385 - process
> +# TID 169385:
> +# #0 0x00007f04a98adbd7 clock_nanosleep@GLIBC_2.2.5
> +# #1 0x00007f04a98b9c47 __nanosleep
> +# #2 0x0000561e7fdd9a9f main
> +# #3 0x00007f04a97f4088 __libc_start_call_main
> +# #4 0x00007f04a97f414b __libc_start_main@@GLIBC_2.34
> +# #5 0x0000561e7fdd9c05 _start
> --
> 2.47.0
>
FYI here is my results diff from running this test:
--- stack.out 2025-05-14 09:34:00.676383411 -0400
+++ - 2025-05-14 09:34:00.677344470 -0400
@@ -1,9 +1,8 @@
PID nn - process
TID nn:
-#0 nn __internal_syscall_cancel
-#1 nn clock_nanosleep@GLIBC_2.2.5
-#2 nn __nanosleep
-#3 nn main
-#4 nn __libc_start_call_main
-#5 nn __libc_start_main@@GLIBC_2.34
-#6 nn _start
+#0 nn clock_nanosleep@GLIBC_2.2.5
+#1 nn __nanosleep
+#2 nn main
+#3 nn __libc_start_call_main
+#4 nn __libc_start_main@@GLIBC_2.34
+#5 nn _start
FAIL run-stack-live-test.sh (exit status: 1)
Aaron
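One possible way to tolerate an extra frame like the one above is to compare only the frames between a known innermost function and main, with the frame numbers removed. A minimal sketch, assuming the addresses have already been scrubbed to "nn"; trim_frames and its argument are hypothetical:

```shell
# Hypothetical helper: $1 is the innermost function the test expects,
# e.g. clock_nanosleep. Reads scrubbed eu-stack output on stdin.
trim_frames()
{
  leaf=$1
  # Print only the frames from $leaf through main, without "#N".
  sed -n -E "/ ${leaf}(@|\$)/,/ main\$/{ s/^#[0-9]+ +//; p; }"
}
```

With this, both the output above and the expected output from the patch reduce to the same three lines (clock_nanosleep, __nanosleep, main), at the cost of no longer checking the PID/TID header lines.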
On Wed, May 14, 2025, at 9:54 AM, Mark Wielaard wrote:
> I like the testrun_compare_fuzzy name.
Nice, though I might end up with several versions of the function
(testrun_compare_fuzzy_stack, testrun_compare_fuzzy_stacktrace, ...).
It's an open question whether they need to be generalized to test-subr.sh,
to some other file, or kept in the local test case.
> Yes, just remove everything after @... (the symver).
> Maybe look at gcc optimization suffixes like .isra, .constprop, .clone,
> .ipa (don't know if there is some definite list).
> Try to stick to plain C, otherwise think about demangling (or not?).
Yes please :) I don't want to think about how consistent or not C++
mangling is across systems, although I think it's a standard procedure?
>> - Something better than sed for the scrubbing?
>
> It works, so...
Anything's good enough with enough comments!
> Most probably not generically. Especially not the container builders.
> But you might make some deal with a specific direct hardware worker
> admin?
Thankfully, it seems there's a fair chance I can test something without
needing root privileges, so I'm going to aim for that goal.
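As a sketch of how such a test might skip itself when unprivileged perf_events access is not available: on Linux the kernel.perf_event_paranoid sysctl controls this, and the automake test harness treats exit status 77 as a skip. The helper name and the threshold below are assumptions; which level eu-stacktrace actually needs is not established in this thread.

```shell
# Hypothetical helper: succeed if the perf_event_paranoid value stored
# in file $1 is at or below level $2.
perf_events_allowed()
{
  paranoid_file=$1
  max_level=$2
  # No readable sysctl file (e.g. not Linux): treat as not allowed.
  test -r "$paranoid_file" || return 1
  level=$(cat "$paranoid_file")
  test "$level" -le "$max_level" 2>/dev/null
}

# Possible use in a test script (the threshold 2 is an assumption):
# perf_events_allowed /proc/sys/kernel/perf_event_paranoid 2 || exit 77
```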
On Wed, May 14, 2025, at 10:30 AM, Aaron Merey wrote:
> If it's not quite generic enough for test-subr.sh we can always
> include this in a stacktrace-subr.sh file, for instance.
Yeah, I'm thinking this is the more likely outcome.
>> +#
>> +# TODO(REVIEW): Better shell-isms for comparing file and regex?
>> +# \(\s\e\d\)\+\i\s\a\d\d\i\c\t\e\d\\t\o\b\a\ck\s\l\a\s\h\e\s
>
> I think the use of sed here is fine. If you wanted to improve
> readability you could perform the scrubbing across multiple sed
> commands each with a comment explaining the step.
Anything's good enough with enough comments!
There's also the possibility of awk being powerful enough to make
me want to use it. Do you know if awk is considered sufficiently
standard for the testsuite to depend on it?
> FYI here is my results diff from running this test:
> -snip-
Thanks, that's another item for the dataset I got from the trybot runs.
Unfortunately, I won't know for sure until I start testing on things like
BSD. Is there a list somewhere of the Unices we're supposed to support?
Hi Serhei,
On Wed, 2025-05-14 at 16:35 -0400, Serhei Makarov wrote:
> There's also the possibility of awk being powerful enough to make
> me want to use it. Do you know if awk is considered sufficiently
> standard for the testsuite to depend on it?
We already check for GNU awk in configure for maintainer mode:
AC_CHECK_PROG(HAVE_GAWK, gawk, yes, no)
if test "$HAVE_GAWK" = "no"; then
AC_MSG_ERROR([gawk needed in maintainer mode])
fi
I don't think it would be a problem to do that for non-maintainer-mode
too.
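To illustrate what awk would add over sed here, a minimal sketch that renumbers frames, which is awkward to do in sed. It uses only POSIX awk features, so it does not depend on gawk specifically. The function name and its argument are hypothetical, and it stops at the first main frame, so it only suits single-threaded output:

```shell
# Hypothetical helper: $1 is the innermost function the test expects.
# Reads raw eu-stack output on stdin.
scrub_with_awk()
{
  awk -v leaf="$1" '
    # Keep header lines, but hide the numeric process/thread id.
    $1 == "PID" || $1 == "TID" { sub(/[0-9]+/, "nn"); print; next }
    # Frame lines look like "#N ADDRESS SYMBOL[@VERSION]".
    /^#[0-9]+/ {
      sym = $3
      sub(/@.*/, "", sym)        # strip the symbol version
      if (sym == leaf) seen = 1  # start printing at the expected leaf
      if (seen) printf "#%d nn %s\n", n++, sym
      if (sym == "main") exit    # ignore libc startup frames
    }
  '
}
```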
Cheers,
Mark
@@ -191,7 +191,8 @@ TESTS = run-arextract.sh run-arsymtest.sh run-ar.sh newfile test-nlist \
run-backtrace-core-s390x.sh run-backtrace-core-s390.sh \
run-backtrace-core-aarch64.sh run-backtrace-core-sparc.sh \
run-backtrace-demangle.sh run-stack-d-test.sh run-stack-i-test.sh \
- run-stack-demangled-test.sh run-readelf-zx.sh run-readelf-zp.sh \
+ run-stack-demangled-test.sh run-stack-live-test.sh \
+ run-readelf-zx.sh run-readelf-zp.sh \
run-readelf-arm-flags.sh \
run-readelf-addr.sh run-readelf-str.sh \
run-readelf-multi-noline.sh \
new file mode 100755
@@ -0,0 +1,64 @@
+#! /bin/sh
+# Copyright (C) 2025 Red Hat, Inc.
+# This file is part of elfutils.
+#
+# This file is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# elfutils is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+. $srcdir/test-subr.sh
+
+# Depending on whether we are running make check or make installcheck
+# the actual binary name under test might be different. It is used in
+# the error message, which we also try to match.
+if test "$elfutils_testrun" = "installed"; then
+STACKCMD=${bindir}/`program_transform stack`
+else
+STACKCMD=${abs_top_builddir}/src/stack
+fi
+
+# TODO(REVIEW): Can we make the data-scrubbing generic enough
+# (across multiple eu-stack/eu-stacktrace test cases) to move
+# to test-subr.sh?
+#
+# TODO(REVIEW): Better shell-isms for comparing file and regex?
+# \(\s\e\d\)\+\i\s\a\d\d\i\c\t\e\d\\t\o\b\a\ck\s\l\a\s\h\e\s
+testrun_compare_fuzzy()
+{
+ outfile="${1##*/}.out"
+ testrun_out $outfile "$@"
+ sed -i 's/\(PID\|TID\|#[0-9]\+\)\( \+\)\(\(0x\)\?[0-9a-f]\+\)/\1\2nn/g' $outfile
+ diff -u $outfile -
+}
+
+# TODO: Need to scrub more data (e.g. GLIBC_ bits),
+# and use a program whose inner content we control:
+sleep 10 &
+PID=$!
+testrun_compare_fuzzy $STACKCMD -p $PID <<EOF
+PID nn - process
+TID nn:
+#0 nn clock_nanosleep@GLIBC_2.2.5
+#1 nn __nanosleep
+#2 nn main
+#3 nn __libc_start_call_main
+#4 nn __libc_start_main@@GLIBC_2.34
+#5 nn _start
+EOF
+# PID 169385 - process
+# TID 169385:
+# #0 0x00007f04a98adbd7 clock_nanosleep@GLIBC_2.2.5
+# #1 0x00007f04a98b9c47 __nanosleep
+# #2 0x0000561e7fdd9a9f main
+# #3 0x00007f04a97f4088 __libc_start_call_main
+# #4 0x00007f04a97f414b __libc_start_main@@GLIBC_2.34
+# #5 0x0000561e7fdd9c05 _start
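The patch leaves the background sleep running after the comparison. A sketch of one way to clean it up, written to be safe under "set -e"; the helper name with_live_process is hypothetical and not part of the patch:

```shell
# Hypothetical helper: run "COMMAND... -p PID" against a background
# sleep and make sure that process is gone afterwards.
with_live_process()
{
  sleep 10 &
  live_pid=$!
  # Kill the target even if the shell exits early.
  trap 'kill "$live_pid" 2>/dev/null' EXIT
  "$@" -p "$live_pid" && status=0 || status=$?
  kill "$live_pid" 2>/dev/null || true
  wait "$live_pid" 2>/dev/null || true
  trap - EXIT
  return $status
}

# Possible use, with STACKCMD as set earlier in the script:
# with_live_process "$STACKCMD" > stack.out
```

This does not address the race where the command attaches before sleep has reached its nanosleep call; a test program that signals when it has reached the known location would be needed for that.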