[RFC] Set process affinity in test to work around ARM ptrace bug
Commit Message
Pedro Alves <palves@redhat.com> writes:
> I also think that whatever workaround, if any, should be limited
> to known-broken kernels. Otherwise, this is likely to mask
> other problems going forward. Maybe all we have is the version
> number to work with, but that's still better than unconditionally
> enabling this on arm.
The updated version adds a linux kernel version check.
Comments
Ping.
On Mon, Jul 4, 2016 at 11:49 AM, Yao Qi <qiyaoltc@gmail.com> wrote:
> Pedro Alves <palves@redhat.com> writes:
>
>> I also think that whatever workaround, if any, should be limited
>> to known-broken kernels. Otherwise, this is likely to mask
>> other problems going forward. Maybe all we have is the version
>> number to work with, but that's still better than unconditionally
>> enabling this on arm.
>
> The updated version adds a linux kernel version check.
>
> --
> Yao (齐尧)
> From 27fe094e6a99929f8f281d88beaa599771550025 Mon Sep 17 00:00:00 2001
> From: Yao Qi <yao.qi@linaro.org>
> Date: Mon, 27 Jun 2016 08:45:16 +0100
> Subject: [PATCH] Set process affinity in test to work around ARM ptrace bug
>
> We recently found a ARM kernel ptrace bug
> http://lists.infradead.org/pipermail/linux-arm-kernel/2016-May/431962.html
> As a result of this bug, after GDB ptrace set VFP registers, the hardware
> registers may not be updated. This bug causes some intermittent fails in
> tests, like return.exp, call-rt-st.exp, callfuncs.exp, etc.
>
> The bug was introduced by 8130b9d7b9d858aa04ce67805e8951e3cb6e9b2f
> in 2012 and is fixed in e2dfb4b880146bfd4b6aa8e138c0205407cebbaf in May.
> The bug is fixed in ARM kernel tree, but it is impractical to upgrade
> linux kernel from git tree or most recently release. I am wondering
> we can workaround this kernel bug somehow.
>
> My first attempt is to workaround it in GDB, so that GDB still writes
> the VFP registers and sync them to hardware. The kernel patch is quite
> simple, which moves vfp_flush_hwstate one line below. Probably, we can
> call ptrace set vfp registers twice, and then the second vfp set can
> flush the state correctly. Unfortunately, it doesn't work, because
> every time of ptrace set, kernel loads VFP registers from hardware first,
> which might be out of date after the first ptrace set. That is to say,
> we can't workaround this kernel bug in GDB.
>
> Then, I am thinking we can workaround this bug in testing, because the
> intermittent fails are confusing in comparing test results. We can bind
> both tracer and tracee on the same core. For example, we can start GDB
> or GDBserver with "taskset -c 0 ", but this is a global change, may
> have some affects on gdb.threads tests. I also think about doing
> "taskset -p PID -c 0" in test harness after the inferior is started,
> and do the same to the parent process of inferior (which is either GDB
> or GDBserver), but don't know how to get GDB (in remote host) and
> GDBserver's process id.
>
> The approach in this patch is to have a small c function which sets
> both process affinity and its parent's affinity to core 0 if the target
> is arm linux and the kernel version is known broken having the ptrace
> bug setting VFP registers. The function set_process_affinity should
> be called in these tests explicitly, but other tests are not affected
> at all.
>
> Note that this kernel bug only exists between commits
> 8130b9d7b9d858aa04ce67805e8951e3cb6e9b2f and e2dfb4b880146bfd4b6aa8e138c0205407cebbaf
> However, a certain commit will be merged to many branches and releases,
> which makes version checks complicated. I checked all released kernels,
> and get a list of versions that this bug is fixed. Not all longterm
> kernels on kernel.org have this bug fix, I don't know why, for example,
> some 3.x kernels doesn't have this bug fix.
>
> Secondly, kernels older than 8130b9d7b9d858aa04ce67805e8951e3cb6e9b2f
> are not affected by this bug, so the official kernel releases older
> than 3.0.21 or 3.2.6 are not affected by this bug, but I think the
> distro may backport the commit to their older kernel, so it makes few
> sense to check kernel is older than some versions (3.0.21 and 3.2.6).
>
> gdb/testsuite:
>
> 2016-07-04 Yao Qi <yao.qi@linaro.org>
>
> * lib/set_process_affinity.c: New file.
>
> * gdb.arch/arm-neon.c: Include lib/set_process_affinity.c.
> (main): Call set_process_affinity.
> * gdb.base/callfuncs.c: Likewise.
> * gdb.base/call-rt-st.c: Likewise.
> * gdb.base/gnu_vector.c: Likewise.
> * gdb.base/return.c: Likewise.
> * gdb.base/return2.c: Likewise.
> * gdb.base/store.c: Likewise.
> * gdb.base/structs.c: Likewise.
> * gdb.arch/arm-neon.exp: Set breakpoint and continue to
> breakpoint.
> * gdb.base/gnu_vector.exp: Likewise.
>
On 07/25/2016 02:22 PM, Yao Qi wrote:
> Ping.
Thanks.
Hmm. Seeing that the kernel fix was backported to so many
stable releases (positively) surprised me. In that case, I question
the testsuite workaround a bit harder. If this was a workaround in
gdb or gdbserver themselves, then it be more clear to me that the workaround
would be going to a broad set of users for whom updating the kernel is not easy.
But since this is only for when running the testsuite alone, I could argue that
this masks the problem and thus makes it look like gdb works better on an
affected system than it really does. I think if I were working on gdb/gdbserver
on arm, I'd much prefer if gdb told me my system had a broken ptrace, so I
could act on it, rather than masking it off and pretend all is well.
How about we make gdb / gdbserver detect bad kernel version, and output a
warning to the effect? We already have precedent in nat/linux-ptrace.c.
I think we should probably do that regardless of any testsuite workaround.
How bad would it be to push for people to update their kernels?
From a testsuite workaround angle, instead of sprinkling
set_process_affinity calls around, what if we we added a new proc
that would be called at the top of the .exp files:
gdb_caching_proc skip_arm_vfp_tests {} {
if arm && linux && broken linux versions {
return 1
}
return 0
}
This would skip tests instead of making them pass, but how bad would
that be? I assume that people doing gdb development/testing on arm will
be able to update their kernels, and will very much want to do that.
Thanks,
Pedro Alves
@@ -15,6 +15,7 @@
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>. */
+#include "../lib/set_process_affinity.c"
#include <arm_neon.h>
#define DEF_FUNC1(N, TYPE, VALUE...) \
@@ -98,5 +99,6 @@ DEF_FUNC2 (3)
int
main (void)
{
- return 0;
+ set_process_affinity ();
+ return 0; /* breakpoint here */
}
@@ -31,6 +31,9 @@ if ![runto_main] {
return -1
}
+gdb_breakpoint [gdb_get_line_number "breakpoint here"]
+gdb_continue_to_breakpoint "breakpoint here"
+
# Test passing vectors in function argument in the inferior call.
for {set i 1} {$i <= 18} {incr i} {
@@ -1,3 +1,4 @@
+#include "../lib/set_process_affinity.c"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
@@ -565,6 +566,7 @@ int main () {
struct two_floats_t *f3;
gdb_unbuffer_output ();
+ set_process_affinity ();
/* Allocate space for large structures
*/
@@ -25,6 +25,7 @@
#define PARAMS(paramlist) paramlist
#endif
+#include "../lib/set_process_affinity.c"
# include <stdlib.h>
# include <string.h>
@@ -644,7 +645,7 @@ voidfunc (void)
int main ()
{
- void *p = malloc (1);
+ void *p = malloc (1); set_process_affinity ();
t_double_values(double_val1, double_val2);
t_structs_c(struct_val1);
free (p);
@@ -18,6 +18,7 @@
Contributed by Ken Werner <ken.werner@de.ibm.com> */
#include <stdarg.h>
+#include "../lib/set_process_affinity.c"
#define VECTOR(n, type) \
type __attribute__ ((vector_size (n * sizeof(type))))
@@ -137,7 +138,8 @@ main ()
{
int4 res;
- res = add_some_intvecs (i4a, i4a + i4b, i4b);
+ set_process_affinity ();
+ res = add_some_intvecs (i4a, i4a + i4b, i4b); /* breakpoint here */
res = add_some_intvecs (i4a, i4a + i4b, i4b);
@@ -55,6 +55,9 @@ gdb_test_multiple "show endian" "show endian" {
}
}
+gdb_breakpoint [gdb_get_line_number "breakpoint here"]
+gdb_continue_to_breakpoint "breakpoint here"
+
# Test printing of character vector types
gdb_test "print c4" "\\\$$decimal = \\{1, 2, 3, 4\\}"
gdb_test "print c4\[2\]" "\\\$$decimal = 3"
@@ -15,6 +15,7 @@
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>. */
+#include "../lib/set_process_affinity.c"
#include <stdio.h>
/* Test "return" command. */
@@ -40,6 +41,7 @@ double tmp3;
int main ()
{
+ set_process_affinity ();
func1 ();
printf("in main after func1\n");
tmp2 = func2 ();
@@ -15,6 +15,7 @@
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>. */
+#include "../lib/set_process_affinity.c"
/* Test gdb's "return" command. */
int void_test = 0;
@@ -90,6 +91,7 @@ int main (int argc, char **argv)
double double_resultval;
int i;
+ set_process_affinity ();
/* A "test load" that will insure that the function really returns
a ${type} (as opposed to just a truncated or part of a ${type}). */
for (i = 0; i < sizeof (testval.ffff); i++)
@@ -7,6 +7,8 @@
function calls within main even when no optimization flags were
passed. */
+#include "../lib/set_process_affinity.c"
+
typedef signed char charest;
charest
@@ -254,6 +256,7 @@ wack_field_4 (void)
int
main ()
{
+ set_process_affinity ();
/* These calls are for current frame test. */
wack_charest (-1, -2);
wack_short (-1, -2);
@@ -15,6 +15,8 @@
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>. */
+#include "../lib/set_process_affinity.c"
+
/* Useful abreviations. */
typedef void t;
typedef char tc;
@@ -313,6 +315,8 @@ int main()
{
int i;
+ set_process_affinity ();
+
for (i = 0; i < 256; i++)
chartest[i].c = i;
chartest[0].c = 0; /* chartest-done */
new file mode 100644
@@ -0,0 +1,98 @@
+/* Copyright (C) 2016 Free Software Foundation, Inc.
+
+ This file is part of GDB.
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+#if defined(__arm__) && defined(__linux__)
+#define _GNU_SOURCE
+#include <sched.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/utsname.h>
+#include <stdlib.h>
+
+struct version
+{
+ long major;
+ long minor;
+ long patch;
+};
+
+/* Probe the kernel version into V, and return 0 on success. */
+
+static int
+probe_kernel_version (struct version *v)
+{
+ struct utsname buffer;
+
+ if (uname (&buffer) == 0)
+ {
+ char *start, *end;
+
+ start = buffer.release;
+ v->major = strtol (start, &end, 10);
+
+ start = end + 1;
+ v->minor = strtol (start, &end, 10);
+
+ start = end + 1;
+ v->patch = strtol (start, &end, 10);
+ return 0;
+ }
+ else
+ return -1;
+}
+
+#define VERSION_NEWER_THAN(VER, MAJOR, MINOR, PATCH) \
+ VER.major == MAJOR && VER.minor == MINOR && VER.patch >= PATCH
+
+#endif
+
+static void
+set_process_affinity (void)
+{
+#if defined(__arm__) && defined(__linux__)
+ struct version kernel;
+ cpu_set_t my_set;
+
+ if (probe_kernel_version (&kernel))
+ {
+ /* Can't get kernel version, do nothing. */
+ return;
+ }
+
+ if (kernel.major >= 5
+ || (kernel.major == 4 && kernel.minor >= 7) /* 4.7 and later */
+ || VERSION_NEWER_THAN (kernel, 4, 6, 3)
+ || VERSION_NEWER_THAN (kernel, 4, 4, 14)
+ || VERSION_NEWER_THAN (kernel, 4, 1, 27)
+ || VERSION_NEWER_THAN (kernel, 3, 18, 36)
+ || VERSION_NEWER_THAN (kernel, 3, 14, 73))
+ {
+ /* Kernel is new enough to have bug fixed, do nothing. */
+ return;
+ }
+
+ /* Set both process and parent process (GDB)'s affinity on core 0 to
+ workaround ARM linux kernel ptrace bug which doesn't flush the VFP
+ state to hardware after ptrace set VFP registers. */
+
+ CPU_ZERO (&my_set);
+ CPU_SET (0, &my_set);
+
+ sched_setaffinity (0, sizeof(cpu_set_t), &my_set);
+ sched_setaffinity (getppid (), sizeof(cpu_set_t), &my_set);
+#endif
+}