Message ID | 20200124030222.13854-1-palves@redhat.com |
---|---|
State | New |
Headers | show |
On 2020-01-23 10:02 p.m., Pedro Alves wrote: > This fixes a latent bug exposed by the multi-target patch (5b6d1e4fa > "Multi-target support). > > The symptom described in the bug is that starting a first inferior, > then trying to run a second (multi-threaded) inferior twice, causes > libthread_db to fail to load, along with other erratic behavior: > > (gdb) run > Starting program: /tmp/foo > warning: td_ta_new failed: generic error > > Going a bit deeply, I found that if the two inferiors have different > symbols, we can see that just after inferior 2 exits, we are left with > inferior 2 selected, which is correct, but the symbols in scope belong > to inferior 1, which is obviously incorrect... > > This problem is that there's a path in > scoped_restore_current_thread::restore() that switches to no thread > selected, and switches the current inferior, but leaves the current > program space as is, resulting in leaving the program space pointing > to the wrong program space (the one of the other inferior). This was > happening after handling TARGET_WAITKIND_NO_RESUMED, which is an event > that triggers after TARGET_WAITKIND_EXITED for the previous inferior > exit. Subsequent symbol lookups find the symbols of the wrong > inferior. > > The fix is to use switch_to_inferior_no_thread in that problem spot. > This function was recently added along with the multi-target work > exactly for these situations. > > As for testing, this patch adds a new testcase that tests symbol > printing just after inferior exit, which exercises the root cause of > the problem more directly. And then, to cover the use case described > in the bug too, it also exercises the lithread_db.so mis-loading, by > using TLS printing as a proxy for being sure that threaded debugging > was activated sucessfully. The testcase fails without the fix like > this, for the "print symbol just after exit" bits: > > ... > [Inferior 1 (process 8719) exited normally] > (gdb) PASS: gdb.multi/multi-re-run.exp: re_run_inf=1: iter=1: continue until exit > print re_run_var_1 > No symbol "re_run_var_1" in current context. > (gdb) FAIL: gdb.multi/multi-re-run.exp: re_run_inf=1: iter=1: print re_run_var_1 > ... > > And like this for the "libthread_db.so loading" bits: > > (gdb) run > Starting program: /home/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.multi/multi-re-run/multi-re-run > warning: td_ta_new failed: generic error > [New LWP 27001] > > Thread 1.1 "multi-re-run" hit Breakpoint 3, all_started () at /home/pedro/gdb/binutils-gdb/build/../src/gdb/testsuite/gdb.multi/multi-re-run.c:44 > 44 } > (gdb) PASS: gdb.multi/multi-re-run.exp: re_run_inf=1: iter=2: running to all_started in runto > print tls_var > Cannot find thread-local storage for LWP 27000, executable file /home/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.multi/multi-re-run/multi-re-run: > Cannot find thread-local variables on this target > (gdb) FAIL: gdb.multi/multi-re-run.exp: re_run_inf=1: iter=2: print tls_var Thanks, that fixes it for me (and LGTM). Simon
On Friday, January 24, 2020 4:02 AM, Pedro Alves wrote: > This fixes a latent bug exposed by the multi-target patch (5b6d1e4fa > "Multi-target support). The patch led to the assertion violation below when running gdb.threads/vfork-follow-child-exit.exp. gdb/progspace.c:243: internal-error: void set_current_program_space(program_space*): Assertion `pspace != NULL' failed. Thanks. -Baris > The symptom described in the bug is that starting a first inferior, > then trying to run a second (multi-threaded) inferior twice, causes > libthread_db to fail to load, along with other erratic behavior: > > (gdb) run > Starting program: /tmp/foo > warning: td_ta_new failed: generic error > > Going a bit deeply, I found that if the two inferiors have different > symbols, we can see that just after inferior 2 exits, we are left with > inferior 2 selected, which is correct, but the symbols in scope belong > to inferior 1, which is obviously incorrect... > > This problem is that there's a path in > scoped_restore_current_thread::restore() that switches to no thread > selected, and switches the current inferior, but leaves the current > program space as is, resulting in leaving the program space pointing > to the wrong program space (the one of the other inferior). This was > happening after handling TARGET_WAITKIND_NO_RESUMED, which is an event > that triggers after TARGET_WAITKIND_EXITED for the previous inferior > exit. Subsequent symbol lookups find the symbols of the wrong > inferior. > > The fix is to use switch_to_inferior_no_thread in that problem spot. > This function was recently added along with the multi-target work > exactly for these situations. > > As for testing, this patch adds a new testcase that tests symbol > printing just after inferior exit, which exercises the root cause of > the problem more directly. And then, to cover the use case described > in the bug too, it also exercises the lithread_db.so mis-loading, by > using TLS printing as a proxy for being sure that threaded debugging > was activated sucessfully. The testcase fails without the fix like > this, for the "print symbol just after exit" bits: > > ... > [Inferior 1 (process 8719) exited normally] > (gdb) PASS: gdb.multi/multi-re-run.exp: re_run_inf=1: iter=1: continue until exit > print re_run_var_1 > No symbol "re_run_var_1" in current context. > (gdb) FAIL: gdb.multi/multi-re-run.exp: re_run_inf=1: iter=1: print re_run_var_1 > ... > > And like this for the "libthread_db.so loading" bits: > > (gdb) run > Starting program: /home/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.multi/multi-re-run/multi-re-run > warning: td_ta_new failed: generic error > [New LWP 27001] > > Thread 1.1 "multi-re-run" hit Breakpoint 3, all_started () at /home/pedro/gdb/binutils- > gdb/build/../src/gdb/testsuite/gdb.multi/multi-re-run.c:44 > 44 } > (gdb) PASS: gdb.multi/multi-re-run.exp: re_run_inf=1: iter=2: running to all_started in runto > print tls_var > Cannot find thread-local storage for LWP 27000, executable file /home/pedro/gdb/binutils- > gdb/build/gdb/testsuite/outputs/gdb.multi/multi-re-run/multi-re-run: > Cannot find thread-local variables on this target > (gdb) FAIL: gdb.multi/multi-re-run.exp: re_run_inf=1: iter=2: print tls_var > > gdb/ChangeLog: > yyyy-mm-dd Pedro Alves <palves@redhat.com> > > PR gdb/25410 > * thread.c (scoped_restore_current_thread::restore): Use > switch_to_inferior_no_thread. > > gdb/testsuite/ChangeLog: > yyyy-mm-dd Pedro Alves <palves@redhat.com> > > PR gdb/25410 > * gdb.multi/multi-re-run-1.c: New. > * gdb.multi/multi-re-run-2.c: New. > * gdb.multi/multi-re-run.exp: New. > --- > gdb/testsuite/gdb.multi/multi-re-run-1.c | 61 ++++++++++++++++ > gdb/testsuite/gdb.multi/multi-re-run-2.c | 61 ++++++++++++++++ > gdb/testsuite/gdb.multi/multi-re-run.exp | 115 +++++++++++++++++++++++++++++++ > gdb/thread.c | 5 +- > 4 files changed, 238 insertions(+), 4 deletions(-) > create mode 100644 gdb/testsuite/gdb.multi/multi-re-run-1.c > create mode 100644 gdb/testsuite/gdb.multi/multi-re-run-2.c > create mode 100644 gdb/testsuite/gdb.multi/multi-re-run.exp > > diff --git a/gdb/testsuite/gdb.multi/multi-re-run-1.c b/gdb/testsuite/gdb.multi/multi-re-run-1.c > new file mode 100644 > index 0000000000..91b0dbce30 > --- /dev/null > +++ b/gdb/testsuite/gdb.multi/multi-re-run-1.c > @@ -0,0 +1,61 @@ > +/* This testcase is part of GDB, the GNU debugger. > + > + Copyright 2020 Free Software Foundation, Inc. > + > + This program is free software; you can redistribute it and/or modify > + it under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3 of the License, or > + (at your option) any later version. > + > + This program is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + GNU General Public License for more details. > + > + You should have received a copy of the GNU General Public License > + along with this program. If not, see <http://www.gnu.org/licenses/>. */ > + > +#include <stdlib.h> > +#include <stdio.h> > +#include <unistd.h> > +#include <limits.h> > +#include <string.h> > +#include <pthread.h> > + > +int re_run_var_1 = 1; > + > +#define NUM_THREADS 1 > + > +__thread int tls_var = 1; > + > +static pthread_barrier_t barrier; > + > +static void * > +thread_start (void *arg) > +{ > + pthread_barrier_wait (&barrier); > + > + while (1) > + sleep (1); > + return NULL; > +} > + > +static void > +all_started (void) > +{ > +} > + > +int > +main (int argc, char ** argv) > +{ > + pthread_t thread; > + int len; > + > + pthread_barrier_init (&barrier, NULL, NUM_THREADS + 1); > + pthread_create (&thread, NULL, thread_start, NULL); > + > + pthread_barrier_wait (&barrier); > + all_started (); > + > + return 0; > +} > diff --git a/gdb/testsuite/gdb.multi/multi-re-run-2.c b/gdb/testsuite/gdb.multi/multi-re-run-2.c > new file mode 100644 > index 0000000000..6925e0cb2d > --- /dev/null > +++ b/gdb/testsuite/gdb.multi/multi-re-run-2.c > @@ -0,0 +1,61 @@ > +/* This testcase is part of GDB, the GNU debugger. > + > + Copyright 2020 Free Software Foundation, Inc. > + > + This program is free software; you can redistribute it and/or modify > + it under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3 of the License, or > + (at your option) any later version. > + > + This program is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + GNU General Public License for more details. > + > + You should have received a copy of the GNU General Public License > + along with this program. If not, see <http://www.gnu.org/licenses/>. */ > + > +#include <stdlib.h> > +#include <stdio.h> > +#include <unistd.h> > +#include <limits.h> > +#include <string.h> > +#include <pthread.h> > + > +int re_run_var_2 = 2; > + > +#define NUM_THREADS 1 > + > +__thread int tls_var = 1; > + > +static pthread_barrier_t barrier; > + > +static void * > +thread_start (void *arg) > +{ > + pthread_barrier_wait (&barrier); > + > + while (1) > + sleep (1); > + return NULL; > +} > + > +static void > +all_started (void) > +{ > +} > + > +int > +main (int argc, char ** argv) > +{ > + pthread_t thread; > + int len; > + > + pthread_barrier_init (&barrier, NULL, NUM_THREADS + 1); > + pthread_create (&thread, NULL, thread_start, NULL); > + > + pthread_barrier_wait (&barrier); > + all_started (); > + > + return 0; > +} > diff --git a/gdb/testsuite/gdb.multi/multi-re-run.exp b/gdb/testsuite/gdb.multi/multi-re-run.exp > new file mode 100644 > index 0000000000..daa4206e0e > --- /dev/null > +++ b/gdb/testsuite/gdb.multi/multi-re-run.exp > @@ -0,0 +1,115 @@ > +# Copyright 2020 Free Software Foundation, Inc. > + > +# This program is free software; you can redistribute it and/or modify > +# it under the terms of the GNU General Public License as published by > +# the Free Software Foundation; either version 3 of the License, or > +# (at your option) any later version. > +# > +# This program is distributed in the hope that it will be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +# GNU General Public License for more details. > +# > +# You should have received a copy of the GNU General Public License > +# along with this program. If not, see <http://www.gnu.org/licenses/>. > + > +# Test loading two inferiors into GDB, and running one of them twice > +# in a row. GDB used to have a bug that made it so that after an > +# inferior exit, the current program space was left pointing to the > +# wrong inferior's pspace, causing subsequent symbol lookups to > +# misbehave, including failing to load libthread_db.so. See PR > +# gdb/25410. > + > +# Build two executables, with different symbols. > + > +set exec1 "multi-re-run-1" > +set srcfile1 multi-re-run-1.c > +set binfile1 [standard_output_file ${exec1}] > + > +set exec2 "multi-re-run-2" > +set srcfile2 multi-re-run-2.c > +set binfile2 [standard_output_file ${exec2}] > + > +with_test_prefix "exec1" { > + if { [prepare_for_testing "failed to prepare" ${exec1} "${srcfile1}" \ > + [list pthreads debug]] } { > + return -1 > + } > +} > + > +with_test_prefix "exec2" { > + if { [prepare_for_testing "failed to prepare" ${exec2} "${srcfile2}" \ > + [list pthreads debug]] } { > + return -1 > + } > +} > + > +# Start two inferiors, leave one stopped, and run the other a couple > +# times. RE_RUN_INF is the inferior that is re-run. > + > +proc test_re_run {re_run_inf} { > + global binfile1 binfile2 > + global inferior_exited_re > + global gdb_prompt > + > + clean_restart ${binfile1} > + > + delete_breakpoints > + > + # Start another inferior. > + gdb_test "add-inferior" "Added inferior 2.*" \ > + "add empty inferior 2" > + gdb_test "inferior 2" "Switching to inferior 2.*" \ > + "switch to inferior 2" > + gdb_load ${binfile2} > + > + if {$re_run_inf == 1} { > + set steady_inf 2 > + } else { > + set steady_inf 1 > + } > + > + gdb_test "inferior $steady_inf" "Switching to inferior $steady_inf.*" \ > + "switch to steady inferior" > + > + # Run the steady inferior to a breakpoint, and let it stay stopped > + # there. > + if ![runto all_started message] then { > + untested "setup failed" > + return 0 > + } > + > + gdb_test "inferior $re_run_inf" "Switching to inferior $re_run_inf.*" \ > + "switch to re-run inferior" > + > + # Now run the RE_RUN_INF inferior a couple times. GDB used to > + # have a bug that caused the second run to fail to load > + # libthread_db.so. > + foreach_with_prefix iter {1 2} { > + delete_breakpoints > + > + if ![runto all_started message] { > + return 0 > + } > + > + # If a thread_stratum fails to load, then TLS debugging fails > + # too. > + gdb_test "print tls_var" " = 1" > + > + gdb_continue_to_end "" continue 1 > + > + # In the original bug, after an inferior exit, GDB would leave > + # the current program space pointing to the wrong inferior's > + # pspace, and thus the wrong symbols were visible. > + if {$re_run_inf == 1} { > + gdb_test "print re_run_var_1" " = 1" > + } else { > + gdb_test "print re_run_var_2" " = 2" > + } > + } > +} > + > +# For completeness, test re-running either inferior 1 or inferior 2. > +foreach_with_prefix re_run_inf {1 2} { > + test_re_run $re_run_inf > +} > diff --git a/gdb/thread.c b/gdb/thread.c > index 001fdd42c5..302a49e984 100644 > --- a/gdb/thread.c > +++ b/gdb/thread.c > @@ -1429,10 +1429,7 @@ scoped_restore_current_thread::restore () > && m_inf->pid != 0) > switch_to_thread (m_thread); > else > - { > - switch_to_no_thread (); > - set_current_inferior (m_inf); > - } > + switch_to_inferior_no_thread (m_inf); > > /* The running state of the originally selected thread may have > changed, so we have to recheck it here. */ > > base-commit: 1ba1ac88011703abcd0271e4f5d00927dc69a09a > -- > 2.14.5 Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Gary Kershaw Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928
On 2020-01-24 6:22 a.m., Aktemur, Tankut Baris wrote: > On Friday, January 24, 2020 4:02 AM, Pedro Alves wrote: >> This fixes a latent bug exposed by the multi-target patch (5b6d1e4fa >> "Multi-target support). > > The patch led to the assertion violation below when running > gdb.threads/vfork-follow-child-exit.exp. > > gdb/progspace.c:243: internal-error: void set_current_program_space(program_space*): Assertion `pspace != NULL' failed. Oh, I see that too. Simon
diff --git a/gdb/testsuite/gdb.multi/multi-re-run-1.c b/gdb/testsuite/gdb.multi/multi-re-run-1.c new file mode 100644 index 0000000000..91b0dbce30 --- /dev/null +++ b/gdb/testsuite/gdb.multi/multi-re-run-1.c @@ -0,0 +1,61 @@ +/* This testcase is part of GDB, the GNU debugger. + + Copyright 2020 Free Software Foundation, Inc. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. */ + +#include <stdlib.h> +#include <stdio.h> +#include <unistd.h> +#include <limits.h> +#include <string.h> +#include <pthread.h> + +int re_run_var_1 = 1; + +#define NUM_THREADS 1 + +__thread int tls_var = 1; + +static pthread_barrier_t barrier; + +static void * +thread_start (void *arg) +{ + pthread_barrier_wait (&barrier); + + while (1) + sleep (1); + return NULL; +} + +static void +all_started (void) +{ +} + +int +main (int argc, char ** argv) +{ + pthread_t thread; + int len; + + pthread_barrier_init (&barrier, NULL, NUM_THREADS + 1); + pthread_create (&thread, NULL, thread_start, NULL); + + pthread_barrier_wait (&barrier); + all_started (); + + return 0; +} diff --git a/gdb/testsuite/gdb.multi/multi-re-run-2.c b/gdb/testsuite/gdb.multi/multi-re-run-2.c new file mode 100644 index 0000000000..6925e0cb2d --- /dev/null +++ b/gdb/testsuite/gdb.multi/multi-re-run-2.c @@ -0,0 +1,61 @@ +/* This testcase is part of GDB, the GNU debugger. + + Copyright 2020 Free Software Foundation, Inc. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. */ + +#include <stdlib.h> +#include <stdio.h> +#include <unistd.h> +#include <limits.h> +#include <string.h> +#include <pthread.h> + +int re_run_var_2 = 2; + +#define NUM_THREADS 1 + +__thread int tls_var = 1; + +static pthread_barrier_t barrier; + +static void * +thread_start (void *arg) +{ + pthread_barrier_wait (&barrier); + + while (1) + sleep (1); + return NULL; +} + +static void +all_started (void) +{ +} + +int +main (int argc, char ** argv) +{ + pthread_t thread; + int len; + + pthread_barrier_init (&barrier, NULL, NUM_THREADS + 1); + pthread_create (&thread, NULL, thread_start, NULL); + + pthread_barrier_wait (&barrier); + all_started (); + + return 0; +} diff --git a/gdb/testsuite/gdb.multi/multi-re-run.exp b/gdb/testsuite/gdb.multi/multi-re-run.exp new file mode 100644 index 0000000000..daa4206e0e --- /dev/null +++ b/gdb/testsuite/gdb.multi/multi-re-run.exp @@ -0,0 +1,115 @@ +# Copyright 2020 Free Software Foundation, Inc. + +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. + +# Test loading two inferiors into GDB, and running one of them twice +# in a row. GDB used to have a bug that made it so that after an +# inferior exit, the current program space was left pointing to the +# wrong inferior's pspace, causing subsequent symbol lookups to +# misbehave, including failing to load libthread_db.so. See PR +# gdb/25410. + +# Build two executables, with different symbols. + +set exec1 "multi-re-run-1" +set srcfile1 multi-re-run-1.c +set binfile1 [standard_output_file ${exec1}] + +set exec2 "multi-re-run-2" +set srcfile2 multi-re-run-2.c +set binfile2 [standard_output_file ${exec2}] + +with_test_prefix "exec1" { + if { [prepare_for_testing "failed to prepare" ${exec1} "${srcfile1}" \ + [list pthreads debug]] } { + return -1 + } +} + +with_test_prefix "exec2" { + if { [prepare_for_testing "failed to prepare" ${exec2} "${srcfile2}" \ + [list pthreads debug]] } { + return -1 + } +} + +# Start two inferiors, leave one stopped, and run the other a couple +# times. RE_RUN_INF is the inferior that is re-run. + +proc test_re_run {re_run_inf} { + global binfile1 binfile2 + global inferior_exited_re + global gdb_prompt + + clean_restart ${binfile1} + + delete_breakpoints + + # Start another inferior. + gdb_test "add-inferior" "Added inferior 2.*" \ + "add empty inferior 2" + gdb_test "inferior 2" "Switching to inferior 2.*" \ + "switch to inferior 2" + gdb_load ${binfile2} + + if {$re_run_inf == 1} { + set steady_inf 2 + } else { + set steady_inf 1 + } + + gdb_test "inferior $steady_inf" "Switching to inferior $steady_inf.*" \ + "switch to steady inferior" + + # Run the steady inferior to a breakpoint, and let it stay stopped + # there. + if ![runto all_started message] then { + untested "setup failed" + return 0 + } + + gdb_test "inferior $re_run_inf" "Switching to inferior $re_run_inf.*" \ + "switch to re-run inferior" + + # Now run the RE_RUN_INF inferior a couple times. GDB used to + # have a bug that caused the second run to fail to load + # libthread_db.so. + foreach_with_prefix iter {1 2} { + delete_breakpoints + + if ![runto all_started message] { + return 0 + } + + # If a thread_stratum fails to load, then TLS debugging fails + # too. + gdb_test "print tls_var" " = 1" + + gdb_continue_to_end "" continue 1 + + # In the original bug, after an inferior exit, GDB would leave + # the current program space pointing to the wrong inferior's + # pspace, and thus the wrong symbols were visible. + if {$re_run_inf == 1} { + gdb_test "print re_run_var_1" " = 1" + } else { + gdb_test "print re_run_var_2" " = 2" + } + } +} + +# For completeness, test re-running either inferior 1 or inferior 2. +foreach_with_prefix re_run_inf {1 2} { + test_re_run $re_run_inf +} diff --git a/gdb/thread.c b/gdb/thread.c index 001fdd42c5..302a49e984 100644 --- a/gdb/thread.c +++ b/gdb/thread.c @@ -1429,10 +1429,7 @@ scoped_restore_current_thread::restore () && m_inf->pid != 0) switch_to_thread (m_thread); else - { - switch_to_no_thread (); - set_current_inferior (m_inf); - } + switch_to_inferior_no_thread (m_inf); /* The running state of the originally selected thread may have changed, so we have to recheck it here. */