Switch the inferior too in switch_to_program_space_and_thread (Re: [PATCH v2 08/24] Introduce switch_to_inferior_no_thread)

Message ID 9039e000-1d8c-ffcc-60bd-659b3a983a43@redhat.com
State New, archived
Headers

Commit Message

Pedro Alves Jan. 10, 2020, 2:02 a.m. UTC
  On 1/8/20 3:48 PM, Aktemur, Tankut Baris wrote:
> On Monday, December 23, 2019 8:30 PM, Pedro Alves wrote:
>>
>> On 12/20/19 6:50 PM, Pedro Alves wrote:
>>
>>> Oh wow, thanks much for this.  You're right.  I'm working on
>>> converting your example above to a testsuite testcase.
>>
>> Here it is.  I'm putting this at the end of the series, after the
>> multi-target patches, because before multi-target, the test
>> fails -- GDB sends a spurious Hg0.0 packet to the remote side.
>>
>> From 839bb32983d8afb085eeec1ac6ca499f5bbd6244 Mon Sep 17 00:00:00 2001
>> From: Aleksandar Paunovic <aleksandar.paunovic@intel.com>
>> Date: Mon, 23 Dec 2019 18:04:32 +0000
>> Subject: [PATCH] Switch the inferior too in switch_to_program_space_and_thread
>>
>> With multi-target, each inferior now has its own target connection.
>> The problem in switch_to_program_space_and_thread is that in the
>> current state GDB switches to "no thread" and also sets the program
>> space but because the inferior is not switched, potentially an
>> incorrect target remains selected.
>>
>> Here is a sample scenario that exploits this flow:
>>
>> On terminal 1, start a gdbserver on a program named foo:
>>
>>  $ gdbserver :1234 ./foo
>>
>> On terminal 2, start gdb on a program named bar.  Suppose foo and bar
>> are compiled from foo.c and bar.c.  They are completely separate.  So,
>> bar.c:2 has no meaning for foo.
>>
>>  $ gdb -q ./bar
>>  Reading symbols from ./bar...
>>  (gdb) add-inferior
>>  [New inferior 2]
>>  Added inferior 2
>>  (gdb) inferior 2
>>  [Switching to inferior 2 [<null>] (<noexec>)]
>>  (gdb) target remote :1234
>>  ...
>>  (gdb) set debug remote 2
>>  (gdb) break bar.c:2
>>  Sending packet: $Hgp0.0#ad...Packet received: OK
>>  Sending packet: $m5fa,12#f8...Packet received: E01
>>  Sending packet: $m5fa,1#c6...Packet received: E01
>>  Sending packet: $m5fb,3#c9...Packet received: E01
>>  Sending packet: $m5fe,1#ca...Packet received: E01
>>  Breakpoint 1 at 0x5fe: file bar.c, line 2.
>>  (gdb)
>>
>> Here we have an unnecessary sending of the packets to the gdbserver.
>>
>> With this fix in progspace-and-thread.c, we'll get this:
>>
>>  (gdb) break bar.c:2
>>  Breakpoint 1 at 0x5fe: file bar.c, line 2.
>>  (gdb)
>>
>> Now there is no sending of the packets to gdbserver.
>>
>> gdb/ChangeLog:
>> yyyy-mm-dd  Aleksandar Paunovic  <aleksandar.paunovic@intel.com>
>> 	    Pedro Alves  <palves@redhat.com>
>>
>> 	* progspace-and-thread.c (switch_to_program_space_and_thread):
>> 	Assert there's an inferior for PSPACE.  Use
>> 	switch_to_inferior_no_thread to switch the inferior too.
>>
>> gdb/testsuite/ChangeLog:
>> yyyy-mm-dd  Pedro Alves  <palves@redhat.com>
>>
>> 	* gdb.server/bkpt-other-inferior.exp: New file.
>> ---
>>  gdb/progspace-and-thread.c                       |  6 +-
>>  gdb/testsuite/gdb.server/bkpt-other-inferior.exp | 93 ++++++++++++++++++++++++
>>  2 files changed, 96 insertions(+), 3 deletions(-)
>>  create mode 100644 gdb/testsuite/gdb.server/bkpt-other-inferior.exp
>>
>> diff --git a/gdb/progspace-and-thread.c b/gdb/progspace-and-thread.c
>> index 3c92b5c8e0..d51748035e 100644
>> --- a/gdb/progspace-and-thread.c
>> +++ b/gdb/progspace-and-thread.c
>> @@ -25,8 +25,9 @@ void
>>  switch_to_program_space_and_thread (program_space *pspace)
>>  {
>>    inferior *inf = find_inferior_for_program_space (pspace);
>> +  gdb_assert (inf != nullptr);
> 
> 
> This creates failure in gdb.base/step-over-exit.exp.  

Thanks for spotting this!

> The problem is, a forked
> child terminates, and GDB tries to switch to the pspace of that no-longer-existing
> inferior through a remaining breakpoint location.  Would it make sense to guard
> calls to `switch_to_program_space_and_thread` as below, for example?

Yeah, I don't think so.  That feels like papering over a problem.

> 
> There are a handful other cases that can be guarded/skipped similarly.
> I'm not sure, though, if the problem is deeper and the calls to
> `switch_to_program_space_and_thread` for a non-existing inferior should have
> never occurred.  That is, should the breakpoint locations of exited inferiors
> be cleaned up much earlier, before we come to this point?

Yes.  Thing is, they have!  But then they're recreated...

Here's an updated patch that fixes that latent problem.  See the
newly expanded description therein.

From c8a096b604c41f37a91a595392a0a5c3266ec09f Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Fri, 10 Jan 2020 01:04:37 +0000
Subject: [PATCH] Switch the inferior too in switch_to_program_space_and_thread

With multi-target, each inferior now has its own target connection.
The problem in switch_to_program_space_and_thread is that in the
current state GDB switches to "no thread" and also sets the program
space but because the inferior is not switched, potentially an
incorrect target remains selected.

Here is a sample scenario that exploits this flow:

On terminal 1, start a gdbserver on a program named foo:

 $ gdbserver :1234 ./foo

On terminal 2, start gdb on a program named bar.  Suppose foo and bar
are compiled from foo.c and bar.c.  They are completely separate.  So,
bar.c:2 has no meaning for foo.

 $ gdb -q ./bar
 Reading symbols from ./bar...
 (gdb) add-inferior
 [New inferior 2]
 Added inferior 2
 (gdb) inferior 2
 [Switching to inferior 2 [<null>] (<noexec>)]
 (gdb) target remote :1234
 ...
 (gdb) set debug remote 2
 (gdb) break bar.c:2
 Sending packet: $Hgp0.0#ad...Packet received: OK
 Sending packet: $m5fa,12#f8...Packet received: E01
 Sending packet: $m5fa,1#c6...Packet received: E01
 Sending packet: $m5fb,3#c9...Packet received: E01
 Sending packet: $m5fe,1#ca...Packet received: E01
 Breakpoint 1 at 0x5fe: file bar.c, line 2.
 (gdb)

Here we have an unnecessary sending of the packets to the gdbserver.

With this fix in progspace-and-thread.c, we'll get this:

 (gdb) break bar.c:2
 Breakpoint 1 at 0x5fe: file bar.c, line 2.
 (gdb)

Now there is no sending of the packets to gdbserver.

The changes around clear_symtab_users calls are necessary because
otherwise we regress gdb.base/step-over-exit.exp, hitting the new
assertion in switch_to_program_space_and_thread.  The problem is, a
forked child terminates, and when GDB decides to auto-purge that
inferior, GDB tries to switch to the pspace of that no-longer-existing
inferior.

The root of the problem is within the program_space destructor:

program_space::~program_space ()
{
...
  set_current_program_space (this);        # (1)
...
  breakpoint_program_space_exit (this);    # (2)
...
  free_all_objfiles ();                    # (3)
...
}

We get here from delete_inferior -> delete_program_space.

So we're deleting an inferior, and the inferior to be
deleted is no longer in the inferior list.

At (2), we've deleted all the breakpoints and locations for the
program space being deleted.

The crash happens while doing a breakpoint re-set, called by
clear_symtab_users at the tail end of (3).  That is, while recreating
breakpoints for the current program space, which is the program space
we're tearing down.  During breakpoint re-set, we try to switch to the
new location's pspace (the current pspace set in (1), so the pspace
we're tearing down) with switch_to_program_space_and_thread, and that
hits the failed assertion.  It's the fact that we recreate breakpoints
in the program_space destructor that is the latent bug here.  Just
don't do that, since we've already taken care of breakpoints with
breakpoint_program_space_exit, and we don't end up in the crash situation.

My first approach to fix this added a symfile_add_flags parameter to
program_space::free_all_objfiles, and then passed that down to
clear_symtab_users.  The program_space dtor would then pass down
SYMFILE_DEFER_BP_RESET to free_all_objfiles.  I couldn't help feeling
that adding that parameter to free_all_objfiles looked a little
awkward, so I settled on something a little different -- hoist the
clear_symtab_users call to the callers.  There are only two callers.
I felt that that didn't look as odd, particularly since
remove_symbol_file_command also does:

  objf->unlink ();
  clear_symtab_users (0);

I.e., objfile deletion is already separate from calling
clear_symtab_users in some places.

I'm not hard set on this approach though; I can go with adding the
symfile_add_flags parameter to program_space::free_all_objfiles if
people prefer that.

gdb/ChangeLog:
yyyy-mm-dd  Aleksandar Paunovic  <aleksandar.paunovic@intel.com>
	    Pedro Alves  <palves@redhat.com>

	* progspace-and-thread.c (switch_to_program_space_and_thread):
	Assert there's an inferior for PSPACE.  Use
	switch_to_inferior_no_thread to switch the inferior too.
	* progspace.c (program_space::~program_space): Call
	clear_symtab_users here, with SYMFILE_DEFER_BP_RESET.
	(program_space::free_all_objfiles): Don't call clear_symtab_users
	here.
	* symfile.c (symbol_file_clear): Call clear_symtab_users here.

gdb/testsuite/ChangeLog:
yyyy-mm-dd  Pedro Alves  <palves@redhat.com>

	* gdb.server/bkpt-other-inferior.exp: New file.
---
 gdb/progspace-and-thread.c                       |  6 +-
 gdb/progspace.c                                  |  5 +-
 gdb/symfile.c                                    |  2 +
 gdb/testsuite/gdb.server/bkpt-other-inferior.exp | 93 ++++++++++++++++++++++++
 4 files changed, 101 insertions(+), 5 deletions(-)
 create mode 100644 gdb/testsuite/gdb.server/bkpt-other-inferior.exp
  

Comments

Tankut Baris Aktemur Jan. 10, 2020, 11:33 a.m. UTC | #1
On Friday, January 10, 2020 3:03 AM, Pedro Alves wrote:
> 

> The changes around clear_symtab_users calls are necessary because

> otherwise we regress gdb.base/step-over-exit.exp, hitting the new

> assertion in switch_to_program_space_and_thread.  The problem is, a

> forked child terminates, and when GDB decides to auto-purge that

> inferior, GDB tries to switch to the pspace of that no-longer-existing

> inferior.

> 

> The root of the problem is within the program_space destructor:

> 

> program_space::~program_space ()

> {

> ...

>   set_current_program_space (this);        # (1)

> ...

>   breakpoint_program_space_exit (this);    # (2)

> ...

>   free_all_objfiles ();                    # (3)

> ...

> }

> 

> We get here from delete_inferior -> delete_program_space.

> 

> So we're deleting an inferior, and the inferior to be

> deleted is no longer in the inferior list.

> 

> At (2), we've deleted all the breakpoints and locations for the

> program space being deleted.

> 

> The crash happens while doing a breakpoint re-set, called by

> clear_symtab_users at the tail end of (3).  That is, while recreating

> breakpoints for the current program space, which is the program space

> we're tearing down.  During breakpoint re-set, we try to switch to the

> new location's pspace (the current pspace set in (1), so the pspace

> we're tearing down) with switch_to_program_space_and_thread, and that

> hits the failed assertion.  It's the fact that we recreate breakpoints

> in the program_space destructor that is the latent bug here.  Just

> don't do that, since we've already taken care of breakpoints with

> breakpoint_program_space_exit, and we don't end up in the crash situation.

> 

> My first approach to fix this added a symfile_add_flags parameter to

> program_space::free_all_objfiles, and then passed that down to

> clear_symtab_users.  The program_space dtor would then pass down

> SYMFILE_DEFER_BP_RESET to free_all_objfiles.  I couldn't help feeling

> that adding that parameter to free_all_objfiles looked a little

> awkward, so I settled on something a little different -- hoist the

> clear_symtab_users call to the callers.  There are only two callers.

> I felt that that didn't look as odd, particularly since

> remove_symbol_file_command also does:

> 

>   objf->unlink ();

>   clear_symtab_users (0);

> 

> I.e., objfile deletion is already separate from calling

> clear_symtab_users in some places.

> 

> I'm not hard set on this approach though; I can go with adding the

> symfile_add_flags parameter to program_space::free_all_objfiles if

> people prefer that.


I personally think that in particular the comment and the change below
have educational value.  So, I'd vote for the approach you sent.

> diff --git a/gdb/progspace.c b/gdb/progspace.c

> index 96f8acc64c..1361040347 100644

> --- a/gdb/progspace.c

> +++ b/gdb/progspace.c

> @@ -147,6 +147,9 @@ program_space::~program_space ()

>    no_shared_libraries (NULL, 0);

>    exec_close ();

>    free_all_objfiles ();

> +  /* Defer breakpoint re-set because we don't want to create new

> +     locations for this pspace which we're tearing down.  */

> +  clear_symtab_users (SYMFILE_DEFER_BP_RESET);

>    if (!gdbarch_has_shared_address_space (target_gdbarch ()))

>      free_address_space (this->aspace);

>    clear_section_table (&this->target_sections);


As far as I can tell, you rebased the series on the current master, correct?
Could you update the users/palves/multi-target-v2 branch?  The new patch does not
apply there cleanly.  I attempted to replicate it on users/palves/multi-target-v2,
but gdb.base/step-over-exit.exp still hits the assertion.

-Baris

Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Gary Kershaw
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
  
Pedro Alves Jan. 10, 2020, 12:17 p.m. UTC | #2
On 1/10/20 11:33 AM, Aktemur, Tankut Baris wrote:

> As far as I can tell, you rebased the series on the current master, correct?
> Could you update the users/palves/multi-target-v2 branch?  The new patch does not
> apply there cleanly.  I attempted to replicate it on users/palves/multi-target-v2,
> but gdb.base/step-over-exit.exp still hits the assertion.
Sorry, should have thought to do that.  Done now.

Thanks,
Pedro Alves
  
Tankut Baris Aktemur Jan. 10, 2020, 1:51 p.m. UTC | #3
On Friday, January 10, 2020 1:18 PM, Pedro Alves wrote:
> 

> > As far as I can tell, you rebased the series on the current master, correct?

> > Could you update the users/palves/multi-target-v2 branch?  The new patch does not

> > apply there cleanly.  I attempted to replicate it on users/palves/multi-target-v2,

> > but gdb.base/step-over-exit.exp still hits the assertion.

> Sorry, should have thought to do that.  Done now.


Got it, and ran the test without problems.  Thank you.

-Baris

Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Gary Kershaw
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
  
Tom Tromey Jan. 10, 2020, 2:34 p.m. UTC | #4
>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:

Pedro> The crash happens while doing a breakpoint re-set, called by
Pedro> clear_symtab_users at the tail end of (3).  That is, while recreating
Pedro> breakpoints for the current program space, which is the program space
Pedro> we're tearing down.  During breakpoint re-set, we try to switch to the
Pedro> new location's pspace (the current pspace set in (1), so the pspace
Pedro> we're tearing down) with switch_to_program_space_and_thread, and that
Pedro> hits the failed assertion.

Whatever happened to fine-grained breakpoint resetting?
Among other things maybe that would help with this sort of problem.

Pedro> I'm not hard set on this approach though; I can go with adding the
Pedro> symfile_add_flags parameter to program_space::free_all_objfiles if
Pedro> people prefer that.

This approach seems fine.
Longer term I would like to get rid of clear_symtab_users.  Maybe it
could be replaced with more precisely targeted observers instead.

thanks,
Tom
  
Pedro Alves Jan. 10, 2020, 8:03 p.m. UTC | #5
On 1/10/20 2:34 PM, Tom Tromey wrote:
>>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:
> 
> Pedro> The crash happens while doing a breakpoint re-set, called by
> Pedro> clear_symtab_users at the tail end of (3).  That is, while recreating
> Pedro> breakpoints for the current program space, which is the program space
> Pedro> we're tearing down.  During breakpoint re-set, we try to switch to the
> Pedro> new location's pspace (the current pspace set in (1), so the pspace
> Pedro> we're tearing down) with switch_to_program_space_and_thread, and that
> Pedro> hits the failed assertion.
> 
> Whatever happened to fine-grained breakpoint resetting?
> Among other things maybe that would help with this sort of problem.

Yeah...  Many years ago Keith gave it a try, and then he had to go do
something else.  And then few years later, I gave it another go.
I got to somewhere quite close to useful, but never managed to get back
to finishing it.  The branch doesn't look like is on my github, I'd
have to dig it out.  This was pre-C++...  Maybe starting to scratch
is better...

> 
> Pedro> I'm not hard set on this approach though; I can go with adding the
> Pedro> symfile_add_flags parameter to program_space::free_all_objfiles if
> Pedro> people prefer that.
> 
> This approach seems fine.
> Longer term I would like to get rid of clear_symtab_users.  Maybe it
> could be replaced with more precisely targeted observers instead.

Thanks,
Pedro Alves
  

Patch

diff --git a/gdb/progspace-and-thread.c b/gdb/progspace-and-thread.c
index a1824e8935..698661d43f 100644
--- a/gdb/progspace-and-thread.c
+++ b/gdb/progspace-and-thread.c
@@ -25,8 +25,9 @@  void
 switch_to_program_space_and_thread (program_space *pspace)
 {
   inferior *inf = find_inferior_for_program_space (pspace);
+  gdb_assert (inf != nullptr);
 
-  if (inf != NULL && inf->pid != 0)
+  if (inf->pid != 0)
     {
       thread_info *tp = any_live_thread_of_inferior (inf);
 
@@ -39,6 +40,5 @@  switch_to_program_space_and_thread (program_space *pspace)
 	}
     }
 
-  switch_to_no_thread ();
-  set_current_program_space (pspace);
+  switch_to_inferior_no_thread (inf);
 }
diff --git a/gdb/progspace.c b/gdb/progspace.c
index 96f8acc64c..1361040347 100644
--- a/gdb/progspace.c
+++ b/gdb/progspace.c
@@ -147,6 +147,9 @@  program_space::~program_space ()
   no_shared_libraries (NULL, 0);
   exec_close ();
   free_all_objfiles ();
+  /* Defer breakpoint re-set because we don't want to create new
+     locations for this pspace which we're tearing down.  */
+  clear_symtab_users (SYMFILE_DEFER_BP_RESET);
   if (!gdbarch_has_shared_address_space (target_gdbarch ()))
     free_address_space (this->aspace);
   clear_section_table (&this->target_sections);
@@ -168,8 +171,6 @@  program_space::free_all_objfiles ()
 
   while (!objfiles_list.empty ())
     objfiles_list.front ()->unlink ();
-
-  clear_symtab_users (0);
 }
 
 /* See progspace.h.  */
diff --git a/gdb/symfile.c b/gdb/symfile.c
index 86d6174107..d5f70d7f03 100644
--- a/gdb/symfile.c
+++ b/gdb/symfile.c
@@ -1253,6 +1253,8 @@  symbol_file_clear (int from_tty)
 
   current_program_space->free_all_objfiles ();
 
+  clear_symtab_users (0);
+
   gdb_assert (symfile_objfile == NULL);
   if (from_tty)
     printf_filtered (_("No symbol file now.\n"));
diff --git a/gdb/testsuite/gdb.server/bkpt-other-inferior.exp b/gdb/testsuite/gdb.server/bkpt-other-inferior.exp
new file mode 100644
index 0000000000..e2a7be5cdb
--- /dev/null
+++ b/gdb/testsuite/gdb.server/bkpt-other-inferior.exp
@@ -0,0 +1,93 @@ 
+# Copyright 2019 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test that GDB does not access the remote target's memory when
+# setting a breakpoint on a function that only exists in an inferior
+# that is not bound to the remote target.
+
+load_lib gdbserver-support.exp
+
+standard_testfile server.c
+
+if { [skip_gdbserver_tests] } {
+    return 0
+}
+
+if { [prepare_for_testing "failed to prepare" ${binfile} "${srcfile}" \
+	  {debug pthreads}] } {
+    return
+}
+
+# Make sure we're disconnected, in case we're testing with an
+# extended-remote board, therefore already connected.
+gdb_test "disconnect" ".*"
+
+# Leave inferior 1 with the exec target, not connected.  Add another
+# inferior, and connect it to gdbserver.
+
+gdb_test "add-inferior" "Added inferior 2" \
+    "add inferior 2"
+gdb_test "inferior 2" "Switching to inferior 2.*" \
+    "switch to inferior 2"
+gdb_test "file ${binfile}" ".*" "load file in inferior 2"
+
+set target_exec [gdbserver_download_current_prog]
+
+# Start GDBserver.
+set res [gdbserver_start "" $target_exec]
+
+# Connect to GDBserver.
+set gdbserver_protocol [lindex $res 0]
+set gdbserver_gdbport [lindex $res 1]
+gdb_target_cmd $gdbserver_protocol $gdbserver_gdbport
+
+# Discard any symbol files that we have opened.
+set test "discard symbol table"
+gdb_test_multiple "file" $test {
+    -re "A program is being debugged already..*Are you sure you want to change the file.*y or n. $" {
+	gdb_test "y" ".*" $test \
+	    {Discard symbol table from `.*'\? \(y or n\) } "y"
+    }
+}
+
+# At this point:
+#
+# - inferior 1 has symbols, and is not connected to any target.
+# - inferior 2 has no symbols, and is connected to gdbserver.
+
+# Setting a breakpoint at some function by name should set a
+# breakpoint on inferior 1, since it has symbols, and should not
+# result in any access to inferior 2's remote target.
+
+gdb_test_no_output "set debug remote 1"
+
+foreach inf_sel {1 2} {
+    with_test_prefix "inf $inf_sel" {
+	gdb_test "inferior $inf_sel" "Switching to inferior $inf_sel.*" \
+	    "switch to inferior"
+
+	set test "set breakpoint"
+	gdb_test_multiple "break main" $test {
+	    -re "Sending packet.*$gdb_prompt $" {
+		fail $test
+	    }
+	    -re "^break main\r\nBreakpoint .* at .*$gdb_prompt $" {
+		pass $test
+	    }
+	}
+
+	delete_breakpoints
+    }
+}