gdb: fix target_ops reference count for some cases

This commit started as an investigation into why the test
gdb.python/py-inferior.exp crashes when GDB exits, leaving a core file
behind.

The crash occurs in connpy_connection_dealloc, and is actually
triggered by this assert:

  gdb_assert (conn_obj->target == nullptr);

Now a little aside...

... the assert message is never actually printed; instead GDB crashes
due to a call to a pure virtual function.  The backtrace at the point
of the crash looks like this:

  #7  0x00007fef7e2cf747 in std::terminate() () from /lib64/libstdc++.so.6
  #8  0x00007fef7e2d0515 in __cxa_pure_virtual () from /lib64/libstdc++.so.6
  #9  0x0000000000de334d in target_stack::find_beneath (this=0x4934d78, t=0x2bda270 <the_dummy_target>) at ../../src/gdb/target.c:3606
  #10 0x0000000000df4380 in inferior::find_target_beneath (this=0x4934b50, t=0x2bda270 <the_dummy_target>) at ../../src/gdb/inferior.h:377
  #11 0x0000000000de2381 in target_ops::beneath (this=0x2bda270 <the_dummy_target>) at ../../src/gdb/target.c:3047
  #12 0x0000000000de68aa in target_ops::supports_terminal_ours (this=0x2bda270 <the_dummy_target>) at ../../src/gdb/target-delegates.c:1223
  #13 0x0000000000dde6b9 in target_supports_terminal_ours () at ../../src/gdb/target.c:1112
  #14 0x0000000000ee55f1 in internal_vproblem(internal_problem *, const char *, int, const char *, typedef __va_list_tag __va_list_tag *) (problem=0x2bdab00 <internal_error_problem>, file=0x198acf0 "../../src/gdb/python/py-connection.c", line=193, fmt=0x198ac9f "%s: Assertion `%s' failed.", ap=0x7ffdc26109d8) at ../../src/gdb/utils.c:379

Notice in frame #12 we called target_ops::supports_terminal_ours,
however, this is the_dummy_target, which is of type dummy_target, and
so we should have called dummy_target::supports_terminal_ours.  I
believe the reason we ended up in the wrong implementation of
supports_terminal_ours (which is a virtual function) is because we
made the call during GDB's shut-down, and, I suspect, the vtables were
in a weird state.
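
One plausible mechanism for this (an assumption; the text above only
suspects the vtables, and nothing here is GDB code) is ordinary C++
destruction semantics: once a derived class's destructor has run,
virtual calls made from the base class destructor resolve to the base
class implementation.  A minimal sketch, with base_target and
derived_target as hypothetical stand-ins for target_ops and
dummy_target:

```cpp
#include <string>

// Records which implementation the base destructor managed to reach.
static std::string seen_in_dtor;

// Hypothetical stand-in for target_ops.
struct base_target
{
  // By the time this runs the derived part is already destroyed, so
  // the virtual call below resolves to base_target::name.
  virtual ~base_target () { seen_in_dtor = name (); }
  virtual std::string name () const { return "base_target"; }
};

// Hypothetical stand-in for dummy_target.
struct derived_target : base_target
{
  std::string name () const override { return "derived_target"; }
};

std::string name_seen_during_destruction ()
{
  {
    derived_target t;
    const base_target &ref = t;
    // While the object is alive, dispatch reaches the override.
    if (ref.name () != "derived_target")
      return "unexpected";
  }
  return seen_in_dtor;
}
```

If the base class function were pure virtual instead, the same call
would end up in the runtime's pure-virtual handler, which would be
consistent with frame #8 of the backtrace.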

Anyway, the point of this patch is not to fix GDB's ability to print
an assert during exit, but to address the root cause of the assert.
With that aside out of the way, we can return to the main story...

Connections are represented in Python with gdb.TargetConnection
objects (or its sub-classes).  The assert in question confirms that
when a gdb.TargetConnection is deallocated, the underlying GDB
connection has itself been removed from GDB.  If this is not true then
we risk creating multiple different gdb.TargetConnection objects for
the same connection, which would be bad.

When a connection is removed in GDB the connection_removed observer
fires, which we catch with connpy_connection_removed.  This function
then sets conn_obj->target to nullptr.

The first issue here is that connpy_connection_dealloc is being called
as part of GDB's exit code, which is run after the Python interpreter
has been shut down.  The connpy_connection_dealloc function is used to
deallocate the gdb.TargetConnection Python object.  Surely it is
wrong for us to be deallocating Python objects after the interpreter
has been shut down.

The reason why connpy_connection_dealloc is called during GDB's exit
is that the global all_connection_objects map is holding a reference
to the gdb.TargetConnection object.  When the map is destroyed during
GDB's exit, the gdb.TargetConnection objects within the map can
finally be deallocated.

Another job of connpy_connection_removed (the function we mentioned
earlier) is to remove connections from the all_connection_objects map
when the connection is removed from GDB.
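
The relationship between the observer, the map, and the wrapper
object can be sketched with a toy model.  This is not the code in
py-connection.c; the types connection, connection_object, and
registry are hypothetical, and std::shared_ptr stands in for Python
reference counting:

```cpp
#include <functional>
#include <map>
#include <memory>
#include <vector>

// Hypothetical stand-in for a GDB connection.
struct connection { int id; };

// Hypothetical stand-in for the Python wrapper object.
struct connection_object { connection *target; };

struct registry
{
  // Plays the role of all_connection_objects: one wrapper per connection.
  std::map<connection *, std::shared_ptr<connection_object>> all_objects;
  std::vector<std::function<void (connection *)>> removed_observers;

  registry ()
  {
    // Plays the role of connpy_connection_removed.
    removed_observers.push_back ([this] (connection *c)
      {
        auto it = all_objects.find (c);
        if (it == all_objects.end ())
          return;
        it->second->target = nullptr;   // Wrapper is now invalid.
        all_objects.erase (it);         // Map drops its reference.
      });
  }

  registry (const registry &) = delete;
  registry &operator= (const registry &) = delete;

  // Return the single wrapper for C, creating it on first use.
  std::shared_ptr<connection_object> wrap (connection *c)
  {
    auto &slot = all_objects[c];
    if (slot == nullptr)
      slot = std::make_shared<connection_object> (connection_object {c});
    return slot;
  }

  void notify_removed (connection *c)
  {
    for (auto &observer : removed_observers)
      observer (c);
  }
};
```

In this model, a wrapper that is still in the map when the registry
is destroyed has a non-null target, which is the situation the assert
is checking for.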

And so, the reason why all_connection_objects has contents when GDB
exits, and the reason the assert fires, is that, when GDB exits, there
are still some connections that have not yet been removed from GDB,
that is, they have a non-zero reference count.

If we take a look at quit_force (top.c) we can see that, for each
inferior, we call pop_all_targets before we (later in the function)
call do_final_cleanups.  It is the do_final_cleanups call that is
responsible for shutting down the Python interpreter.

So, in theory, we should have popped all targets by the time GDB
exits.  This should have reduced their reference counts to zero, which
in turn should have triggered the connection_removed observer, and
resulted in the connection being removed from all_connection_objects,
and the gdb.TargetConnection object being deallocated.

That this is not happening indicates that earlier, somewhere else in
GDB, we are leaking references to GDB's connections.

I tracked the problem down to the 'remove-inferiors' command,
implemented with the remove_inferior_command function (in inferior.c).
This function calls delete_inferior for each inferior the user
specifies.

In delete_inferior we do some housekeeping, and then delete the
inferior object, which calls inferior::~inferior.

In neither delete_inferior nor inferior::~inferior do we call
pop_all_targets, and it is this missing call that means we leak some
references to the target_ops objects on the inferior's target_stack.
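
The leak can be illustrated with a small model.  The target and
inferior types below are hypothetical simplifications, not GDB's
classes; the only point is that a destructor which does not pop the
stack leaves the pushed target's reference count non-zero:

```cpp
#include <vector>

// Hypothetical stand-in for a reference-counted target_ops.
struct target { int refcount = 0; };

// Hypothetical stand-in for an inferior with a target stack.
struct inferior
{
  std::vector<target *> target_stack;

  // Pushing takes a reference on the target.
  void push_target (target *t)
  {
    ++t->refcount;
    target_stack.push_back (t);
  }

  // Popping releases every reference the stack holds.
  void pop_all_targets ()
  {
    for (target *t : target_stack)
      --t->refcount;
    target_stack.clear ();
  }

  // Deliberately no destructor: like the situation described above,
  // destroying the inferior does not release the references.
};

// Return the reference count left on the target after the inferior
// that used it has been deleted.
int refcount_after_delete (bool pop_first)
{
  target t;
  inferior *inf = new inferior;
  inf->push_target (&t);
  if (pop_first)
    inf->pop_all_targets ();
  delete inf;
  return t.refcount;
}
```

In this model refcount_after_delete (false) leaves one reference
behind, while refcount_after_delete (true) returns the count to zero.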

To fix this we need to add a pop_all_targets call either in
delete_inferior or in inferior::~inferior.  Currently, I think that we
should place the call in delete_inferior.

Before calling pop_all_targets the inferior for which we are popping
needs to be made current, along with the program_space associated with
the inferior.

At the moment the inferior's program_space is deleted in
delete_inferior before we call inferior::~inferior, so, I think, to
place the pop_all_targets call into inferior::~inferior would require
additional adjustment to GDB.  As delete_inferior already exists, and
includes various housekeeping tasks, it doesn't seem unreasonable to
place the pop_all_targets call there.

Now when I run py-inferior.exp, by the time GDB exits, the reference
counts are correct.  The final pop_all_targets calls in quit_force
reduce the reference counts to zero, which means the connections are
removed before the Python interpreter is shut down.  When GDB actually
exits the all_connection_objects map is empty, and no further Python
objects are deallocated at that point.  The test now exits cleanly
without creating a core file.

I've made some additional, related, changes in this commit.

In inferior::~inferior I've added a new assert that ensures, by the
time the inferior is destructed, the inferior's target stack is
empty (with the exception of the dummy_target).  If this is not true
then we will be losing a reference to a target_ops object.
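
As a rough sketch of what such a check expresses (a toy model with
hypothetical types, not the assert added to inferior.c), the
destructor below insists that only the dummy target is left on the
stack:

```cpp
#include <cassert>
#include <vector>

enum stratum { dummy_stratum, file_stratum, process_stratum };

// Hypothetical stand-in for a reference-counted target_ops.
struct target
{
  explicit target (stratum s) : level (s) {}
  stratum level;
  int refcount = 0;
};

// Hypothetical stand-in for an inferior; the dummy target always
// sits at the bottom of the stack.
struct inferior
{
  std::vector<target *> stack;

  explicit inferior (target *dummy) { push (dummy); }

  void push (target *t)
  {
    ++t->refcount;
    stack.push_back (t);
  }

  // Release everything above the dummy target.
  void pop_all_targets ()
  {
    while (stack.size () > 1)
      {
        --stack.back ()->refcount;
        stack.pop_back ();
      }
  }

  bool only_dummy_remains () const
  {
    return stack.size () == 1 && stack[0]->level == dummy_stratum;
  }

  // Anything other than the dummy target still on the stack here
  // would be a lost reference.
  ~inferior () { assert (only_dummy_remains ()); }
};
```

Note that in this model the dummy target's own reference is never
released when the inferior is destroyed.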

It is worth noting that we are losing references to the dummy_target
object.  However, I've not tried to fix that problem in this patch, as
I don't think it is as important.  The dummy target is a global
singleton, there's no observer for when the dummy target is deleted,
so no other parts of GDB care when the object is deleted.  As a global
it is always just deleted as part of the exit code, and we never
really care what its reference count is.  So, though it is a little
annoying that its reference count is wrong, it doesn't really matter.
Maybe I'll come back in a later patch and try to clean that up... but
that's for another day.

When I tested the changes above I ran into a failure from 'maint
selftest infrun_thread_ptid_changed'.

The problem is with scoped_mock_context.  This object creates a new
inferior (called mock_inferior), with a thread, and some other
associated state, and then selects this new inferior.  We also push a
process_stratum_target sub-class onto the new inferior's target stack.

In ~scoped_mock_context we call:

  pop_all_targets_at_and_above (process_stratum);

this will remove all target_ops objects from the mock_inferior's
target stack, but leaves anything at the dummy_stratum and the
file_stratum (which I find a little weird, but more on this later).

The problem though is that pop_all_targets_at_and_above, just like
pop_all_targets, removes things from the target stack of the current
inferior.  In ~scoped_mock_context we don't ensure that the
mock_inferior associated with the current scoped_mock_context is
actually selected.

In most tests we create a single scoped_mock_context, which
automatically selects its contained mock_inferior.  However, in the
test infrun_thread_ptid_changed, we create multiple
scoped_mock_context objects, and then change which inferior is
currently selected.

As a result, in one case, we end up in ~scoped_mock_context with the
wrong inferior selected.  The pop_all_targets_at_and_above call then
removes the target_ops from the wrong inferior's target stack.  This
leaves the target_ops on the scoped_mock_context::mock_inferior's
target stack, and, when the mock_inferior is destructed, we lose
some references.  This triggers the assert I placed in
inferior::~inferior.

To fix this I added a switch_to_inferior_no_thread call within the
~scoped_mock_context function.
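
The failure mode and the fix can be modelled in a few lines.  This is
a sketch under simplifying assumptions, not the code in
scoped-mock-context.h: mock_context, current_inferior, and
leaked_refs are hypothetical names, and the real class also restores
the previously selected inferior, which is not modelled here:

```cpp
#include <vector>

// Hypothetical stand-in for a reference-counted target_ops.
struct target { int refcount = 0; };

// Hypothetical stand-in for an inferior with a target stack.
struct inferior { std::vector<target *> stack; };

// Like GDB, the push and pop helpers act on the current inferior.
static inferior *current_inferior = nullptr;
static int leaked_refs = 0;

void push_target (target *t)
{
  ++t->refcount;
  current_inferior->stack.push_back (t);
}

void pop_all_targets ()
{
  for (target *t : current_inferior->stack)
    --t->refcount;
  current_inferior->stack.clear ();
}

struct mock_context
{
  inferior mock_inferior;
  target mock_target;
  bool switch_in_dtor;

  explicit mock_context (bool fix) : switch_in_dtor (fix)
  {
    current_inferior = &mock_inferior;   // Select the new inferior.
    push_target (&mock_target);
  }

  ~mock_context ()
  {
    // The fix: make sure our own inferior is current before popping.
    if (switch_in_dtor)
      current_inferior = &mock_inferior;
    pop_all_targets ();
    leaked_refs += mock_target.refcount;
  }
};

// Create two contexts, select the first one again, then destroy both.
int leaked_references (bool fix)
{
  leaked_refs = 0;
  {
    mock_context a (fix);
    mock_context b (fix);                  // b's inferior is current.
    current_inferior = &a.mock_inferior;   // The test selects a.
  }                                        // b is destroyed first.
  current_inferior = nullptr;
  return leaked_refs;
}
```

Without the switch, b's destructor pops a's stack and b's own target
keeps its reference; with the switch, both targets are released.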

As I mentioned above, it seems weird that we call
pop_all_targets_at_and_above instead of pop_all_targets, so I've
changed that.  I didn't see any test regressions after this, so I'm
assuming this is fine.
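
To make the difference between the two calls concrete, here is a toy
model in which a target stack is just a list of strata.  The helper
names pop_at_and_above and pop_all are hypothetical, and the model
assumes that popping everything means removing all targets above the
dummy target:

```cpp
#include <algorithm>
#include <utility>
#include <vector>

enum stratum { dummy_stratum, file_stratum, process_stratum };

// Remove every entry whose stratum is LIMIT or higher.
std::vector<stratum>
pop_at_and_above (std::vector<stratum> stack, stratum limit)
{
  stack.erase (std::remove_if (stack.begin (), stack.end (),
                               [limit] (stratum s) { return s >= limit; }),
               stack.end ());
  return stack;
}

// Remove everything above the dummy target.
std::vector<stratum>
pop_all (std::vector<stratum> stack)
{
  return pop_at_and_above (std::move (stack), file_stratum);
}
```

Starting from a stack holding dummy_stratum, file_stratum, and
process_stratum, pop_at_and_above with process_stratum leaves two
entries, while pop_all leaves only the dummy entry.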

Finally, I've made use of the Python connection_removed event listener
API to add a test for this issue.  In addition, the py-inferior.exp
test now runs without crashing (and creating a core file) on exit.
---
 gdb/inferior.c                                | 30 ++++++
 gdb/scoped-mock-context.h                     |  5 +-
 .../gdb.python/py-connection-removed.exp      | 92 +++++++++++++++++++
 3 files changed, 126 insertions(+), 1 deletion(-)
 create mode 100644 gdb/testsuite/gdb.python/py-connection-removed.exp

Message ID	20220921131200.3983844-1-aburgess@redhat.com
State	Superseded
Delegated to:	Simon Marchi
From: Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org>
Reply-To: Andrew Burgess <aburgess@redhat.com>
To: gdb-patches@sourceware.org
Subject: [PATCH] gdb: fix target_ops reference count for some cases
Date: Wed, 21 Sep 2022 14:12:00 +0100