Message ID | 15b31a68-82f4-b9e9-7b9a-d9a82f8903d9@codesourcery.com |
---|---|
State | New, archived |
Headers | From: Sandra Loosemore <sandra@codesourcery.com>; To: "gdb-patches@sourceware.org" <gdb-patches@sourceware.org>; Subject: [PATCH, RFC] fix gdbserver channel leaks; Date: Wed, 16 Jan 2019 14:18:08 -0700; List-Id: <gdb-patches.sourceware.org>; List-Archive: <http://sourceware.org/ml/gdb-patches/> |
Commit Message
Sandra Loosemore
Jan. 16, 2019, 9:18 p.m. UTC
I've been trying to remove some ancient cruft in our local remote-target gdbserver test harness that has caused problems with some newer GDB tests. This has exposed what I think is a problem in GDB's own gdbserver-support.exp: having opened a channel to the gdbserver spawn in gdbserver_start, in many cases it is failing to close it cleanly, causing it to leak so many open channels that eventually TCL runs out and starts giving ERRORs. Specifically, gdbserver_start is discarding any already-open server_spawn_id without closing it when opening a new channel, and the logic in gdbserver_gdb_exit that tries to clean up the gdbserver connection is failing to actually close the channel before discarding server_spawn_id.

I've thought it best to consolidate the logic to manage server_spawn_id in start_gdbserver and close_gdbserver. It did seem plausible to put the logic to shut down any old server_spawn_id channel before opening a new one in gdbserver_run instead of in start_gdbserver, but there are places that bypass that function (e.g., in the mi test support).

The alternative to fixing this in GDB's own test support seems to be doing it at the DejaGnu board level instead, which is what I'm trying to get rid of in our local test harness. There's no documentation about the expectations and it seems pretty hacky and a violation of abstractions to do it in the low-level board support.

I'm wondering, can other people who do remote-target gdbserver testing help try out this patch and see if it works for them? I've gotten good results with my nios2-linux-gnu test setup but I don't know what kind of board support other people might be using. Also, assuming the patch is OK, I don't know if it's inappropriate or too late for the 8.3 release.

-Sandra
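To see the failure mode described above in isolation, here is a minimal sketch in plain Tcl. It is not the testsuite code: `fake_spawn` and `leaky_start` are hypothetical names, and an ordinary file channel on `/dev/null` (assuming a Unix-like host) stands in for the Expect spawn that `remote_spawn` would return. It only illustrates that overwriting the saved handle without closing it leaves one more channel open on every start.

```tcl
# Hypothetical stand-in for remote_spawn: open a channel and return its
# handle.  The real testsuite spawns gdbserver through Expect instead.
proc fake_spawn {} {
    return [open /dev/null r]
}

# Mimics the leaky pattern: the previous handle, if any, is overwritten
# without being closed.
proc leaky_start {} {
    global server_spawn_id
    set server_spawn_id [fake_spawn]
}

# Count open channels before and after five consecutive starts.
set before [llength [file channels]]
for {set i 0} {$i < 5} {incr i} {
    leaky_start
}
set leaked [expr {[llength [file channels]] - $before}]
puts "channels still open after 5 starts: $leaked"
```

Only the last handle remains reachable through `server_spawn_id`; the other four stay open with nothing left to close them, which is the kind of accumulation that eventually exhausts the interpreter's channels over a long test run.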
Comments
>>>>> "Sandra" == Sandra Loosemore <sandra@codesourcery.com> writes:
Sandra> I've thought it best to consolidate the logic to manage
Sandra> server_spawn_id in start_gdbserver and close_gdbserver. It did seem
Sandra> plausible to put the logic to shut down any old server_spawn_id
Sandra> channel before opening a new one in gdbserver_run instead of in
Sandra> start_gdbserver, but there are places that bypass that function (e.g.,
Sandra> in the mi test support).
FWIW this patch looks reasonable to me.
Sandra> I'm wondering, can other people who do remote-target gdbserver testing
Sandra> help try out this patch and see if it works for them?
I can't really help with that, though, I'm afraid.
Tom
On 01/16/2019 09:18 PM, Sandra Loosemore wrote:
> I've been trying to remove some ancient cruft in our local remote-target gdbserver test harness that has caused problems with some newer GDB tests. This has exposed what I think is a problem in GDB's own gdbserver-support.exp: having opened a channel to the gdbserver spawn in gdbserver_start, in many cases it is failing to close it cleanly, causing it to leak so many open channels that eventually TCL runs out and starts giving ERRORs. Specifically, gdbserver_start is discarding any already-open server_spawn_id without closing it when opening a new channel, and the logic in gdbserver_gdb_exit that tries to clean up the gdbserver connection is failing to actually close the channel before discarding server_spawn_id.

When I read this, a question comes to mind:

How does it happen that we're starting a new gdbserver without closing the original one? It sounds like this may be papering over a bug elsewhere? Shouldn't that be an error instead? Why are we "failing to close it cleanly"?

I mean, default_gdb_spawn doesn't start a new gdb if there's already one either.

Thanks,
Pedro Alves

> I've thought it best to consolidate the logic to manage server_spawn_id in start_gdbserver and close_gdbserver. It did seem plausible to put the logic to shut down any old server_spawn_id channel before opening a new one in gdbserver_run instead of in start_gdbserver, but there are places that bypass that function (e.g., in the mi test support).
>
> The alternative to fixing this in GDB's own test support seems to be doing it at the DejaGnu board level instead, which is what I'm trying to get rid of in our local test harness. There's no documentation about the expectations and it seems pretty hacky and a violation of abstractions to do it in the low-level board support.
>
> I'm wondering, can other people who do remote-target gdbserver testing help try out this patch and see if it works for them? I've gotten good results with my nios2-linux-gnu test setup but I don't know what kind of board support other people might be using. Also, assuming the patch is OK, I don't know if it's inappropriate or too late for the 8.3 release.
On 1/17/19 9:12 AM, Pedro Alves wrote:
> On 01/16/2019 09:18 PM, Sandra Loosemore wrote:
>> I've been trying to remove some ancient cruft in our local remote-target gdbserver test harness that has caused problems with some newer GDB tests. This has exposed what I think is a problem in GDB's own gdbserver-support.exp: having opened a channel to the gdbserver spawn in gdbserver_start, in many cases it is failing to close it cleanly, causing it to leak so many open channels that eventually TCL runs out and starts giving ERRORs. Specifically, gdbserver_start is discarding any already-open server_spawn_id without closing it when opening a new channel, and the logic in gdbserver_gdb_exit that tries to clean up the gdbserver connection is failing to actually close the channel before discarding server_spawn_id.
>
> When I read this, a question comes to mind:
>
> How does it happen that we're starting a new gdbserver
> without closing the original one? It sounds like this may
> be papering over a bug elsewhere? Shouldn't that be
> an error instead? Why are we "failing to close it cleanly"?
>
> I mean, default_gdb_spawn doesn't start a new gdb if
> there's already one either.

Well, I initially thought this should be an error too. But here is one example I tracked down where the error was triggering. (There might be others as well.)

gdbserver_gdb_load (in config/gdbserver.exp) calls gdbserver_spawn. We're using our own hacked-up version but it was derived from this source; presumably this behavior is part of the interface description of what this function is supposed to do.

mi_gdb_target_load calls gdbserver_gdb_load. It issues the "kill" command to kill any already-running program first, but doesn't do anything to close the open gdbserver channel.

mi_run_cmd calls mi_gdb_target_load indirectly via mi_run_cmd_full.

mi_runto calls mi_run_cmd indirectly via mi_runto_helper.

There are a lot of individual tests that call mi_runto without doing anything to shut down gdbserver first.

-Sandra
Ping. Any additional thoughts on this?

On 1/17/19 12:32 PM, Sandra Loosemore wrote:
> On 1/17/19 9:12 AM, Pedro Alves wrote:
>> On 01/16/2019 09:18 PM, Sandra Loosemore wrote:
>>> I've been trying to remove some ancient cruft in our local
>>> remote-target gdbserver test harness that has caused problems with
>>> some newer GDB tests. This has exposed what I think is a problem in
>>> GDB's own gdbserver-support.exp: having opened a channel to the
>>> gdbserver spawn in gdbserver_start, in many cases it is failing to
>>> close it cleanly, causing it to leak so many open channels that
>>> eventually TCL runs out and starts giving ERRORs. Specifically,
>>> gdbserver_start is discarding any already-open server_spawn_id
>>> without closing it when opening a new channel, and the logic in
>>> gdbserver_gdb_exit that tries to clean up the gdbserver connection is
>>> failing to actually close the channel before discarding server_spawn_id.
>>
>> When I read this, a question comes to mind:
>>
>> How does it happen that we're starting a new gdbserver
>> without closing the original one? It sounds like this may
>> be papering over a bug elsewhere? Shouldn't that be
>> an error instead? Why are we "failing to close it cleanly"?
>>
>> I mean, default_gdb_spawn doesn't start a new gdb if
>> there's already one either.
>
> Well, I initially thought this should be an error too. But here is one
> example I tracked down where the error was triggering. (There might be
> others as well.)
>
> gdbserver_gdb_load (in config/gdbserver.exp) calls gdbserver_spawn.
> We're using our own hacked-up version but it was derived from this
> source; presumably this behavior is part of the interface description of
> what this function is supposed to do.
>
> mi_gdb_target_load calls gdbserver_gdb_load. It issues the "kill"
> command to kill any already-running program first, but doesn't do
> anything to close the open gdbserver channel.
>
> mi_run_cmd calls mi_gdb_target_load indirectly via mi_run_cmd_full.
>
> mi_runto calls mi_run_cmd indirectly via mi_runto_helper.
>
> There are a lot of individual tests that call mi_runto without doing
> anything to shut down gdbserver first.
>
> -Sandra
diff --git a/gdb/testsuite/lib/gdbserver-support.exp b/gdb/testsuite/lib/gdbserver-support.exp
index 05234c4..42ccbfa 100644
--- a/gdb/testsuite/lib/gdbserver-support.exp
+++ b/gdb/testsuite/lib/gdbserver-support.exp
@@ -303,7 +303,12 @@ proc gdbserver_start { options arguments } {
             append gdbserver_command " $arguments"
         }

+        # Cleanly shut down any existing gdbserver spawn before
+        # starting a new one.
         global server_spawn_id
+        if { [info exists server_spawn_id] } {
+            close_gdbserver
+        }
         set server_spawn_id [remote_spawn target $gdbserver_command]

         # GDBserver doesn't do inferior I/O through GDB. But we can
@@ -422,8 +427,7 @@ proc gdbserver_gdb_exit { is_mi } {
                 exp_continue
             }
             -i "$server_spawn_id" eof {
-                wait -i $expect_out(spawn_id)
-                unset server_spawn_id
+                # close_gdbserver below will clean up the spawn state.
             }
         }
     }
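The close-before-start guard in the first hunk can be tried outside the testsuite with a rough sketch in plain Tcl. Everything here is an assumption made for illustration: `fake_spawn`, `fake_close`, and `careful_start` are hypothetical names, a file channel on `/dev/null` (Unix-like host assumed) stands in for the Expect spawn, and the body of `fake_close` is a guess at the general shape of a cleanup routine, since the diff does not show what `close_gdbserver` itself does.

```tcl
# Hypothetical stand-in for remote_spawn.
proc fake_spawn {} {
    return [open /dev/null r]
}

# Hypothetical stand-in for close_gdbserver (its real body is not part
# of the diff).  Closes the recorded channel, if any, then forgets the
# handle, so calling it twice in a row is harmless.
proc fake_close {} {
    global server_spawn_id
    if {[info exists server_spawn_id]} {
        catch {close $server_spawn_id}
        unset server_spawn_id
    }
}

# Same guard as the patched gdbserver_start: shut down any existing
# spawn before recording a new one.
proc careful_start {} {
    global server_spawn_id
    if {[info exists server_spawn_id]} {
        fake_close
    }
    set server_spawn_id [fake_spawn]
}

set before [llength [file channels]]
for {set i 0} {$i < 5} {incr i} {
    careful_start
}
set open_now [expr {[llength [file channels]] - $before}]
puts "channels open after 5 starts: $open_now"

fake_close
set open_end [expr {[llength [file channels]] - $before}]
puts "channels open after final close: $open_end"
```

With the guard in place, at most one channel is open at any time no matter how often a caller restarts without shutting down first, which is why the check can live in the start routine rather than in every caller.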