Message ID | 20140316135916.GA31463@host2.jankratochvil.net |
---|---|
State | Superseded |
Headers |
Return-Path: <x14314964@homiemail-mx20.g.dreamhost.com> X-Original-To: siddhesh@wilcox.dreamhost.com Delivered-To: siddhesh@wilcox.dreamhost.com Received: from homiemail-mx20.g.dreamhost.com (caibbdcaaahc.dreamhost.com [208.113.200.72]) by wilcox.dreamhost.com (Postfix) with ESMTP id B47FB3601A7 for <siddhesh@wilcox.dreamhost.com>; Sun, 16 Mar 2014 06:59:25 -0700 (PDT) Received: by homiemail-mx20.g.dreamhost.com (Postfix, from userid 14314964) id 648E44057103F; Sun, 16 Mar 2014 06:59:25 -0700 (PDT) X-Original-To: gdb@patchwork.siddhesh.in Delivered-To: x14314964@homiemail-mx20.g.dreamhost.com Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by homiemail-mx20.g.dreamhost.com (Postfix) with ESMTPS id 2549B40E2F95E for <gdb@patchwork.siddhesh.in>; Sun, 16 Mar 2014 06:59:25 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:date:from:to:subject:message-id:mime-version :content-type; q=dns; s=default; b=Sw84qTMDA2No72eAv+fUe6F/T1GAH pBBLKTNHrS8d1Pqg9kZpG7rDj4zo1Rqf5SNOArBEFsUyc87v5+NAbw4kyQqnv0Gh b8VdcpHjiJBs2GPKQA2K59GgNsm9K8poxHtLs8AmUkQY6lM/w8tk3Z2o9p3oqGUW HCnVA4qNJT/bCg= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:date:from:to:subject:message-id:mime-version :content-type; s=default; bh=NHRFzqRORJDrrnhsxyGMdnRxK1s=; b=sz6 ZPgdXmacZFF+tAY2IrqzCNyj7mkBpDFWN6vFQL5heguxcr+21mMsSdxgQ71n/9+d HLqTohoSfaDeE6yT6EDICj2TceS/M8yaVqk/bXP/k6B2KZhKtiJDMC7/W1MGPDgw TnDPxKDPbweDdCtJmZFYetdMNOiOZ1yRc1P9lg3w= Received: (qmail 5440 invoked by alias); 16 Mar 2014 13:59:23 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <gdb-patches.sourceware.org> List-Unsubscribe: <mailto:gdb-patches-unsubscribe-gdb=patchwork.siddhesh.in@sourceware.org> List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/gdb-patches/> List-Post: <mailto:gdb-patches@sourceware.org> List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 5428 invoked by uid 89); 16 Mar 2014 13:59:22 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-4.1 required=5.0 tests=AWL, BAYES_00, SPF_HELO_PASS, SPF_PASS, T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Sun, 16 Mar 2014 13:59:20 +0000 Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s2GDxJBg020309 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for <gdb-patches@sourceware.org>; Sun, 16 Mar 2014 09:59:19 -0400 Received: from host2.jankratochvil.net (ovpn-116-86.ams2.redhat.com [10.36.116.86]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id s2GDxGKh029043 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO) for <gdb-patches@sourceware.org>; Sun, 16 Mar 2014 09:59:18 -0400 Date: Sun, 16 Mar 2014 14:59:16 +0100 From: Jan Kratochvil <jan.kratochvil@redhat.com> To: gdb-patches@sourceware.org Subject: [patch] linux_nat_kill() compat. with linux-2.4.x Message-ID: <20140316135916.GA31463@host2.jankratochvil.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) X-IsSubscribed: yes X-DH-Original-To: gdb@patchwork.siddhesh.in |
Commit Message
Jan Kratochvil
March 16, 2014, 1:59 p.m. UTC
Hi, it had been already approved by Tom that time so I will check it in in some time. [patch] Fix SIGTERM signal safety (PR gdb/15358) https://sourceware.org/ml/gdb-patches/2013-07/msg00094.html Message-ID: <20130702200010.GA23478@host2.jankratochvil.net> Doug has just asked to split it out of: Re: [patchv2] Fix SIGTERM signal safety (PR gdb/15358) [refresh] https://sourceware.org/ml/gdb-patches/2014-03/msg00342.html Message-ID: <21284.44419.745786.47756@ruffy.mtv.corp.google.com> The testcase of the patch above still PASSes for me on Fedora 20 x86_64 even without this patch, I guess this patch was needed only on that 2.4.x Linux kernel it mentions. I am no longer interested in 2.4.x so if anyone has any concerns I am also fine with dropping this patch. Thanks, Jan gdb/ 2014-03-16 Jan Kratochvil <jan.kratochvil@redhat.com> PR gdb/15358 * linux-nat.c (linux_nat_kill): Use kill_callback first. Extend the comment for stop_callback.
Comments
Hi Jan, On 03/16/2014 01:59 PM, Jan Kratochvil wrote: > gdb/ > 2014-03-16 Jan Kratochvil <jan.kratochvil@redhat.com> > > PR gdb/15358 > * linux-nat.c (linux_nat_kill): Use kill_callback first. > Extend the comment for stop_callback. > > diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c > index b615423..ec84188 100644 > --- a/gdb/linux-nat.c > +++ b/gdb/linux-nat.c > @@ -3777,8 +3777,15 @@ linux_nat_kill (struct target_ops *ops) > { > ptid_t ptid = pid_to_ptid (ptid_get_pid (inferior_ptid)); > > + /* Kill all LWP's before trying to stop them. In rare cases the > + lwp_info state may not match the inferior and > + stop_wait_callback could lock up. */ Hmm, I find this comment confusing and not really enlightening. What sort of rare cases? It that PR15713? Best just fix that. I've sent a patch: https://sourceware.org/ml/gdb-patches/2014-05/msg00473.html
On Wed, 21 May 2014 14:49:43 +0200, Pedro Alves wrote: > On 03/16/2014 01:59 PM, Jan Kratochvil wrote: > > diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c > > index b615423..ec84188 100644 > > --- a/gdb/linux-nat.c > > +++ b/gdb/linux-nat.c > > @@ -3777,8 +3777,15 @@ linux_nat_kill (struct target_ops *ops) > > { > > ptid_t ptid = pid_to_ptid (ptid_get_pid (inferior_ptid)); > > > > + /* Kill all LWP's before trying to stop them. In rare cases the > > + lwp_info state may not match the inferior and > > + stop_wait_callback could lock up. */ > > Hmm, I find this comment confusing and not really enlightening. > What sort of rare cases? It that PR15713? Best just fix that. > I've sent a patch: > https://sourceware.org/ml/gdb-patches/2014-05/msg00473.html The reproducible case is that PR15713 and it is sure great you have fixed it. Fine with dropping the patch although I still do not find it obvious the patch is no longer relevant. FSF GDB now relies on fact that ptraced inferior state always matches lp->stopped and there is a matching signal to wait for etc. In some cases GDB hangs during quit (and inferiors cleanup) and one has to kill GDB itself. Reasons are not known to me as I do not know how to reproduce it. (It may be also possible all such reasons have been fixed now.) It also may hang somewhere else and not in linux_nat_kill(). This patch made GDB foolproof against any state of inferior when killing the inferior so that GDB could no longer hang. But it would hide some possible remaining bugs in the code (which may be causing the GDB hangs). Jan
On 06/06/2014 09:15 PM, Jan Kratochvil wrote: > On Wed, 21 May 2014 14:49:43 +0200, Pedro Alves wrote: >> On 03/16/2014 01:59 PM, Jan Kratochvil wrote: >>> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c >>> index b615423..ec84188 100644 >>> --- a/gdb/linux-nat.c >>> +++ b/gdb/linux-nat.c >>> @@ -3777,8 +3777,15 @@ linux_nat_kill (struct target_ops *ops) >>> { >>> ptid_t ptid = pid_to_ptid (ptid_get_pid (inferior_ptid)); >>> >>> + /* Kill all LWP's before trying to stop them. In rare cases the >>> + lwp_info state may not match the inferior and >>> + stop_wait_callback could lock up. */ >> >> Hmm, I find this comment confusing and not really enlightening. >> What sort of rare cases? It that PR15713? Best just fix that. >> I've sent a patch: >> https://sourceware.org/ml/gdb-patches/2014-05/msg00473.html > > The reproducible case is that PR15713 and it is sure great you have fixed it. Great. > Fine with dropping the patch although I still do not find it obvious the patch > is no longer relevant. Thanks, I'd rather drop it. > FSF GDB now relies on fact that ptraced inferior state always matches > lp->stopped and there is a matching signal to wait for etc. In some cases GDB > hangs during quit (and inferiors cleanup) and one has to kill GDB itself. > Reasons are not known to me as I do not know how to reproduce it. > (It may be also possible all such reasons have been fixed now.) I think any case that this bullet proofing patch could work around have now been fixed. > It also may hang somewhere else and not in linux_nat_kill(). Right. I'd say it's much more likely (even though not very likely, tbc) that we see hangs elsewhere than in linux_nat_kill now. E.g., GDB core's is_executing state getting out of sync with lwp->stopped. > This patch made GDB foolproof against any state of inferior when killing the > inferior so that GDB could no longer hang. But it would hide some possible > remaining bugs in the code (which may be causing the GDB hangs). Right, I'd rather just fix the GDB bugs, and only consider such a patch if we need to work around a kernel bug (though I'd rather just ignore very old kernels like 2.4 by now).
On Mon, 09 Jun 2014 12:29:28 +0200, Pedro Alves wrote: > On 06/06/2014 09:15 PM, Jan Kratochvil wrote: > > It also may hang somewhere else and not in linux_nat_kill(). > > Right. I'd say it's much more likely (even though not very likely, tbc) > that we see hangs elsewhere than in linux_nat_kill now. E.g., GDB core's > is_executing state getting out of sync with lwp->stopped. > > > This patch made GDB foolproof against any state of inferior when killing the > > inferior so that GDB could no longer hang. But it would hide some possible > > remaining bugs in the code (which may be causing the GDB hangs). > > Right, I'd rather just fix the GDB bugs, and only consider such a patch > if we need to work around a kernel bug (though I'd rather just > ignore very old kernels like 2.4 by now). OK, so according to the approach chosen above considering this patch dropped. Jan
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c index b615423..ec84188 100644 --- a/gdb/linux-nat.c +++ b/gdb/linux-nat.c @@ -3777,8 +3777,15 @@ linux_nat_kill (struct target_ops *ops) { ptid_t ptid = pid_to_ptid (ptid_get_pid (inferior_ptid)); + /* Kill all LWP's before trying to stop them. In rare cases the + lwp_info state may not match the inferior and + stop_wait_callback could lock up. */ + iterate_over_lwps (ptid, kill_callback, NULL); + /* Stop all threads before killing them, since ptrace requires - that the thread is stopped to sucessfully PTRACE_KILL. */ + that the thread is stopped to sucessfully PTRACE_KILL. + kill_callback normally already turned the inferior into a zombie + except for old Linux kernels 2.4.x. */ iterate_over_lwps (ptid, stop_callback, NULL); /* ... and wait until all of them have reported back that they're no longer running. */