Message ID | 86ziy2xdt7.fsf@gmail.com |
---|---|
State | New, archived |
Headers | From: Yao Qi <qiyaoltc@gmail.com> To: gdb-patches@sourceware.org Subject: exceptions.KeyboardInterrupt is thrown in gdb.base/random-signal.exp Date: Wed, 25 Nov 2015 16:07:48 +0000 Message-ID: <86ziy2xdt7.fsf@gmail.com> |
Commit Message
Yao Qi
Nov. 25, 2015, 4:07 p.m. UTC
Hi,
I am fixing a failure in gdb.base/random-signal.exp like this,

    Continuing.^M
    PASS: gdb.base/random-signal.exp: continue
    ^CPython Exception <type 'exceptions.KeyboardInterrupt'> <type 'exceptions.KeyboardInterrupt'>: ^M
    FAIL: gdb.base/random-signal.exp: stop with control-c (timeout)

I only see this failure about once in every 15~20 runs. Is it because GDB
received SIGINT before async_handle_remote_sigint is installed, so that
handle_sigint is still the SIGINT handler, set_quit_flag calls into the
Python code, and KeyboardInterrupt is raised as a result?

In the test, we are already aware that the signal handler may not be
ready, so "Continuing" is consumed and ctrl-c is delayed by 500 ms.

    # For this to work we must be sure to consume the "Continuing."
    # message first, or GDB's signal handler may not be in place.
    after 500 {send_gdb "\003"}

After reading the Tcl manual about "after", I feel the usage of "after"
above isn't 100% correct. As the manual says, the "after" command
returns immediately, and the Tcl command {send_gdb "\003"} will be
executed 500 ms later. That is asynchronous, but what we want is a
synchronous operation, like this,

    after 500
    send_gdb "\003"

With this change, I don't see the timeout failure again. Is it a fix or a
hack?
Comments
On 11/25/2015 04:07 PM, Yao Qi wrote:
> Hi,
> I am fixing a fail in gdb.base/random-signal.exp like this,
>
> Continuing.^M
> PASS: gdb.base/random-signal.exp: continue
> ^CPython Exception <type 'exceptions.KeyboardInterrupt'> <type 'exceptions.KeyboardInterrupt'>: ^M
> FAIL: gdb.base/random-signal.exp: stop with control-c (timeout)
>
> I only see this fail out of 15~20 runs each time. Is it because GDB
> received SIGINT before async_handle_remote_sigint is installed? so
> handle_sigint is still the SIGINT handler, set_quit_flag will call
> python stuff, and KeyboardInterrupt is raised as a result.
>
> In the test, we've already been aware of that the signal handler isn't
> ready, so "Continuing" is consumed and ctrl-c is delayed in 500ms.
>
> # For this to work we must be sure to consume the "Continuing."
> # message first, or GDB's signal handler may not be in place.
> after 500 {send_gdb "\003"}
>
> After I read the tcl manual about "after", I feel the usage of "after"
> above isn't 100% correct. As the manual says, the "after" command
> returns immediately, and the tcl command {send_gdb "\003"} will be
> executed 500 ms later. It is an asynchronous flavour, but what we want is
> a synchronous operation, like this,
>
> after 500
> send_gdb "\003"
>
> with this change, I don't see the timeout fail again. Is it a fix or a
> hack?
>

Seems like a hack -- I don't see how that can make a difference. In both
cases, we send \003 after 500ms.

The test sets a software watchpoint, and resumes the target. That means
the program will be constantly single-stepping, and gdb will be evaluating
the watched expression at each single-step. I'd suspect that the problem
is likely that while the program is stopped to evaluate the watched
expression, something is calling target_terminal_ours, which restores
handle_sigint as SIGINT handler. Then somehow you're unlucky enough to
ctrl-c at that exact time. The fix in that case is likely to be to call
target_terminal_ours_for_output instead, which doesn't touch the SIGINT
handler.

Thanks,
Pedro Alves
Pedro Alves <palves@redhat.com> writes:

> Seems like a hack -- I don't see how that can make a difference. In both
> cases, we send \003 after 500ms.

The only difference is that {send_gdb "\003"} will be executed later, in
an event handler, when Tcl is already inside gdb_test.

>
> The test sets a software watchpoint, and resumes the target. That means
> the program will be constantly single-stepping, and gdb will be evaluating
> the watched expression at each single-step. I'd suspect that the problem
> is likely that while the program is stopped to evaluate the watched
> expression, something is calling target_terminal_ours, which restores
> handle_sigint as SIGINT handler. Then somehow you're unlucky to manage to
> ctrl-c at that exact time. The fix in that case is likely to be to call
> target_terminal_ours_for_output instead, which doesn't touch the SIGINT
> handler.

That was one of my suspicions... I set a breakpoint on
target_terminal_ours, but it isn't hit. Anyway, I'll look into it further.
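The difference between the two forms of "after" discussed above can be sketched in Python. This is only an analogy, not Tcl or DejaGnu code: it assumes that the run() call of the standard sched module can stand in for the Tcl event loop that is entered inside gdb_test, and the function names and log strings are hypothetical.

```python
import sched
import time


def scheduled_form(delay):
    """Analogue of: after 500 {send_gdb "\\003"}

    The callback is only queued; it runs later, from inside the event loop.
    """
    log = []
    loop = sched.scheduler(time.monotonic, time.sleep)
    loop.enter(delay, 1, lambda: log.append("ctrl-c sent"))
    log.append("after returned")      # reached immediately, nothing sent yet
    log.append("gdb_test entered")    # stand-in for entering the expect loop
    loop.run()                        # the queued callback fires in here
    return log


def blocking_form(delay):
    """Analogue of: after 500 followed by send_gdb "\\003"

    The caller sleeps first, so ctrl-c is sent before gdb_test starts.
    """
    log = []
    time.sleep(delay)
    log.append("after returned")
    log.append("ctrl-c sent")
    log.append("gdb_test entered")
    return log


if __name__ == "__main__":
    print(scheduled_form(0.05))
    print(blocking_form(0.05))
```

In both variants the ctrl-c goes out after roughly the same delay; what changes is whether it is sent before or after the test has started waiting for output.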
Pedro Alves <palves@redhat.com> writes:

> The test sets a software watchpoint, and resumes the target. That means
> the program will be constantly single-stepping, and gdb will be evaluating
> the watched expression at each single-step. I'd suspect that the problem
> is likely that while the program is stopped to evaluate the watched
> expression, something is calling target_terminal_ours, which restores
> handle_sigint as SIGINT handler. Then somehow you're unlucky to manage to
> ctrl-c at that exact time. The fix in that case is likely to be to call
> target_terminal_ours_for_output instead, which doesn't touch the SIGINT
> handler.

The cause of this problem is that SIGINT is handled by Python.

As you said, the program is being single-stepped constantly, and at each
stop the Python unwinder sniffer is used,

#0 pyuw_sniffer (self=<optimised out>, this_frame=<optimised out>, cache_ptr=0xd554f8) at /home/yao/SourceCode/gnu/gdb/git/gdb/python/py-unwind.c:608
#1 0x00000000006a10ae in frame_unwind_try_unwinder (this_frame=this_frame@entry=0xd554e0, this_cache=this_cache@entry=0xd554f8, unwinder=0xecd540)
    at /home/yao/SourceCode/gnu/gdb/git/gdb/frame-unwind.c:107
#2 0x00000000006a143f in frame_unwind_find_by_frame (this_frame=this_frame@entry=0xd554e0, this_cache=this_cache@entry=0xd554f8)
    at /home/yao/SourceCode/gnu/gdb/git/gdb/frame-unwind.c:163
#3 0x000000000069dc6b in compute_frame_id (fi=0xd554e0) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:454
#4 get_prev_frame_if_no_cycle (this_frame=this_frame@entry=0xd55410) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1781
#5 0x000000000069fdb9 in get_prev_frame_always_1 (this_frame=0xd55410) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1955
#6 get_prev_frame_always (this_frame=this_frame@entry=0xd55410) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1971
#7 0x00000000006a04b1 in get_prev_frame (this_frame=this_frame@entry=0xd55410) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:2213

and the extension language is set, so GDB changes to cooperative SIGINT
handling by the extension language. See extension.c:set_active_ext_lang.
If ctrl-c is pressed at that time, it seems reasonable to me that the
Python exception KeyboardInterrupt is thrown. In the previous discussion
https://www.sourceware.org/ml/gdb-patches/2014-01/msg00106.html

Tom Tromey wrote:
> The basic idea is that if some extension code is running, then C-c ought
> to interrupt that code in the way expected by programmers writing code
> in that language.

As a GDB developer, I can understand that when Python is running, ctrl-c
should trigger KeyboardInterrupt, and when the inferior is running,
ctrl-c should send the interrupt sequence to the remote target. If we
accept this, we can fix the test case, since the KeyboardInterrupt
output is also correct. However, as a GDB user, the expected behavior of
ctrl-c is quite confusing, because it depends on the moment at which
ctrl-c is pressed. I don't have a good idea for fixing that, because
there is no clear boundary between GDB code and Python code from either
the developers' or the users' point of view.
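The timing dependence described above can be sketched as a toy model in Python. This is not GDB's implementation: the class, its fields, and the log strings are hypothetical, and the only thing taken from the thread is the idea that an "active extension language" flag decides who handles SIGINT at the moment it arrives.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDebugger:
    """Hypothetical stand-in for the behaviour discussed in the thread."""

    ext_lang_active: bool = False          # models "the extension language is set"
    log: list = field(default_factory=list)

    def on_sigint(self):
        # While extension code runs, ctrl-c is handled cooperatively by it.
        if self.ext_lang_active:
            self.log.append("KeyboardInterrupt raised in extension code")
        else:
            self.log.append("interrupt forwarded to target")

    def single_step_stop(self, ctrl_c_during_unwind):
        # At each single-step stop the Python unwinder sniffer runs briefly.
        self.ext_lang_active = True
        if ctrl_c_during_unwind:
            self.on_sigint()
        self.ext_lang_active = False       # back in GDB proper


if __name__ == "__main__":
    unlucky = ToyDebugger()
    unlucky.single_step_stop(ctrl_c_during_unwind=True)
    print(unlucky.log)

    usual = ToyDebugger()
    usual.single_step_stop(ctrl_c_during_unwind=False)
    usual.on_sigint()                      # ctrl-c arrives between stops
    print(usual.log)
```

The same keypress produces two different outcomes depending only on whether it lands inside the short window in which the sniffer is running, which would be consistent with the failure showing up only occasionally.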
On 12/01/2015 05:14 PM, Yao Qi wrote:
> Pedro Alves <palves@redhat.com> writes:
>
>> The test sets a software watchpoint, and resumes the target. That means
>> the program will be constantly single-stepping, and gdb will be evaluating
>> the watched expression at each single-step. I'd suspect that the problem
>> is likely that while the program is stopped to evaluate the watched
>> expression, something is calling target_terminal_ours, which restores
>> handle_sigint as SIGINT handler. Then somehow you're unlucky to manage to
>> ctrl-c at that exact time. The fix in that case is likely to be to call
>> target_terminal_ours_for_output instead, which doesn't touch the SIGINT
>> handler.
>
> The cause of this problem is the SIGINT is handled by python.
>
> As you said, program is being single-stepped constantly, and in each
> stop, python unwinder sniffer is used,
>
> #0 pyuw_sniffer (self=<optimised out>, this_frame=<optimised out>, cache_ptr=0xd554f8) at /home/yao/SourceCode/gnu/gdb/git/gdb/python/py-unwind.c:608
> #1 0x00000000006a10ae in frame_unwind_try_unwinder (this_frame=this_frame@entry=0xd554e0, this_cache=this_cache@entry=0xd554f8, unwinder=0xecd540)
>     at /home/yao/SourceCode/gnu/gdb/git/gdb/frame-unwind.c:107
> #2 0x00000000006a143f in frame_unwind_find_by_frame (this_frame=this_frame@entry=0xd554e0, this_cache=this_cache@entry=0xd554f8)
>     at /home/yao/SourceCode/gnu/gdb/git/gdb/frame-unwind.c:163
> #3 0x000000000069dc6b in compute_frame_id (fi=0xd554e0) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:454
> #4 get_prev_frame_if_no_cycle (this_frame=this_frame@entry=0xd55410) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1781
> #5 0x000000000069fdb9 in get_prev_frame_always_1 (this_frame=0xd55410) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1955
> #6 get_prev_frame_always (this_frame=this_frame@entry=0xd55410) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1971
> #7 0x00000000006a04b1 in get_prev_frame (this_frame=this_frame@entry=0xd55410) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:2213
>
> and the extension language is set, so GDB changes to cooperative SIGINT
> handling by the extension language. See extension.c:set_active_ext_lang.
> If ctrl-c is pressed at that time, it is reasonable to me that python
> exception KeyboardInterrupt is thrown out. In the previous discussion
> https://www.sourceware.org/ml/gdb-patches/2014-01/msg00106.html
>
> Tom Tromey wrote:
>> The basic idea is that if some extension code is running, then C-c ought
>> to interrupt that code in the way expected by programmers writing code
>> in that language.
>
> As a GDB developer, I can understand that when python is running, Ctrl-c
> should trigger KeyboardInterrupt, and when the inferior is running,
> Ctrl-c should send interrupt sequence to the remote target. If we
> accept this fact, we can fix test case, since output of
> KeyboardInterrupt is also correct. However, as a GDB user, it is quite
> confusing on the expected behavior for ctrl-c, which depends on the time
> ctrl-c is pressed, but I don't have a good idea to fix it, because there
> is no such clear boundary of GDB code and python code from both
> developers' and users' point of view.

IMO, if the inferior is running and target_terminal_inferior is not in
effect (*) then the ctrl-c should _not_ trigger a Python KeyboardInterrupt,
but instead be sent to the target -- if the target is running and we're
running some not-supposed-to-be-interactive Python unwinder code while
processing some internal stop event, we know that the Python code will
finish quickly and the target should stop for the SIGINT very soon. IOW,
we treat the ctrl-c at exactly the wrong time as if it had been pressed a
little sooner or later, outside Python.

(*) - and it shouldn't, while an internal event is being processed.

To handle the case of something going wrong and gdb getting stuck in a loop
for so long that the target's SIGINT takes forever to be processed, we could
make gdb react to a _second_ (impatient) ctrl-c as "okay, I'm sick of
waiting, please stop whatever you're doing". This is like how remote.c
handles ctrl-c at exactly the wrong moment (while an internal event is
processed) nowadays:

https://sourceware.org/ml/gdb-patches/2015-08/msg00574.html

Thanks,
Pedro Alves
Pedro Alves <palves@redhat.com> writes:

> IMO, if the inferior is running and target_terminal_inferior is not in
> effect (*) then the ctrl-c should _not_ trigger a Python KeyboardInterrupt,
> but instead be sent to the target -- if the target is running and we're
> running some not-supposed-to-be-interactive Python unwinder code while
> processing some internal stop event, we know that the Python code will
> finish quickly and the target should stop for the SIGINT very soon. IOW,
> we treat the ctrl-c at exactly the wrong time as if it had been pressed a
> little sooner or later, outside Python.
>
> (*) - and it shouldn't, while an internal event is being processed.

If I understand you correctly, ctrl-c shouldn't trigger a Python
KeyboardInterrupt, and we should fix it somewhere in GDB.

>
> To handle the case of something going wrong and gdb getting stuck in a loop too
> long that the target's SIGINT takes forever to be processed, we could make gdb
> react to a _second_ (impatient) ctrl-c as "okay, I'm sick of waiting,
> please stop whatever you're doing". This is like how remote.c handles
> ctrl-c at exactly the wrong moment (while an internal event is processed)
> nowadays:
>
> https://sourceware.org/ml/gdb-patches/2015-08/msg00574.html

I don't see how this problem is related to a second ctrl-c. A single
ctrl-c is expected to interrupt the target, so why do we need a second
ctrl-c in this case?
On 12/03/2015 05:09 PM, Yao Qi wrote:
> Pedro Alves <palves@redhat.com> writes:
>
>> IMO, if the inferior is running and target_terminal_inferior is not in
>> effect (*) then the ctrl-c should _not_ trigger a Python KeyboardInterrupt,
>> but instead be sent to the target -- if the target is running and we're
>> running some not-supposed-to-be-interactive Python unwinder code while
>> processing some internal stop event, we know that the Python code will
>> finish quickly and the target should stop for the SIGINT very soon. IOW,
>> we treat the ctrl-c at exactly the wrong time as if it had been pressed a
>> little sooner or later, outside Python.
>>
>> (*) - and it shouldn't, while an internal event is being processed.
>
> If I understand you correctly, ctrl-c shouldn't trigger a Python
> KeyboardInterrupt, and we should fix it somewhere in GDB.

Yes.

>
>>
>> To handle the case of something going wrong and gdb getting stuck in a loop too
>> long that the target's SIGINT takes forever to be processed, we could make gdb
>> react to a _second_ (impatient) ctrl-c as "okay, I'm sick of waiting,
>> please stop whatever you're doing". This is like how remote.c handles
>> ctrl-c at exactly the wrong moment (while an internal event is processed)
>> nowadays:
>>
>> https://sourceware.org/ml/gdb-patches/2015-08/msg00574.html
>
> I don't know how is this problem related to the second ctrl-c. It is
> expected that single ctrl-c should interrupt the target, why do we need
> the second ctrl-c in this case?
>

Say the Python code has a bug and ends up stuck in a loop, so it doesn't
quickly finish what it's supposed to do (the target is never re-resumed and
the SIGINT stop is never processed). The user types ctrl-c, and nothing
happens. After a while, the user gets sick of waiting, and presses ctrl-c
again. At this point, we could ask the user what to do, raise
KeyboardInterrupt, Quit, etc.

Thanks,
Pedro Alves
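The two-stage ctrl-c handling proposed above might look roughly like the following Python sketch. It is a hypothetical model under stated assumptions, not code from GDB or remote.c: the class and method names and the returned strings are invented for illustration, and "ask user" stands for whichever of the options mentioned (prompting, raising KeyboardInterrupt, Quit) would be chosen.

```python
class ToyInterruptPolicy:
    """Hypothetical policy: forward the first ctrl-c, escalate on the second."""

    def __init__(self):
        # True once a ctrl-c has been forwarded and the stop is still pending.
        self.forwarded = False

    def ctrl_c(self, in_internal_event):
        # A second ctrl-c while gdb is still busy with the internal event
        # means the user is tired of waiting for the target to stop.
        if in_internal_event and self.forwarded:
            return "ask user"
        # Otherwise treat it as if pressed slightly sooner or later,
        # outside the Python code, and pass it on to the target.
        self.forwarded = True
        return "forward to target"

    def target_stopped(self):
        # The SIGINT stop was processed; start over for the next ctrl-c.
        self.forwarded = False


if __name__ == "__main__":
    policy = ToyInterruptPolicy()
    print(policy.ctrl_c(in_internal_event=True))   # first press
    print(policy.ctrl_c(in_internal_event=True))   # impatient second press
```

In the normal case the target stops soon after the first press and target_stopped() resets the state, so the escalation path is only reached when something really is stuck.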
diff --git a/gdb/testsuite/gdb.base/random-signal.exp b/gdb/testsuite/gdb.base/random-signal.exp
index 566668a..59c8f5b 100644
--- a/gdb/testsuite/gdb.base/random-signal.exp
+++ b/gdb/testsuite/gdb.base/random-signal.exp
@@ -38,5 +38,6 @@ gdb_test_multiple "continue" "continue" {
 # For this to work we must be sure to consume the "Continuing."
 # message first, or GDB's signal handler may not be in place.
-after 500 {send_gdb "\003"}
+after 500
+send_gdb "\003"