Message ID | 1453480183-5131-1-git-send-email-yao.qi@linaro.org |
---|---|
State | New, archived |
Headers |
Received: (qmail 121185 invoked by alias); 22 Jan 2016 16:29:54 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <gdb-patches.sourceware.org> List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/gdb-patches/> List-Post: <mailto:gdb-patches@sourceware.org> List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 121170 invoked by uid 89); 22 Jan 2016 16:29:53 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.6 required=5.0 tests=AWL, BAYES_00, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 spammy=sk:interac, Hx-languages-length:1335, sk:interru X-HELO: mail-pf0-f178.google.com Received: from mail-pf0-f178.google.com (HELO mail-pf0-f178.google.com) (209.85.192.178) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Fri, 22 Jan 2016 16:29:51 +0000 Received: by mail-pf0-f178.google.com with SMTP id e65so44774229pfe.0 for <gdb-patches@sourceware.org>; Fri, 22 Jan 2016 08:29:51 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:subject:date:message-id; bh=Dt5hbrWL4GYW6ctvQcV0yrkfaLgIZIPoBO7Pb4NhDqs=; b=DZOc4u7mQWRZuGb3g6JKwtZY3ujcSAr8QZsWMDeo7vrt/ZnULmaatIbyZZOU3N3MhW qGqNq7uypnzyuLfa7cYVQJnroSC7Lmr2yrG7j6MJFsIbWUWHm3nyi9ypzuLWG+8CjEI8 7q3pvtx89ExTgUTfaB1FyfvgbDqSYUQnAFZVPYhZhK199w/cS2JnhQbNfDVixhyuM45j D54DNgWzGACKOfFG5BCeDaVqw3dBznpbZ+m+3yUyNMce6S6eZkRP15abBgt7ZPOATIcY HaH4XNlX8MKt7bXVSRrqeChBLmvs627IV2ZEZQ9Vstklf2a4mt9/GYf3UHm8WZafZRtX D2CA== X-Gm-Message-State: AG10YOSZm+m2VI3XE76CI5jtrZcxzPum6Hwkq4dFet9z6kaae7MVEI2A2fhVfkr1DoqH2Q== X-Received: by 10.98.68.211 with SMTP id m80mr5718089pfi.117.1453480189504; Fri, 22 Jan 2016 08:29:49 -0800 (PST) Received: from E107787-LIN.cambridge.arm.com (gcc1-power7.osuosl.org. [140.211.15.137]) by smtp.gmail.com with ESMTPSA id c86sm10770324pfd.75.2016.01.22.08.29.47 for <gdb-patches@sourceware.org> (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Fri, 22 Jan 2016 08:29:48 -0800 (PST) From: Yao Qi <qiyaoltc@gmail.com> X-Google-Original-From: Yao Qi <yao.qi@linaro.org> To: gdb-patches@sourceware.org Subject: [PATCH] Fix fail in gdb.base/interrupt-noterm.exp Date: Fri, 22 Jan 2016 16:29:43 +0000 Message-Id: <1453480183-5131-1-git-send-email-yao.qi@linaro.org> X-IsSubscribed: yes |
Commit Message
Yao Qi
Jan. 22, 2016, 4:29 p.m. UTC
Hi, In my testing, I see the following fail intermittently, interrupt (gdb) PASS: gdb.base/interrupt-noterm.exp: interrupt [Inferior 1 (process 13407) exited normally] Child exited with status 0 FAIL: gdb.base/interrupt-noterm.exp: inferior received SIGINT (timeout) because the interrupt packet may be sent to GDBserver before the SIGIO handler is installed. The fix in this patch is to let GDB wait for 500 ms between "continue &" and "interrupt" to make sure SIGIO handler is installed already in GDBserver side. gdb/testsuite: 2016-01-22 Yao Qi <yao.qi@linaro.org> * gdb.base/interrupt-noterm.exp: Add "after 500". --- gdb/testsuite/gdb.base/interrupt-noterm.exp | 3 +++ 1 file changed, 3 insertions(+)
Comments
On 01/22/2016 04:29 PM, Yao Qi wrote: > Hi, > In my testing, I see the following fail intermittently, > > interrupt > (gdb) PASS: gdb.base/interrupt-noterm.exp: interrupt > [Inferior 1 (process 13407) exited normally] > > Child exited with status 0 > FAIL: gdb.base/interrupt-noterm.exp: inferior received SIGINT (timeout) > > because the interrupt packet may be sent to GDBserver before the SIGIO > handler is installed. The fix in this patch is to let GDB wait > for 500 ms between "continue &" and "interrupt" to make sure > SIGIO handler is installed already in GDBserver side. Can you expand the rationale some more? E.g., why is this not a gdbserver bug? Instintively I'd say it is. Thanks, Pedro Alves > > gdb/testsuite: > > 2016-01-22 Yao Qi <yao.qi@linaro.org> > > * gdb.base/interrupt-noterm.exp: Add "after 500". > --- > gdb/testsuite/gdb.base/interrupt-noterm.exp | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/gdb/testsuite/gdb.base/interrupt-noterm.exp b/gdb/testsuite/gdb.base/interrupt-noterm.exp > index 05f6076..9b5bb17 100644 > --- a/gdb/testsuite/gdb.base/interrupt-noterm.exp > +++ b/gdb/testsuite/gdb.base/interrupt-noterm.exp > @@ -55,6 +55,9 @@ if { $async_supported < 0 } { > return 1 > } > > +# Wait a while so that GDBserver's SIGIO handler is in place. > +after 500 > + > # With native debugging, and no terminal (emulated by interactive-mode > # off, above), GDB had a bug where "interrupt" would send SIGINT to > # its own process group, instead of the inferior's. >
Pedro Alves <palves@redhat.com> writes: > Can you expand the rationale some more? > > E.g., why is this not a gdbserver bug? Instintively I'd say it is. The interaction between GDB and GDBserver is like this, 1. GDB sends vCont;c and doesn't wait for the stop reply because "continue &" is background command, 2. GDBserver receives vCont;c, enables the async i/o (by enable_async_io) and resumes the inferior. 3. GDB sends interrupt packet, #1 happens before #2 and #3, but the order of #2 and #3 is not determined. If #2 happens before #3, it is fine, otherwise, the GDBserver doesn't know the interrupt from GDB.
On 01/22/2016 05:14 PM, Yao Qi wrote: > Pedro Alves <palves@redhat.com> writes: > >> Can you expand the rationale some more? >> >> E.g., why is this not a gdbserver bug? Instintively I'd say it is. > > The interaction between GDB and GDBserver is like this, > > 1. GDB sends vCont;c and doesn't wait for the stop reply because > "continue &" is background command, > 2. GDBserver receives vCont;c, enables the async i/o (by > enable_async_io) and resumes the inferior. > 3. GDB sends interrupt packet, > > #1 happens before #2 and #3, but the order of #2 and #3 is not > determined. If #2 happens before #3, it is fine, otherwise, the > GDBserver doesn't know the interrupt from GDB. If 1. is followed by 3., then the \\003 is always read by gdb after the vCont;c. We call enable_async_io before reaching mywait. Since we're in all-stop, that means we'll block inside mywait -> waitpid, all the while \\003 is already available to read in the socket. Since we're blocked in waitpid, we won't see the \\003 until after the next time the program happens to stop. Agree? It still seems to me like a gdbserver bug. I think that after calling enable_async_io, we need to check whether input is already pending from GDB, and if so, process it immediately -- we know the only input coming from GDB at this point is a \\003. IOW, I think we need to call input_interrupt after calling enable_async_io. input_interrupt already uses select before reading, so it handles the case of there being no input available without blocking. However, we need to be careful, because a SIGIO can race with calling input_interrupt from mainline code... Thanks, Pedro Alves
On 01/22/2016 05:35 PM, Pedro Alves wrote: > On 01/22/2016 05:14 PM, Yao Qi wrote: >> Pedro Alves <palves@redhat.com> writes: >> >>> Can you expand the rationale some more? >>> >>> E.g., why is this not a gdbserver bug? Instintively I'd say it is. >> >> The interaction between GDB and GDBserver is like this, >> >> 1. GDB sends vCont;c and doesn't wait for the stop reply because >> "continue &" is background command, >> 2. GDBserver receives vCont;c, enables the async i/o (by >> enable_async_io) and resumes the inferior. >> 3. GDB sends interrupt packet, >> >> #1 happens before #2 and #3, but the order of #2 and #3 is not >> determined. If #2 happens before #3, it is fine, otherwise, the >> GDBserver doesn't know the interrupt from GDB. > > If 1. is followed by 3., then the \\003 is always read by gdb > after the vCont;c. We call enable_async_io before reaching > mywait. Since we're in all-stop, that means we'll block > inside mywait -> waitpid, all the while \\003 is already available to > read in the socket. Since we're blocked in waitpid, we won't see > the \\003 until after the next time the program happens to stop. > > Agree? > > It still seems to me like a gdbserver bug. > > I think that after calling enable_async_io, we need to check whether > input is already pending from GDB, and if so, process it immediately -- we > know the only input coming from GDB at this point is a \\003. IOW, I think > we need to call input_interrupt after calling enable_async_io. input_interrupt > already uses select before reading, so it handles the case of there > being no input available without blocking. > > However, we need to be careful, because a SIGIO can race with calling > input_interrupt from mainline code... Might be simpler to always have the SIGIO handler installed (install it early), and change enable_async_io/disable_async_io to use sigprocmask to block/unblock the signal. That way, if input comes before the signal is unblocked, the handler is called immediately when enable_async_io is called. Thanks, Pedro Alves
Pedro Alves <palves@redhat.com> writes: > If 1. is followed by 3., then the \\003 is always read by gdb s/ready by/sent by/ ? > after the vCont;c. We call enable_async_io before reaching > mywait. Since we're in all-stop, that means we'll block Although we call enable_async_io earlier, so the window between received vCont;c and calling enable_async_io is tiny, it is still possible that \\003 arrives at that period. > inside mywait -> waitpid, all the while \\003 is already available to > read in the socket. Since we're blocked in waitpid, we won't see > the \\003 until after the next time the program happens to stop. > > Agree? Yes, I agree. > > It still seems to me like a gdbserver bug. > > I think that after calling enable_async_io, we need to check whether > input is already pending from GDB, and if so, process it immediately -- we > know the only input coming from GDB at this point is a \\003. IOW, I think > we need to call input_interrupt after calling enable_async_io. input_interrupt > already uses select before reading, so it handles the case of there > being no input available without blocking. > > However, we need to be careful, because a SIGIO can race with calling > input_interrupt from mainline code... What you mean here is that we can call input_interrupt after calling enable_async_io, but meanwhile, \\0003 arrives, and input_interrupt is invoked as a SIGIO handler, so there is a race. Is it correct? I agree your next email about the approach of block/unblock SIGIO is better. I'll give a fix that way.
On 01/25/2016 09:30 AM, Yao Qi wrote: > Pedro Alves <palves@redhat.com> writes: > >> If 1. is followed by 3., then the \\003 is always read by gdb > > s/ready by/sent by/ ? I meant, s/by gdb/by gdbserver/. >> >> It still seems to me like a gdbserver bug. >> >> I think that after calling enable_async_io, we need to check whether >> input is already pending from GDB, and if so, process it immediately -- we >> know the only input coming from GDB at this point is a \\003. IOW, I think >> we need to call input_interrupt after calling enable_async_io. input_interrupt >> already uses select before reading, so it handles the case of there >> being no input available without blocking. >> >> However, we need to be careful, because a SIGIO can race with calling >> input_interrupt from mainline code... > > What you mean here is that we can call input_interrupt after calling > enable_async_io, but meanwhile, \\0003 arrives, and input_interrupt is > invoked as a SIGIO handler, so there is a race. Is it correct? That's correct. > > I agree your next email about the approach of block/unblock SIGIO is > better. I'll give a fix that way. > Great, thanks!
Hi, In my test, I see the following fail intermittently, interrupt (gdb) PASS: gdb.base/interrupt-noterm.exp: interrupt [Inferior 1 (process 13407) exited normally] Child exited with status 0 FAIL: gdb.base/interrupt-noterm.exp: inferior received SIGINT (timeout) because interrupt character isn't processed by GDBserver in time due to two different problems: 1. GDBserver may receive vCont;c and '\003' together, and read them into the buffer. GDBserver will wait for the inferior but there won't be SIGIO because GDBserver has already read '\003' into the buffer. 2. GDBserver may receive vCont;c only. GDBserver will resume the inferior and wait for it. If '\003' arrives at the moment between GDBserver received vCont;c and install SIGIO handler, GDBserver can't receive SIGIO. these two problems above cause GDB wait for the inferior and leave the interrupt there, so the test is timeout. Two patches in the series fix these two problems separately. Patch 1 fixes the problem 1 by checking the buffer *after* reading in each packet. Patch 2 fixes the problem 2 by block/unblock SIGIO rather than install/uninstall signal handler. These two patches are tested on {x86_64,arm,aarch64}-linux. No regression. *** BLURB HERE *** Yao Qi (2): [GDBserver] Check input interrupt after reading in a packet [GDBserver] Block and unblock SIGIO gdb/gdbserver/remote-utils.c | 60 ++++++++++++++++++++++++++++---------------- 1 file changed, 39 insertions(+), 21 deletions(-)
diff --git a/gdb/testsuite/gdb.base/interrupt-noterm.exp b/gdb/testsuite/gdb.base/interrupt-noterm.exp index 05f6076..9b5bb17 100644 --- a/gdb/testsuite/gdb.base/interrupt-noterm.exp +++ b/gdb/testsuite/gdb.base/interrupt-noterm.exp @@ -55,6 +55,9 @@ if { $async_supported < 0 } { return 1 } +# Wait a while so that GDBserver's SIGIO handler is in place. +after 500 + # With native debugging, and no terminal (emulated by interactive-mode # off, above), GDB had a bug where "interrupt" would send SIGINT to # its own process group, instead of the inferior's.