Message ID | 54F4C5A4.40503@redhat.com |
---|---|
State | New, archived |
Headers |
Received: (qmail 49995 invoked by alias); 2 Mar 2015 20:18:51 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <gdb-patches.sourceware.org> List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/gdb-patches/> List-Post: <mailto:gdb-patches@sourceware.org> List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 49982 invoked by uid 89); 2 Mar 2015 20:18:51 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL, BAYES_00, SPF_HELO_PASS, SPF_PASS, T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Mon, 02 Mar 2015 20:18:49 +0000 Received: from int-mx13.intmail.prod.int.phx2.redhat.com (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id t22KIlcQ011202 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Mon, 2 Mar 2015 15:18:47 -0500 Received: from [127.0.0.1] (ovpn01.gateway.prod.ext.ams2.redhat.com [10.39.146.11]) by int-mx13.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t22KIidk012589; Mon, 2 Mar 2015 15:18:45 -0500 Message-ID: <54F4C5A4.40503@redhat.com> Date: Mon, 02 Mar 2015 20:18:44 +0000 From: Pedro Alves <palves@redhat.com> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0 MIME-Version: 1.0 To: Don Breazeal <donb@codesourcery.com>, gdb-patches@sourceware.org Subject: [PATCH] Tighten gdb.base/disp-step-syscall.exp (was: Re: [PATCH v5 0/6] Remote fork events) References: 
<54C566F2.2020302@codesourcery.com> <1424997977-13316-1-git-send-email-donb@codesourcery.com> In-Reply-To: <1424997977-13316-1-git-send-email-donb@codesourcery.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit |
Commit Message
Pedro Alves
March 2, 2015, 8:18 p.m. UTC
On 02/27/2015 12:46 AM, Don Breazeal wrote:

> - There are a couple of tests that show new failures that actually
>   fail in the current mainline.  Details of these are as follows:
>
>   * when vfork events are enabled, gdb.base/disp-step-syscall.exp
>     shows PASS => FAIL in .sum diffs.  The test actually always
>     fails.  With native/master, we see
>
>       stepi^M
>       FAIL: gdb.base/disp-step-syscall.exp: vfork: stepi vfork insn (timeout)
>

Hmm, I don't see that here.  I get a full pass on x86_64 Fedora 20.
Can you try "set debug infrun 1" / "set debug lin-lwp 1" / "set debug displaced 1"
to check if there's a gdb or kernel bug here?

>     With remote and extended-remote/master, we see a bogus PASS result:
>       stepi^M
>       [Inferior 1 (process 9399) exited normally]^M
>       (gdb) PASS: gdb.base/disp-step-syscall.exp: vfork: stepi vfork insn
>
>     The criteria to pass that test are pretty lax:
>       gdb_test "stepi" ".*" "stepi $syscall insn"

Yeah.  I see several other problems.  Here's a patch to improve it.

Comments?

Unfortunately, with your full series applied, I get this:

 (gdb) PASS: gdb.base/disp-step-syscall.exp: vfork: get hexadecimal valueof "$pc"
 stepi
 Detaching from process 29944
 Killing process(es): 29942 29944
 /home/pedro/gdb/mygit/src/gdb/gdbserver/linux-low.c:998: A problem internal to GDBserver has been detected.
 kill_wait_lwp: Assertion `res > 0' failed.
 /home/pedro/gdb/mygit/src/gdb/thread.c:1182: internal-error: switch_to_thread: Assertion `inf != NULL' failed.
 A problem internal to GDB has been detected,
 further debugging may prove unreliable.
 Quit this debugging session? (y or n) FAIL: gdb.base/disp-step-syscall.exp: vfork: stepi vfork insn (GDB internal error)
 Resyncing due to internal error.
 n

Note, you'll need this one:

  https://sourceware.org/ml/gdb-patches/2015-03/msg00045.html

for that internal error to result in a quick bail...
----------
From 1f825812d3f17a2940065d0de38592700e7437bc Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Mon, 2 Mar 2015 20:16:23 +0000
Subject: [PATCH] Tighten gdb.base/disp-step-syscall.exp

This fixes several problems with this test.

E.g., with --target_board=native-extended-gdbserver on
x86_64 Fedora 20, I get:

 Running /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.base/disp-step-syscall.exp ...
 FAIL: gdb.base/disp-step-syscall.exp: vfork: get hexadecimal valueof "$pc" (timeout)
 FAIL: gdb.base/disp-step-syscall.exp: vfork: single step over vfork final pc
 FAIL: gdb.base/disp-step-syscall.exp: vfork: delete break vfork insn
 FAIL: gdb.base/disp-step-syscall.exp: vfork: continue to marker (vfork) (the program is no longer running)

And with --target_board=native-gdbserver, I get:

 Running /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.base/disp-step-syscall.exp ...
 KPASS: gdb.base/disp-step-syscall.exp: vfork: single step over vfork (PRMS server/13796)
 FAIL: gdb.base/disp-step-syscall.exp: vfork: get hexadecimal valueof "$pc" (timeout)
 FAIL: gdb.base/disp-step-syscall.exp: vfork: single step over vfork final pc
 FAIL: gdb.base/disp-step-syscall.exp: vfork: delete break vfork insn
 FAIL: gdb.base/disp-step-syscall.exp: vfork: continue to marker (vfork) (the program is no longer running)

First, the lack of fork support on remote targets is supposed to be
kfailed, so the KPASS is obviously bogus.  The extended-remote board
should have KFAILed too.

The problem is that the test is using "is_remote" instead of
gdb_is_target_remote.

And then, I get:

 (gdb) PASS: gdb.base/disp-step-syscall.exp: vfork: set displaced-stepping on
 stepi

 Program terminated with signal SIGSEGV, Segmentation fault.
 The program no longer exists.
 (gdb) PASS: gdb.base/disp-step-syscall.exp: vfork: single step over vfork

Obviously, that should be a FAIL.  The problem is that the test only
expects SIGILL, not SIGSEGV.  It also doesn't bail correctly if an
internal error or some other pattern caught by gdb_test_multiple
matches.  The test doesn't really need to match specific exits/crashes
patterns, if the PASS regex is improved, like in ...

... this and the other "stepi" tests are a bit too lax, passing on
".*".  This tightens those up to expect "x/i" and the "=>" current PC
indicator, like in:

 1: x/i $pc
 => 0x3b36abc9e2 <vfork+34>:     syscall

On x86_64 Fedora 20, I now get a quick KFAIL instead of timeouts with
both the native-extended-gdbserver and native-gdbserver boards:

 PASS: gdb.base/disp-step-syscall.exp: vfork: delete break vfork
 PASS: gdb.base/disp-step-syscall.exp: vfork: continue to syscall insn vfork
 PASS: gdb.base/disp-step-syscall.exp: vfork: set displaced-stepping on
 KFAIL: gdb.base/disp-step-syscall.exp: vfork: single step over vfork (PRMS: server/13796)

and a full pass with native testing.

gdb/testsuite/
2015-03-02  Pedro Alves  <palves@redhat.com>

        * gdb.base/disp-step-syscall.exp (disp_step_cross_syscall):
        Use gdb_is_target_remote instead of is_remote.  Use
        gdb_test_multiple instead of gdb_expect.  Exit early if
        gdb_test_multiple hits its internal matches.  Tighten stepi tests
        expected output.  Fail on exit with any signal, instead of just
        SIGILL.
---
 gdb/testsuite/gdb.base/disp-step-syscall.exp | 53 ++++++++++++++--------------
 1 file changed, 26 insertions(+), 27 deletions(-)
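To make the point about the lax ".*" pattern concrete, here is a minimal sketch in C. It assumes POSIX regcomp/regexec as a stand-in for the Tcl/Expect regexps the testsuite actually uses; output_matches and the two sample strings are hypothetical names introduced only for this illustration, with the sample text taken from the logs in the commit message.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if TEXT matches the POSIX extended regex PATTERN, else 0.
   Hypothetical helper for illustration only; the real test relies on
   gdb_test and Tcl regexps, not on this function.  */
static int
output_matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  assert (pattern != NULL && text != NULL);

  /* Without REG_NEWLINE, "." also matches a newline, which is close
     to how the ".*" in the test patterns spans several lines.  */
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* What a successful stepi prints while "display/i $pc" is active.  */
static const char good_step[] =
  "1: x/i $pc\n"
  "=> 0x3b36abc9e2 <vfork+34>:\tsyscall\n";

/* What stepi prints when the inferior dies instead of stepping.  */
static const char crashed_step[] =
  "\n"
  "Program terminated with signal SIGSEGV, Segmentation fault.\n"
  "The program no longer exists.\n";
```

With these definitions, output_matches (".*", crashed_step) succeeds just as it does for good_step, which is the bogus PASS, while "x/i .*=>.*" accepts only good_step. That is why the tightened pattern needs no separate SIGILL or SIGSEGV cases.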
Comments
On 3/2/2015 12:18 PM, Pedro Alves wrote:
> On 02/27/2015 12:46 AM, Don Breazeal wrote:
>> - There are a couple of tests that show new failures that actually
>>   fail in the current mainline.  Details of these are as follows:
>>
>>   * when vfork events are enabled, gdb.base/disp-step-syscall.exp
>>     shows PASS => FAIL in .sum diffs.  The test actually always
>>     fails.  With native/master, we see
>>
>>       stepi^M
>>       FAIL: gdb.base/disp-step-syscall.exp: vfork: stepi vfork insn (timeout)
>>
>
> Hmm, I don't see that here.  I get a full pass on x86_64 Fedora 20.
> Can you try "set debug infrun 1" / "set debug lin-lwp 1" / "set debug displaced 1"
> to check if there's a gdb or kernel bug here?

I am traveling this week, so I haven't had a chance to look at this
much.  With your two patches below applied, I get a full pass on
x86_64 Ubuntu 14.04 (which I hadn't tried before) and I see the timeout
failure on x86_64 Ubuntu 10.04.

Here is the relevant portion of the gdb.log file from my failing run on
Ubuntu 10.04 with the debugging turned on, if you want to look at it.
Otherwise I will try to look at it in the next day or so.

(gdb) PASS: gdb.base/disp-step-syscall.exp: vfork: get hexadecimal valueof "$pc"
stepi^M
infrun: clear_proceed_status_thread (process 6486)^M
infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT, step=1)^M
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 6486] at 0x2aaaaaffcf02^M
LLR: Preparing to step process 6486, 0, inferior_ptid process 6486^M
LLR: PTRACE_SINGLESTEP process 6486, 0 (resume event thread)^M
sigchld^M
linux_nat_wait: [process -1], [TARGET_WNOHANG]^M
LLW: enter^M
LNW: waitpid(-1, ...) returned 6837, No child processes^M
LLW: waitpid 6837 received Trace/breakpoint trap (stopped)^M
LHEW: saving LWP 6837 status Trace/breakpoint trap (stopped) in stopped_pids list^M
LNW: waitpid(-1, ...) returned 6486, No child processes^M
LLW: waitpid 6486 received Trace/breakpoint trap (stopped)^M
LLW: Handling extended status 0x02057f^M
LNW: waitpid(-1, ...) returned 0, No child processes^M
SEL: Select single-step process 6486^M
LLW: trap ptid is process 6486.^M
LLW: exit^M
sigchld^M
infrun: target_wait (-1, status) =^M
infrun:   6486 [process 6486],^M
infrun:   status->kind = vforked^M
infrun: TARGET_WAITKIND_VFORKED^M
Detaching after vfork from child process 6837.^M
sigchld^M
LCFF: waiting for VFORK_DONE on 6486^M
infrun: resume : clear step^M
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 6486] at 0x2aaaaaffcf04^M
LLR: Preparing to resume process 6486, 0, inferior_ptid process 6486^M
LLR: PTRACE_CONT process 6486, 0 (resume event thread)^M
infrun: prepare_to_wait^M
linux_nat_wait: [process -1], [TARGET_WNOHANG]^M
LLW: enter^M
LNW: waitpid(-1, ...) returned 0, No child processes^M
LLW: exit (ignore)^M
infrun: target_wait (-1, status) =^M
infrun:   -1 [process -1],^M
infrun:   status->kind = ignore^M
infrun: TARGET_WAITKIND_IGNORE^M
infrun: prepare_to_wait^M
FAIL: gdb.base/disp-step-syscall.exp: vfork: stepi vfork insn (timeout)
testcase /scratch/dbreazea/sandbox/gdb-with-exec-6/binutils-gdb/gdb/testsuite/gdb.base/disp-step-syscall.exp completed in 21 seconds

>
>> With remote and extended-remote/master, we see a bogus PASS result:
>>   stepi^M
>>   [Inferior 1 (process 9399) exited normally]^M
>>   (gdb) PASS: gdb.base/disp-step-syscall.exp: vfork: stepi vfork insn
>>
>> The criteria to pass that test are pretty lax:
>>   gdb_test "stepi" ".*" "stepi $syscall insn"
>
> Yeah.  I see several other problems.  Here's a patch to improve it.
>
> Comments?

I will try to walk through this in the next day or so.
>
> Unfortunately, with your full series applied, I get this:
>
>  (gdb) PASS: gdb.base/disp-step-syscall.exp: vfork: get hexadecimal valueof "$pc"
>  stepi
>  Detaching from process 29944
>  Killing process(es): 29942 29944
>  /home/pedro/gdb/mygit/src/gdb/gdbserver/linux-low.c:998: A problem internal to GDBserver has been detected.
>  kill_wait_lwp: Assertion `res > 0' failed.
>  /home/pedro/gdb/mygit/src/gdb/thread.c:1182: internal-error: switch_to_thread: Assertion `inf != NULL' failed.
>  A problem internal to GDB has been detected,
>  further debugging may prove unreliable.
>  Quit this debugging session? (y or n) FAIL: gdb.base/disp-step-syscall.exp: vfork: stepi vfork insn (GDB internal error)
>  Resyncing due to internal error.
>  n

With the two patches below applied, disp-step-syscall.exp passes
consistently for me using native-extended-gdbserver on x86_64 Ubuntu
10.04 and Ubuntu 14.04.  I haven't looked (yet) at how your patches
might have caused this change in behavior, or at how I might be able to
reproduce the failure you are seeing.

I have seen the "inf != NULL" assertion before, when stopped at a remote
fork/vfork catchpoint and executing "info threads".  In that case
gdbserver was reporting the new thread created by the fork.  It was
added to the host-side thread list, but the new inferior had not been
created yet on the host side.  That specific scenario should be
prevented now in the remote follow fork patch series by not reporting
the forked child's thread until the follow_fork has been completed.  (If
I am remembering that right.)

Thanks
--Don

>
> Note, you'll need this one:
>
>   https://sourceware.org/ml/gdb-patches/2015-03/msg00045.html
>
> for that internal error to result in a quick bail...
>
> ----------
> From 1f825812d3f17a2940065d0de38592700e7437bc Mon Sep 17 00:00:00 2001
> From: Pedro Alves <palves@redhat.com>
> Date: Mon, 2 Mar 2015 20:16:23 +0000
> Subject: [PATCH] Tighten gdb.base/disp-step-syscall.exp
>
> This fixes several problems with this test.
>
> E.g., with --target_board=native-extended-gdbserver on
> x86_64 Fedora 20, I get:
>
>  Running /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.base/disp-step-syscall.exp ...
>  FAIL: gdb.base/disp-step-syscall.exp: vfork: get hexadecimal valueof "$pc" (timeout)
>  FAIL: gdb.base/disp-step-syscall.exp: vfork: single step over vfork final pc
>  FAIL: gdb.base/disp-step-syscall.exp: vfork: delete break vfork insn
>  FAIL: gdb.base/disp-step-syscall.exp: vfork: continue to marker (vfork) (the program is no longer running)
>
> And with --target_board=native-gdbserver, I get:
>
>  Running /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.base/disp-step-syscall.exp ...
>  KPASS: gdb.base/disp-step-syscall.exp: vfork: single step over vfork (PRMS server/13796)
>  FAIL: gdb.base/disp-step-syscall.exp: vfork: get hexadecimal valueof "$pc" (timeout)
>  FAIL: gdb.base/disp-step-syscall.exp: vfork: single step over vfork final pc
>  FAIL: gdb.base/disp-step-syscall.exp: vfork: delete break vfork insn
>  FAIL: gdb.base/disp-step-syscall.exp: vfork: continue to marker (vfork) (the program is no longer running)
>
> First, the lack of fork support on remote targets is supposed to be
> kfailed, so the KPASS is obviously bogus.  The extended-remote board
> should have KFAILed too.
>
> The problem is that the test is using "is_remote" instead of
> gdb_is_target_remote.
>
> And then, I get:
>
>  (gdb) PASS: gdb.base/disp-step-syscall.exp: vfork: set displaced-stepping on
>  stepi
>
>  Program terminated with signal SIGSEGV, Segmentation fault.
>  The program no longer exists.
>  (gdb) PASS: gdb.base/disp-step-syscall.exp: vfork: single step over vfork
>
> Obviously, that should be a FAIL.  The problem is that the test only
> expects SIGILL, not SIGSEGV.  It also doesn't bail correctly if an
> internal error or some other pattern caught by gdb_test_multiple
> matches.  The test doesn't really need to match specific exits/crashes
> patterns, if the PASS regex is improved, like in ...
>
> ... this and the other "stepi" tests are a bit too lax, passing on
> ".*".  This tightens those up to expect "x/i" and the "=>" current PC
> indicator, like in:
>
>  1: x/i $pc
>  => 0x3b36abc9e2 <vfork+34>:     syscall
>
> On x86_64 Fedora 20, I now get a quick KFAIL instead of timeouts with
> both the native-extended-gdbserver and native-gdbserver boards:
>
>  PASS: gdb.base/disp-step-syscall.exp: vfork: delete break vfork
>  PASS: gdb.base/disp-step-syscall.exp: vfork: continue to syscall insn vfork
>  PASS: gdb.base/disp-step-syscall.exp: vfork: set displaced-stepping on
>  KFAIL: gdb.base/disp-step-syscall.exp: vfork: single step over vfork (PRMS: server/13796)
>
> and a full pass with native testing.
>
> gdb/testsuite/
> 2015-03-02  Pedro Alves  <palves@redhat.com>
>
>         * gdb.base/disp-step-syscall.exp (disp_step_cross_syscall):
>         Use gdb_is_target_remote instead of is_remote.  Use
>         gdb_test_multiple instead of gdb_expect.  Exit early if
>         gdb_test_multiple hits its internal matches.  Tighten stepi tests
>         expected output.  Fail on exit with any signal, instead of just
>         SIGILL.
> ---
>  gdb/testsuite/gdb.base/disp-step-syscall.exp | 53 ++++++++++++++--------------
>  1 file changed, 26 insertions(+), 27 deletions(-)
>
> diff --git a/gdb/testsuite/gdb.base/disp-step-syscall.exp b/gdb/testsuite/gdb.base/disp-step-syscall.exp
> index ff66f83..b13dce4 100644
> --- a/gdb/testsuite/gdb.base/disp-step-syscall.exp
> +++ b/gdb/testsuite/gdb.base/disp-step-syscall.exp
> @@ -49,6 +49,8 @@ proc disp_step_cross_syscall { syscall } {
>          return
>      }
>
> +    set is_target_remote [gdb_is_target_remote]
> +
>      # Delete the breakpoint on main.
>      gdb_test_no_output "delete break 1"
>
> @@ -77,27 +79,34 @@ proc disp_step_cross_syscall { syscall } {
>      gdb_test "display/i \$pc" ".*"
>
>
> -    # Single step until we see sysall insn or we reach the upper bound of loop
> -    # iterations.
> -    set see_syscall_insn 0
> -
> -    for {set i 0} {$i < 1000 && $see_syscall_insn == 0} {incr i} {
> -        send_gdb "stepi\n"
> -        gdb_expect {
> -            -re ".*$syscall_insn.*$gdb_prompt $" {
> -                set see_syscall_insn 1
> +    # Single step until we see a syscall insn or we reach the
> +    # upper bound of loop iterations.
> +    set msg "find syscall insn in $syscall"
> +    set steps 0
> +    set max_steps 1000
> +    gdb_test_multiple "stepi" $msg {
> +        -re ".*$syscall_insn.*$gdb_prompt $" {
> +            pass $msg
> +        }
> +        -re "x/i .*=>.*\r\n$gdb_prompt $" {
> +            incr steps
> +            if {$steps == $max_steps} {
> +                fail $msg
> +            } else {
> +                send_gdb "stepi\n"
> +                exp_continue
>              }
> -            -re ".*$gdb_prompt $" {}
>          }
>      }
>
> -    if {$see_syscall_insn == 0} then {
> -        fail "find syscall insn in $syscall"
> +    if {$steps == $max_steps} {
>          return -1
>      }
>
>      set syscall_insn_addr [get_hexadecimal_valueof "\$pc" "0"]
> -    gdb_test "stepi" ".*" "stepi $syscall insn"
> +    if {[gdb_test "stepi" "x/i .*=>.*" "stepi $syscall insn"] != 0} {
> +        return -1
> +    }
>      set syscall_insn_next_addr [get_hexadecimal_valueof "\$pc" "0"]
>
>      gdb_test "continue" "Continuing\\..*Breakpoint \[0-9\]+, (.* in |__libc_|)$syscall \\(\\).*" \
> @@ -121,22 +130,12 @@ proc disp_step_cross_syscall { syscall } {
>      gdb_test_no_output "set displaced-stepping on"
>
>      # Check the address of next instruction of syscall.
> -    if {$syscall == "vfork" && [is_remote target]} {
> +    if {$syscall == "vfork" && $is_target_remote} {
>          setup_kfail server/13796 "*-*-*"
>      }
> -    set test "single step over $syscall"
> -    gdb_test_multiple "stepi" $test {
> -        -re "Program terminated with signal SIGILL,.*\r\n$gdb_prompt $" {
> -            fail $test
> -            return
> -        }
> -        -re "\\\[Inferior .* exited normally\\\].*\r\n$gdb_prompt $" {
> -            fail $test
> -            return
> -        }
> -        -re "\r\n$gdb_prompt $" {
> -            pass $test
> -        }
> +
> +    if {[gdb_test "stepi" "x/i .*=>.*" "single step over $syscall"] != 0} {
> +        return -1
>      }
>
>      set syscall_insn_next_addr_found [get_hexadecimal_valueof "\$pc" "0"]
>
On 03/03/2015 06:20 AM, Breazeal, Don wrote:

> infrun: target_wait (-1, status) =^M
> infrun:   6486 [process 6486],^M
> infrun:   status->kind = vforked^M
> infrun: TARGET_WAITKIND_VFORKED^M
> Detaching after vfork from child process 6837.^M
> sigchld^M
> LCFF: waiting for VFORK_DONE on 6486^M
> infrun: resume : clear step^M
> infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 6486] at 0x2aaaaaffcf04^M
> LLR: Preparing to resume process 6486, 0, inferior_ptid process 6486^M
> LLR: PTRACE_CONT process 6486, 0 (resume event thread)^M
> infrun: prepare_to_wait^M
> linux_nat_wait: [process -1], [TARGET_WNOHANG]^M
> LLW: enter^M
> LNW: waitpid(-1, ...) returned 0, No child processes^M
> LLW: exit (ignore)^M
> infrun: target_wait (-1, status) =^M
> infrun:   -1 [process -1],^M
> infrun:   status->kind = ignore^M
> infrun: TARGET_WAITKIND_IGNORE^M
> infrun: prepare_to_wait^M
> FAIL: gdb.base/disp-step-syscall.exp: vfork: stepi vfork insn (timeout)
> testcase /scratch/dbreazea/sandbox/gdb-with-exec-6/binutils-gdb/gdb/testsuite/gdb.base/disp-step-syscall.exp completed in 21 seconds
>

Odd.  Quite possibly a kernel bug.  Looks like ptrace never reports
the VFORK_DONE, or it does but SIGCHLD was never generated and thus
we got stuck in the event loop.

>>
>> Unfortunately, with your full series applied, I get this:
>>
>>  (gdb) PASS: gdb.base/disp-step-syscall.exp: vfork: get hexadecimal valueof "$pc"
>>  stepi
>>  Detaching from process 29944
>>  Killing process(es): 29942 29944
>>  /home/pedro/gdb/mygit/src/gdb/gdbserver/linux-low.c:998: A problem internal to GDBserver has been detected.
>>  kill_wait_lwp: Assertion `res > 0' failed.
>>  /home/pedro/gdb/mygit/src/gdb/thread.c:1182: internal-error: switch_to_thread: Assertion `inf != NULL' failed.
>>  A problem internal to GDB has been detected,
>>  further debugging may prove unreliable.
>>  Quit this debugging session? (y or n) FAIL: gdb.base/disp-step-syscall.exp: vfork: stepi vfork insn (GDB internal error)
>>  Resyncing due to internal error.
>>  n
>
> With the two patches below applied, disp-step-syscall.exp passes
> consistently for me using native-extended-gdbserver on x86_64 Ubuntu
> 10.04 and Ubuntu 14.04.  I haven't looked (yet) at how your patches
> might have caused this change in behavior, or at how I might be able to
> reproduce the failure you are seeing.

TBC, I get the internal errors (F20, x86_64) without my patches too.
The only difference is that without my patches the FAIL is followed
by slow timeouts:

 Running /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.base/disp-step-syscall.exp ...
 FAIL: gdb.base/disp-step-syscall.exp: vfork: stepi vfork insn (GDB internal error)
 FAIL: gdb.base/disp-step-syscall.exp: vfork: get hexadecimal valueof "$pc" (timeout)
 FAIL: gdb.base/disp-step-syscall.exp: vfork: continue to vfork (3rd time) (GDB internal error)
 FAIL: gdb.base/disp-step-syscall.exp: vfork: continue to syscall insn vfork (the program is no longer running)

>
> I have seen the "inf != NULL" assertion before, when stopped at a remote
> fork/vfork catchpoint and executing "info threads".  In that case
> gdbserver was reporting the new thread created by the fork.  It was
> added to the host-side thread list, but the new inferior had not been
> created yet on the host side.  That specific scenario should be
> prevented now in the remote follow fork patch series by not reporting
> the forked child's thread until the follow_fork has been completed.  (If
> I am remembering that right.)

Adding infrun/remote logging, I see:

 infrun: target_wait (-1, status) =
 infrun:   26217 [Thread 26217.26217],
 infrun:   status->kind = vforked
 infrun: TARGET_WAITKIND_VFORKED
 Sending packet: $z0,400624,1#63...Packet received: OK
 Sending packet: $z0,3b36603966,1#6f...Packet received: OK
 Sending packet: $z0,3b36613970,1#6b...Packet received: OK
 Sending packet: $z0,3b36614891,1#6e...Packet received: OK
 Sending packet: $z0,3b36abc9c0,1#23...Packet received: OK
 Detaching after vfork from child process 26219.
 Sending packet: $D;666b#83...Detaching from process 26219
 Killing process(es): 26217 26219
 /home/pedro/gdb/mygit/src/gdb/gdbserver/linux-low.c:998: A problem internal to GDBserver has been detected.
 kill_wait_lwp: Assertion `res > 0' failed.
 Packet received: E01
 /home/pedro/gdb/mygit/src/gdb/thread.c:1182: internal-error: switch_to_thread: Assertion `inf != NULL' failed.
 A problem internal to GDB has been detected,
 further debugging may prove unreliable.
 Quit this debugging session? (y or n) FAIL: gdb.base/disp-step-syscall.exp: vfork: stepi vfork insn (GDB internal error)
 Resyncing due to internal error.

The backtrace:

 ...
 #2  0x000000000041451e in internal_verror (file=0x456008 "/home/pedro/gdb/mygit/src/gdb/gdbserver/linux-low.c", line=998, fmt=0x455fe6 "%s: Assertion `%s' failed.", args=0x7fff7b828258)
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/utils.c:106
 #3  0x0000000000426ea8 in internal_error (file=0x456008 "/home/pedro/gdb/mygit/src/gdb/gdbserver/linux-low.c", line=998, fmt=0x455fe6 "%s: Assertion `%s' failed.")
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/../common/errors.c:55
 #4  0x00000000004293cb in kill_wait_lwp (lwp=0x14a38b0)
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/linux-low.c:998
 #5  0x0000000000429543 in linux_kill (pid=26624)
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/linux-low.c:1050
 #6  0x00000000004140ea in kill_inferior (pid=26624)
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/target.c:219
 #7  0x00000000004110e1 in detach_or_kill_inferior_callback (entry=0x14a2ad0)
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/server.c:3087
 #8  0x00000000004064da in for_each_inferior (list=0x670110 <all_processes>, action=0x41107f <detach_or_kill_inferior_callback>)
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/inferiors.c:55
 #9  0x0000000000411258 in detach_or_kill_for_exit ()
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/server.c:3148
 #10 0x0000000000411295 in detach_or_kill_for_exit_cleanup (ignore=0x0)
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/server.c:3163
 #11 0x0000000000427178 in do_my_cleanups (pmy_chain=0x668938 <cleanup_chain>, old_chain=0x44c440 <sentinel_cleanup>)
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/../common/cleanups.c:155
 #12 0x00000000004271e5 in do_cleanups (old_chain=0x44c440 <sentinel_cleanup>)
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/../common/cleanups.c:177
 #13 0x00000000004276bf in throw_exception (exception=...)
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/../common/common-exceptions.c:215
 #14 0x0000000000427843 in throw_it (reason=RETURN_ERROR, error=GENERIC_ERROR, fmt=0x4480e7 "%s.", ap=0x7fff7b828648)
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/../common/common-exceptions.c:274
 #15 0x000000000042786d in throw_verror (error=GENERIC_ERROR, fmt=0x4480e7 "%s.", ap=0x7fff7b828648)
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/../common/common-exceptions.c:280
 #16 0x0000000000414456 in verror (string=0x4480e7 "%s.", args=0x7fff7b828648)
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/utils.c:85
 #17 0x0000000000426e00 in error (fmt=0x4480e7 "%s.")
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/../common/errors.c:43
 #18 0x0000000000414431 in perror_with_name (string=0x4435a6 "Can't determine port")
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/utils.c:71
 #19 0x0000000000407e49 in remote_open (name=0x7fff7b82a6fc ":2347")
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/remote-utils.c:389
 #20 0x0000000000411b12 in captured_main (argc=4, argv=0x7fff7b828a68)
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/server.c:3414
 #21 0x0000000000411ca2 in main (argc=4, argv=0x7fff7b828a68)
     at /home/pedro/gdb/mygit/src/gdb/gdbserver/server.c:3490
 ...

Seems like gdb disconnects, and we end up in remote_open again.
Then probably due to --once (the listen descriptor is closed), that
fails and throws, which runs the "kill or detach everything"
cleanup (detach_or_kill_for_exit_cleanup).  And that ends up in your
new code here:

 static int
 linux_kill (int pid)
 {
   struct process_info *process;
   struct lwp_info *lwp;
   struct target_waitstatus last;
   ptid_t last_ptid;

   /* If we're stopped while forking and we haven't followed yet,
      kill the child task.  We need to do this first because the
      parent will be sleeping if this is a vfork.  */

   get_last_target_status (&last_ptid, &last);
   if (last.kind == TARGET_WAITKIND_FORKED
       || last.kind == TARGET_WAITKIND_VFORKED)
     {
       lwp = find_lwp_pid (last.value.related_pid);
       gdb_assert (lwp != NULL);
       kill_wait_lwp (lwp);
       process = find_process_pid (ptid_get_pid (last.value.related_pid));
       the_target->mourn (process);
     }

trying to kill the vfork child.

Really, relying on get_last_target_status is not a good idea.  It's
broken on the native side already, and gdbserver should not start
depending on it too.  E.g., consider scheduler-locking or non-stop.
Other events on other processes/threads can easily happen and thus
overwrite the last target status, before something decides to kill
the fork parent.

Thanks,
Pedro Alves
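
The objection to get_last_target_status can be sketched in a few lines of C. This is only an illustration under stated assumptions, not GDB or gdbserver code: wait_status, proc_state, report_event and the two child_to_kill_* helpers are hypothetical names, and the per-process bookkeeping shown is just one possible alternative, not the fix that was eventually adopted.

```c
#include <assert.h>
#include <stddef.h>

enum wait_kind { KIND_STOPPED, KIND_VFORKED };

struct wait_status
{
  enum wait_kind kind;
  int related_pid;		/* vfork child, only for KIND_VFORKED.  */
};

/* One global slot, in the spirit of what get_last_target_status
   hands back: every new event overwrites the previous one.  */
static struct wait_status last_status;

/* Hypothetical per-process record of a vfork not yet followed.  */
#define MAX_PROCS 8
struct proc_state
{
  int pid;
  int pending_vfork_child;	/* 0 when there is none.  */
};
static struct proc_state procs[MAX_PROCS];
static int n_procs;

static struct proc_state *
find_or_add_proc (int pid)
{
  int i;

  for (i = 0; i < n_procs; i++)
    if (procs[i].pid == pid)
      return &procs[i];
  assert (n_procs < MAX_PROCS);
  procs[n_procs].pid = pid;
  procs[n_procs].pending_vfork_child = 0;
  return &procs[n_procs++];
}

/* Record an event reported for process PID in both places.  */
static void
report_event (int pid, struct wait_status st)
{
  struct proc_state *p = find_or_add_proc (pid);

  last_status = st;
  if (st.kind == KIND_VFORKED)
    p->pending_vfork_child = st.related_pid;
}

/* Child to kill before killing PID, judged from the global slot.
   Like the quoted linux_kill, it never looks at PID itself.  */
static int
child_to_kill_global (int pid)
{
  (void) pid;
  return last_status.kind == KIND_VFORKED ? last_status.related_pid : 0;
}

/* Same question, answered from the per-process record.  */
static int
child_to_kill_per_process (int pid)
{
  return find_or_add_proc (pid)->pending_vfork_child;
}
```

If process 100 reports a vfork of child 101 and an unrelated process 200 then reports an ordinary stop, the global lookup for 100 no longer finds the child, while the per-process lookup still does. That ordering is what scheduler-locking or non-stop make easy to hit.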
On 3/2/2015 12:18 PM, Pedro Alves wrote: > On 02/27/2015 12:46 AM, Don Breazeal wrote: >> - There are a couple of tests that show new failures that actually >> fail in the current mainline. Details of these are as follows: >> >> * when vfork events are enabled, gdb.base/disp-step-syscall.exp >> shows PASS => FAIL in .sum diffs. The test actually always >> fails. With native/master, we see >> >> stepi^M >> FAIL: gdb.base/disp-step-syscall.exp: vfork: stepi vfork insn >> (timeout) >> > > Hmm, I don't see that here. I get a full pass on x86_64 Fedora 20. > Can you try "set debug infrun 1" / "set debug lin-lwp 1" / "set debug displaced 1" > to check if there's a gdb or kernel bug here? > >> With remote and extended-remote/master, we see a bogus PASS result: >> stepi^M >> [Inferior 1 (process 9399) exited normally]^M >> (gdb) PASS: gdb.base/disp-step-syscall.exp: vfork: stepi vfork insn >> >> The criteria to pass that test are pretty lax: >> gdb_test "stepi" ".*" "stepi $syscall insn" > > Yeah. I see several other problems. Here's a patch to improve it. > > Comments? Hi Pedro, This patch makes sense to me, and it has been working great for me while debugging my updates to the follow-fork patchset. We will need to update this once the remote follow patches are committed, I guess, since presumably the kfail 13796 will be resolved then. > > Unfortunately, with your full series applied, I get this: > > (gdb) PASS: gdb.base/disp-step-syscall.exp: vfork: get hexadecimal valueof "$pc" > stepi > Detaching from process 29944 > Killing process(es): 29942 29944 > /home/pedro/gdb/mygit/src/gdb/gdbserver/linux-low.c:998: A problem internal to GDBserver has been detected. > kill_wait_lwp: Assertion `res > 0' failed. > /home/pedro/gdb/mygit/src/gdb/thread.c:1182: internal-error: switch_to_thread: Assertion `inf != NULL' failed. > A problem internal to GDB has been detected, > further debugging may prove unreliable. > Quit this debugging session? 
(y or n) FAIL: gdb.base/disp-step-syscall.exp: vfork: stepi vfork insn (GDB internal error) > Resyncing due to internal error. > n The updated patchset just posted should address this issue. https://sourceware.org/ml/gdb-patches/2015-03/msg00503.html Thanks, --Don > > Note, you'll need this one: > > https://sourceware.org/ml/gdb-patches/2015-03/msg00045.html > > for that internal error to result in a quick bail... > > ---------- > From 1f825812d3f17a2940065d0de38592700e7437bc Mon Sep 17 00:00:00 2001 > From: Pedro Alves <palves@redhat.com> > Date: Mon, 2 Mar 2015 20:16:23 +0000 > Subject: [PATCH] Tighten gdb.base/disp-step-syscall.exp > > This fixes several problems with this test. > > E.g,. with --target_board=native-extended-gdbserver on > x86_64 Fedora 20, I get: > > Running /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.base/disp-step-syscall.exp ... > FAIL: gdb.base/disp-step-syscall.exp: vfork: get hexadecimal valueof "$pc" (timeout) > FAIL: gdb.base/disp-step-syscall.exp: vfork: single step over vfork final pc > FAIL: gdb.base/disp-step-syscall.exp: vfork: delete break vfork insn > FAIL: gdb.base/disp-step-syscall.exp: vfork: continue to marker (vfork) (the program is no longer running) > > And with --target=native-gdbserver, I get: > > Running /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.base/disp-step-syscall.exp ... > KPASS: gdb.base/disp-step-syscall.exp: vfork: single step over vfork (PRMS server/13796) > FAIL: gdb.base/disp-step-syscall.exp: vfork: get hexadecimal valueof "$pc" (timeout) > FAIL: gdb.base/disp-step-syscall.exp: vfork: single step over vfork final pc > FAIL: gdb.base/disp-step-syscall.exp: vfork: delete break vfork insn > FAIL: gdb.base/disp-step-syscall.exp: vfork: continue to marker (vfork) (the program is no longer running) > > First, the lack of fork support on remote targets is supposed to be > kfailed, so the KPASS is obviously bogus. The extended-remote board > should have KFAILed too. 
> > The problem is that the test is using "is_remote" instead of > gdb_is_target_remote. > > And then, I get: > > (gdb) PASS: gdb.base/disp-step-syscall.exp: vfork: set displaced-stepping on > stepi > > Program terminated with signal SIGSEGV, Segmentation fault. > The program no longer exists. > (gdb) PASS: gdb.base/disp-step-syscall.exp: vfork: single step over vfork > > Obviously, that should be a FAIL. The problem is that the test only > expects SIGILL, not SIGSEGV. It also doesn't bail correctly if an > internal error or some other pattern caught by gdb_test_multiple > matches. The test doesn't really need to match specific exits/crashes > patterns, if the PASS regex is improved, like in ... > > ... this and the other "stepi" tests are a bit too lax, passing on > ".*". This tightens those up to expect "x/i" and the "=>" current PC > indicator, like in: > > 1: x/i $pc > => 0x3b36abc9e2 <vfork+34>: syscall > > On x86_64 Fedora 20, I now get a quick KFAIL instead of timeouts with > both the native-extended-gdbserver and native-gdbserver boards: > > PASS: gdb.base/disp-step-syscall.exp: vfork: delete break vfork > PASS: gdb.base/disp-step-syscall.exp: vfork: continue to syscall insn vfork > PASS: gdb.base/disp-step-syscall.exp: vfork: set displaced-stepping on > KFAIL: gdb.base/disp-step-syscall.exp: vfork: single step over vfork (PRMS: server/13796) > > and a full pass with native testing. > > gdb/testsuite/ > 2015-03-02 Pedro Alves <palves@redhat.com> > > * gdb.base/disp-step-syscall.exp (disp_step_cross_syscall.exp): > Use gdb_is_target_remote instead of is_remote. Use > gdb_test_multiple instead of gdb_expect. Exit early if > gdb_test_multiple hits its internal matches. Tighten stepi tests > expected output. Fail on exit with any signal, instead of just > SIGILL. 
> ---
>  gdb/testsuite/gdb.base/disp-step-syscall.exp | 53 ++++++++++++++--------------
>  1 file changed, 26 insertions(+), 27 deletions(-)
>
> diff --git a/gdb/testsuite/gdb.base/disp-step-syscall.exp b/gdb/testsuite/gdb.base/disp-step-syscall.exp
> index ff66f83..b13dce4 100644
> --- a/gdb/testsuite/gdb.base/disp-step-syscall.exp
> +++ b/gdb/testsuite/gdb.base/disp-step-syscall.exp
> @@ -49,6 +49,8 @@ proc disp_step_cross_syscall { syscall } {
>              return
>          }
>
> +        set is_target_remote [gdb_is_target_remote]
> +
>          # Delete the breakpoint on main.
>          gdb_test_no_output "delete break 1"
>
> @@ -77,27 +79,34 @@ proc disp_step_cross_syscall { syscall } {
>          gdb_test "display/i \$pc" ".*"
>
>
> -        # Single step until we see sysall insn or we reach the upper bound of loop
> -        # iterations.
> -        set see_syscall_insn 0
> -
> -        for {set i 0} {$i < 1000 && $see_syscall_insn == 0} {incr i} {
> -            send_gdb "stepi\n"
> -            gdb_expect {
> -                -re ".*$syscall_insn.*$gdb_prompt $" {
> -                    set see_syscall_insn 1
> +        # Single step until we see a syscall insn or we reach the
> +        # upper bound of loop iterations.
> +        set msg "find syscall insn in $syscall"
> +        set steps 0
> +        set max_steps 1000
> +        gdb_test_multiple "stepi" $msg {
> +            -re ".*$syscall_insn.*$gdb_prompt $" {
> +                pass $msg
> +            }
> +            -re "x/i .*=>.*\r\n$gdb_prompt $" {
> +                incr steps
> +                if {$steps == $max_steps} {
> +                    fail $msg
> +                } else {
> +                    send_gdb "stepi\n"
> +                    exp_continue
>                  }
> -                -re ".*$gdb_prompt $" {}
>              }
>          }
>
> -        if {$see_syscall_insn == 0} then {
> -            fail "find syscall insn in $syscall"
> +        if {$steps == $max_steps} {
>              return -1
>          }
>
>          set syscall_insn_addr [get_hexadecimal_valueof "\$pc" "0"]
> -        gdb_test "stepi" ".*" "stepi $syscall insn"
> +        if {[gdb_test "stepi" "x/i .*=>.*" "stepi $syscall insn"] != 0} {
> +            return -1
> +        }
>          set syscall_insn_next_addr [get_hexadecimal_valueof "\$pc" "0"]
>
>          gdb_test "continue" "Continuing\\..*Breakpoint \[0-9\]+, (.* in |__libc_|)$syscall \\(\\).*" \
> @@ -121,22 +130,12 @@ proc disp_step_cross_syscall { syscall } {
>          gdb_test_no_output "set displaced-stepping on"
>
>          # Check the address of next instruction of syscall.
> -        if {$syscall == "vfork" && [is_remote target]} {
> +        if {$syscall == "vfork" && $is_target_remote} {
>              setup_kfail server/13796 "*-*-*"
>          }
> -        set test "single step over $syscall"
> -        gdb_test_multiple "stepi" $test {
> -            -re "Program terminated with signal SIGILL,.*\r\n$gdb_prompt $" {
> -                fail $test
> -                return
> -            }
> -            -re "\\\[Inferior .* exited normally\\\].*\r\n$gdb_prompt $" {
> -                fail $test
> -                return
> -            }
> -            -re "\r\n$gdb_prompt $" {
> -                pass $test
> -            }
> +
> +        if {[gdb_test "stepi" "x/i .*=>.*" "single step over $syscall"] != 0} {
> +            return -1
>          }
>
>          set syscall_insn_next_addr_found [get_hexadecimal_valueof "\$pc" "0"]
>
Hi Don,

On 03/17/2015 09:18 PM, Breazeal, Don wrote:
> Hi Pedro,
> This patch makes sense to me, and it has been working great for me while
> debugging my updates to the follow-fork patchset. We will need to
> update this once the remote follow patches are committed, I guess,
> since presumably the kfail 13796 will be resolved then.

Alright, might as well push it in then. I've done that now.

Thanks,
Pedro Alves
diff --git a/gdb/testsuite/gdb.base/disp-step-syscall.exp b/gdb/testsuite/gdb.base/disp-step-syscall.exp
index ff66f83..b13dce4 100644
--- a/gdb/testsuite/gdb.base/disp-step-syscall.exp
+++ b/gdb/testsuite/gdb.base/disp-step-syscall.exp
@@ -49,6 +49,8 @@ proc disp_step_cross_syscall { syscall } {
             return
         }
 
+        set is_target_remote [gdb_is_target_remote]
+
         # Delete the breakpoint on main.
         gdb_test_no_output "delete break 1"
 
@@ -77,27 +79,34 @@ proc disp_step_cross_syscall { syscall } {
         gdb_test "display/i \$pc" ".*"
 
 
-        # Single step until we see sysall insn or we reach the upper bound of loop
-        # iterations.
-        set see_syscall_insn 0
-
-        for {set i 0} {$i < 1000 && $see_syscall_insn == 0} {incr i} {
-            send_gdb "stepi\n"
-            gdb_expect {
-                -re ".*$syscall_insn.*$gdb_prompt $" {
-                    set see_syscall_insn 1
+        # Single step until we see a syscall insn or we reach the
+        # upper bound of loop iterations.
+        set msg "find syscall insn in $syscall"
+        set steps 0
+        set max_steps 1000
+        gdb_test_multiple "stepi" $msg {
+            -re ".*$syscall_insn.*$gdb_prompt $" {
+                pass $msg
+            }
+            -re "x/i .*=>.*\r\n$gdb_prompt $" {
+                incr steps
+                if {$steps == $max_steps} {
+                    fail $msg
+                } else {
+                    send_gdb "stepi\n"
+                    exp_continue
                 }
-                -re ".*$gdb_prompt $" {}
             }
         }
 
-        if {$see_syscall_insn == 0} then {
-            fail "find syscall insn in $syscall"
+        if {$steps == $max_steps} {
             return -1
         }
 
         set syscall_insn_addr [get_hexadecimal_valueof "\$pc" "0"]
-        gdb_test "stepi" ".*" "stepi $syscall insn"
+        if {[gdb_test "stepi" "x/i .*=>.*" "stepi $syscall insn"] != 0} {
+            return -1
+        }
         set syscall_insn_next_addr [get_hexadecimal_valueof "\$pc" "0"]
 
         gdb_test "continue" "Continuing\\..*Breakpoint \[0-9\]+, (.* in |__libc_|)$syscall \\(\\).*" \
@@ -121,22 +130,12 @@ proc disp_step_cross_syscall { syscall } {
         gdb_test_no_output "set displaced-stepping on"
 
         # Check the address of next instruction of syscall.
-        if {$syscall == "vfork" && [is_remote target]} {
+        if {$syscall == "vfork" && $is_target_remote} {
             setup_kfail server/13796 "*-*-*"
         }
-        set test "single step over $syscall"
-        gdb_test_multiple "stepi" $test {
-            -re "Program terminated with signal SIGILL,.*\r\n$gdb_prompt $" {
-                fail $test
-                return
-            }
-            -re "\\\[Inferior .* exited normally\\\].*\r\n$gdb_prompt $" {
-                fail $test
-                return
-            }
-            -re "\r\n$gdb_prompt $" {
-                pass $test
-            }
+
+        if {[gdb_test "stepi" "x/i .*=>.*" "single step over $syscall"] != 0} {
+            return -1
         }
 
         set syscall_insn_next_addr_found [get_hexadecimal_valueof "\$pc" "0"]
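
To illustrate why the tightened stepi pattern matters, here is a minimal sketch in plain Tcl that runs under tclsh without DejaGnu. It is not part of the patch. `step_output_matches` is a hypothetical helper that only roughly imitates how gdb_test anchors the expected pattern in front of the prompt, and the two transcripts are assembled from the output quoted in the commit message (the line endings and tab are assumptions).

```tcl
# Hypothetical helper, NOT a GDB testsuite proc: approximates gdb_test by
# requiring PATTERN, then line endings, then the "(gdb) " prompt at the end.
proc step_output_matches {pattern output} {
    # Escaped prompt, anchored at the end of the buffered output.
    set prompt_re {\(gdb\) $}
    return [regexp -- "${pattern}\[\r\n\]+${prompt_re}" $output]
}

# Sample transcripts built from the commit message (assumed CRLF endings).
set crashed "stepi\r\n\r\nProgram terminated with signal SIGSEGV, Segmentation fault.\r\nThe program no longer exists.\r\n(gdb) "
set stepped "stepi\r\n1: x/i \$pc\r\n=> 0x3b36abc9e2 <vfork+34>:\tsyscall\r\n(gdb) "

# The lax ".*" pattern accepts anything that ends in a prompt, crash included.
puts [step_output_matches {.*} $crashed]         ;# prints 1
puts [step_output_matches {.*} $stepped]         ;# prints 1

# The tightened pattern requires the display line and the "=>" PC marker,
# so the crash transcript no longer counts as a successful step.
puts [step_output_matches {x/i .*=>.*} $crashed] ;# prints 0
puts [step_output_matches {x/i .*=>.*} $stepped] ;# prints 1
```

Requiring positive evidence of a completed step means any exit or crash message simply fails to match, so the test does not have to enumerate signals such as SIGILL or SIGSEGV one by one.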