Message ID | 20230531200602.165033-1-thiago.bauermann@linaro.org |
---|---|
State | New |
Headers |
Return-Path: <gdb-patches-bounces+patchwork=sourceware.org@sourceware.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id B8BFA3854E46 for <patchwork@sourceware.org>; Wed, 31 May 2023 20:07:49 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B8BFA3854E46 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1685563669; bh=DcQXE1ra+4QMe9+hspjjdLYWjv520shC7VVv+cPzuUo=; h=To:Cc:Subject:Date:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:List-Subscribe:From:Reply-To:From; b=E6JGKe+PkLLJ0Vi3ynAXfJDVeJzDRecrjmERNr1Wac7KsA5gGzzegi5ryEKMckyh3 ilS5nFYceFPJb+bZwwHGfIdroHa15X6DfEjtH99jR346sPMfYeO0DjDV5IxF3W8Bys O73/ZIRWRn4/aqsfaQUWrKYKo1fnU8G1cPwy4Ijk= X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from mail-lj1-x236.google.com (mail-lj1-x236.google.com [IPv6:2a00:1450:4864:20::236]) by sourceware.org (Postfix) with ESMTPS id 441FE3858024 for <gdb-patches@sourceware.org>; Wed, 31 May 2023 20:06:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 441FE3858024 Received: by mail-lj1-x236.google.com with SMTP id 38308e7fff4ca-2af98eb6ef0so949351fa.3 for <gdb-patches@sourceware.org>; Wed, 31 May 2023 13:06:24 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685563583; x=1688155583; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=DcQXE1ra+4QMe9+hspjjdLYWjv520shC7VVv+cPzuUo=; b=YRlY8q66/iAZ6tcWgUm+mPlLWeaeFmqKqhfmvpEpnW5IwSYH6e7ewlJf4O+/SDQWl8 9LgndbWDqYQ0aKaYjAoTj4UF4oU/0H+kpl15qwZNiwoK3ZoUGeTM217K2GefupiwDmpa gPYyh2b1rccRJHzTi6B9Dvc7Fx0gjqm9LlQiCxYIDdg8h7RCgkyBOIL6eyB81+f+j5bn udL/QRH3ITfp78GpA/7DMdy3w6eCWpMfXqQdTPBEt5HYC5j3H9XBLax2bqbAP05aX1TV uHx9uBCBdQdScUvHLR8Az+7hNUiznGPVhzDdPM0YeRpWfizrZDDiG8o/xd7l1jbsuqIR 16oQ== X-Gm-Message-State: AC+VfDwHCoPzfbznnyPsb47OTnaD5D32mDuOsu8qA3QoGTmNUlEtobaz wu1OXpaf+pJqVoT+Nk9BxEN4zi/nPeTC5dhiy4Cwrw== X-Google-Smtp-Source: ACHHUZ5x1pyutUkHwb+wvtmO8TAd27R7cf5B+TmPw9jWQZHZCPQbZDG2lMN2+PnD8CDUrU5PFPtSAQ== X-Received: by 2002:a2e:a3d0:0:b0:2a7:b0b4:4fa with SMTP id w16-20020a2ea3d0000000b002a7b0b404famr3751332lje.12.1685563582690; Wed, 31 May 2023 13:06:22 -0700 (PDT) Received: from localhost (p200300eaff33e0911da3b049df5a0eae.dip0.t-ipconnect.de. [2003:ea:ff33:e091:1da3:b049:df5a:eae]) by smtp.gmail.com with ESMTPSA id bm22-20020a0564020b1600b00511aea132b9sm6246024edb.3.2023.05.31.13.06.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 31 May 2023 13:06:22 -0700 (PDT) To: gdb-patches@sourceware.org Cc: Thiago Jung Bauermann <thiago.bauermann@linaro.org> Subject: [PATCH] gdb/testsuite: Avoid infinite loop in gdb.reverse/step-reverse.exp Date: Wed, 31 May 2023 22:06:02 +0200 Message-Id: <20230531200602.165033-1-thiago.bauermann@linaro.org> X-Mailer: git-send-email 2.40.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list <gdb-patches.sourceware.org> List-Unsubscribe: <https://sourceware.org/mailman/options/gdb-patches>, <mailto:gdb-patches-request@sourceware.org?subject=unsubscribe> List-Archive: <https://sourceware.org/pipermail/gdb-patches/> List-Post: <mailto:gdb-patches@sourceware.org> List-Help: <mailto:gdb-patches-request@sourceware.org?subject=help> List-Subscribe: <https://sourceware.org/mailman/listinfo/gdb-patches>, <mailto:gdb-patches-request@sourceware.org?subject=subscribe> From: Thiago Jung Bauermann via Gdb-patches <gdb-patches@sourceware.org> Reply-To: Thiago Jung Bauermann <thiago.bauermann@linaro.org> Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org Sender: "Gdb-patches" <gdb-patches-bounces+patchwork=sourceware.org@sourceware.org> |
Series |
gdb/testsuite: Avoid infinite loop in gdb.reverse/step-reverse.exp
|
|
Commit Message
Thiago Jung Bauermann
May 31, 2023, 8:06 p.m. UTC
This testcase sometimes gets stuck in a loop for hours when running in our CI. The problem is that due to an issue unrelated to reverse debugging the inferior exits early, and because of the overly generic ".*" pattern the testcase keeps sending the "next" command without noticing that the inferior is gone. gdb_test_multiple has a pattern to detect that "The program is not being run.", but since it is placed after the patterns from the caller it won't be triggered. It also has a timeout pattern, but for some reason I don't understand it often doesn't trigger. Since the test binary is compiled with debug information, fix by changing the pattern to match the source code line number that is shown by GDB right after the "step" command. --- gdb/testsuite/gdb.reverse/step-reverse.exp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
Comments
On 31/05/2023 22:06, Thiago Jung Bauermann via Gdb-patches wrote: > This testcase sometimes gets stuck in a loop for hours when running in our > CI. The problem is that due to an issue unrelated to reverse debugging the > inferior exits early, and because of the overly generic ".*" pattern the > testcase keeps sending the "next" command without noticing that the > inferior is gone. > > gdb_test_multiple has a pattern to detect that "The program is not being > run.", but since it is placed after the patterns from the caller it won't > be triggered. It also has a timeout pattern, but for some reason I don't > understand it often doesn't trigger. > > Since the test binary is compiled with debug information, fix by changing > the pattern to match the source code line number that is shown by GDB right > after the "step" command. > --- Oops, sorry about that, that never even crossed my mind. This change looks pretty simple IMO. Reviewed-By: Bruno Larsen <blarsen@redhat.com>
On 5/31/23 22:06, Thiago Jung Bauermann via Gdb-patches wrote: > This testcase sometimes gets stuck in a loop for hours when running in our > CI. The problem is that due to an issue unrelated to reverse debugging the > inferior exits early, and because of the overly generic ".*" pattern the > testcase keeps sending the "next" command without noticing that the > inferior is gone. > > gdb_test_multiple has a pattern to detect that "The program is not being > run.", but since it is placed after the patterns from the caller it won't > be triggered. It also has a timeout pattern, but for some reason I don't > understand it often doesn't trigger. > > Since the test binary is compiled with debug information, fix by changing > the pattern to match the source code line number that is shown by GDB right > after the "step" command. > --- > gdb/testsuite/gdb.reverse/step-reverse.exp | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/gdb/testsuite/gdb.reverse/step-reverse.exp b/gdb/testsuite/gdb.reverse/step-reverse.exp > index 729218d4cb8c..766ca02910af 100644 > --- a/gdb/testsuite/gdb.reverse/step-reverse.exp > +++ b/gdb/testsuite/gdb.reverse/step-reverse.exp > @@ -261,7 +261,7 @@ if { "$step_out" == 1 } { > -re -wrap "NEXT OVER THIS RECURSION.*" { > set step_out 0 > } > - -re -wrap ".*" { > + -re -wrap "^\[0-9\].*" { > send_gdb "next\n" > exp_continue > } > @@ -286,7 +286,7 @@ gdb_test_multiple "next" "step over recursion inside the recursion" { > gdb_assert {"$seen_recursive_call" == 1} \ > "step over recursion inside the recursion" > } > - -re -wrap ".*" { > + -re -wrap "^\[0-9\].*" { > send_gdb "next\n" > exp_continue > } Hi, the fix as is addresses the immediate problem. Introducing an iteration bound as a safety net is probably a good idea as well, given the problem you ran into. Also, I prefer a different style of fix. I think that this gdb_test_multiple is overly complicated by dealing with the prompt in two clauses. I came up with this as first version which removes the need for any .* usage (though to be clear, -wrap "" and -wrap ".*" is the same thing): ... if { "$step_out" == 1 } { gdb_test_multiple "next" "stepping out of recursion" { -re "NEXT OVER THIS RECURSION" { set step_out 0 exp_continue } -re -wrap "" { if { $step_out == 1 } { send_gdb "next\n" exp_continue } pass $gdb_test_name } } } ... and this after adding the iteration bound: ... if { "$step_out" == 1 } { set iterations 0 set max_iterations 10 gdb_test_multiple "next" "stepping out of recursion" { -re "NEXT OVER THIS RECURSION" { set step_out 0 exp_continue } -re -wrap "" { if { $step_out == 1 } { incr iterations if { $iterations == $max_iterations } { fail $gdb_test_name return } send_gdb "next\n" exp_continue } pass $gdb_test_name } } } ... [ The gdb_test_multiple can be even further simplified, by wrapping it in a while that takes care of the iteration. ] This code doesn't actually trigger for me, so I added this: ... -gdb_test_multiple "next" "reverse next over recursion" { +gdb_test_multiple "step" "reverse next over recursion" { ... That gave me the default fail: ... (gdb) PASS: gdb.reverse/step-reverse.exp: reverse next over call step^M recursive_callee (val=17) at /data/vries/gdb/src/gdb/testsuite/gdb.reverse/step-reverse.c:41^M 41 } /* EXIT RECURSIVE FUNCTION */^M (gdb) FAIL: gdb.reverse/step-reverse.exp: reverse next over recursion ... which did not set step_out to 1, so still the code didn't get triggered,so I did: ... - -re -wrap ".*RECURSIVE CALL.*" { + -re -wrap ".*(RECURSIVE CALL|EXIT RECURSIVE FUNCTION).*" { ... ISTM that the test-case contains complicated and hard-to-trigger code, which could be fixed by merging the code for the next and step scenario. Anyway, I'm fine with your fix, if an iteration bound is added. But I think at least this part of the test-case could do with a rewrite. Thanks, - Tom
Thiago Jung Bauermann via Gdb-patches <gdb-patches@sourceware.org> writes: > This testcase sometimes gets stuck in a loop for hours when running in our > CI. The problem is that due to an issue unrelated to reverse debugging the > inferior exits early, and because of the overly generic ".*" pattern the > testcase keeps sending the "next" command without noticing that the > inferior is gone. > > gdb_test_multiple has a pattern to detect that "The program is not being > run.", but since it is placed after the patterns from the caller it won't > be triggered. It also has a timeout pattern, but for some reason I don't > understand it often doesn't trigger. The timeout is triggered between successful matches. So each time the test matches the '-re -wrap ".*"' this counts as a successful match and the timeout is reset. Thanks, Andrew > > Since the test binary is compiled with debug information, fix by changing > the pattern to match the source code line number that is shown by GDB right > after the "step" command. > --- > gdb/testsuite/gdb.reverse/step-reverse.exp | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/gdb/testsuite/gdb.reverse/step-reverse.exp b/gdb/testsuite/gdb.reverse/step-reverse.exp > index 729218d4cb8c..766ca02910af 100644 > --- a/gdb/testsuite/gdb.reverse/step-reverse.exp > +++ b/gdb/testsuite/gdb.reverse/step-reverse.exp > @@ -261,7 +261,7 @@ if { "$step_out" == 1 } { > -re -wrap "NEXT OVER THIS RECURSION.*" { > set step_out 0 > } > - -re -wrap ".*" { > + -re -wrap "^\[0-9\].*" { > send_gdb "next\n" > exp_continue > } > @@ -286,7 +286,7 @@ gdb_test_multiple "next" "step over recursion inside the recursion" { > gdb_assert {"$seen_recursive_call" == 1} \ > "step over recursion inside the recursion" > } > - -re -wrap ".*" { > + -re -wrap "^\[0-9\].*" { > send_gdb "next\n" > exp_continue > }
On 01/06/2023 11:15, Tom de Vries wrote: > On 5/31/23 22:06, Thiago Jung Bauermann via Gdb-patches wrote: >> This testcase sometimes gets stuck in a loop for hours when running >> in our >> CI. The problem is that due to an issue unrelated to reverse >> debugging the >> inferior exits early, and because of the overly generic ".*" pattern the >> testcase keeps sending the "next" command without noticing that the >> inferior is gone. >> >> gdb_test_multiple has a pattern to detect that "The program is not being >> run.", but since it is placed after the patterns from the caller it >> won't >> be triggered. It also has a timeout pattern, but for some reason I don't >> understand it often doesn't trigger. >> >> Since the test binary is compiled with debug information, fix by >> changing >> the pattern to match the source code line number that is shown by GDB >> right >> after the "step" command. >> --- >> gdb/testsuite/gdb.reverse/step-reverse.exp | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/gdb/testsuite/gdb.reverse/step-reverse.exp >> b/gdb/testsuite/gdb.reverse/step-reverse.exp >> index 729218d4cb8c..766ca02910af 100644 >> --- a/gdb/testsuite/gdb.reverse/step-reverse.exp >> +++ b/gdb/testsuite/gdb.reverse/step-reverse.exp >> @@ -261,7 +261,7 @@ if { "$step_out" == 1 } { >> -re -wrap "NEXT OVER THIS RECURSION.*" { >> set step_out 0 >> } >> - -re -wrap ".*" { >> + -re -wrap "^\[0-9\].*" { >> send_gdb "next\n" >> exp_continue >> } >> @@ -286,7 +286,7 @@ gdb_test_multiple "next" "step over recursion >> inside the recursion" { >> gdb_assert {"$seen_recursive_call" == 1} \ >> "step over recursion inside the recursion" >> } >> - -re -wrap ".*" { >> + -re -wrap "^\[0-9\].*" { >> send_gdb "next\n" >> exp_continue >> } > > Hi, > > the fix as is addresses the immediate problem. > > Introducing an iteration bound as a safety net is probably a good idea > as well, given the problem you ran into. > > Also, I prefer a different style of fix. I think that this > gdb_test_multiple is overly complicated by dealing with the prompt in > two clauses. > > I came up with this as first version which removes the need for any .* > usage (though to be clear, -wrap "" and -wrap ".*" is the same thing): > ... > if { "$step_out" == 1 } { > gdb_test_multiple "next" "stepping out of recursion" { > -re "NEXT OVER THIS RECURSION" { > set step_out 0 > exp_continue > } > -re -wrap "" { > if { $step_out == 1 } { > send_gdb "next\n" > exp_continue > } > pass $gdb_test_name > } > } > } > ... > and this after adding the iteration bound: > ... > if { "$step_out" == 1 } { > set iterations 0 > set max_iterations 10 > gdb_test_multiple "next" "stepping out of recursion" { > -re "NEXT OVER THIS RECURSION" { > set step_out 0 > exp_continue > } > -re -wrap "" { > if { $step_out == 1 } { > incr iterations > if { $iterations == $max_iterations } { > fail $gdb_test_name > return > } > > send_gdb "next\n" > exp_continue > } > pass $gdb_test_name > } > } > } > ... > > [ The gdb_test_multiple can be even further simplified, by wrapping it > in a while that takes care of the iteration. ] > > This code doesn't actually trigger for me, so I added this: > ... > -gdb_test_multiple "next" "reverse next over recursion" { > +gdb_test_multiple "step" "reverse next over recursion" { > ... > > That gave me the default fail: > ... > (gdb) PASS: gdb.reverse/step-reverse.exp: reverse next over call > step^M > recursive_callee (val=17) at > /data/vries/gdb/src/gdb/testsuite/gdb.reverse/step-reverse.c:41^M > 41 } /* EXIT RECURSIVE FUNCTION */^M > (gdb) FAIL: gdb.reverse/step-reverse.exp: reverse next over recursion > ... > which did not set step_out to 1, so still the code didn't get > triggered,so I did: > ... > - -re -wrap ".*RECURSIVE CALL.*" { > + -re -wrap ".*(RECURSIVE CALL|EXIT RECURSIVE FUNCTION).*" { > ... > > ISTM that the test-case contains complicated and hard-to-trigger code, > which could be fixed by merging the code for the next and step scenario. > Some of the changes you introduced are not correct with the original idea of the test. As a refresher, the problem that is being tested is that GDB would not be able to reverse-next and skip recursive functions (details: PR 16678). It would skip the first call then stop, which in this test case would still be 4 calls deep into the recursion when we wanted to stop on main. The reason the test works in 2 clauses is because only the first part is actually testing the fix (that we can reverse-next without stopping at a recursive call site), the second case is just resetting the inferior's state into a known position in case the first test fails, so we dont get the whole test being useless if this feature ever regresses. That's why the variable I used is called "step_out", it knows if we have to manually step out of all recursive calls. You could simplify this case into a single test by only emitting a pass/fail when you leave the loop and asserting that step_out is 0, but I thought it would be even harder to figure out what the test was doing in the future. Another problem with adding a maximum iteration count (and the reason I didn't go for it originally) is because you have to have enough iterations to step out of 4 function calls, and I didn't want to hard code that number because changes to the function would not be immediately obvious while the use case continued to work. Another way to know when the inferior is at the expected state is to look for the message that we're back at the main function, so we avoid using ".*" in the match.
Hello Andrew, Andrew Burgess <aburgess@redhat.com> writes: > Thiago Jung Bauermann via Gdb-patches <gdb-patches@sourceware.org> > writes: > >> This testcase sometimes gets stuck in a loop for hours when running in our >> CI. The problem is that due to an issue unrelated to reverse debugging the >> inferior exits early, and because of the overly generic ".*" pattern the >> testcase keeps sending the "next" command without noticing that the >> inferior is gone. >> >> gdb_test_multiple has a pattern to detect that "The program is not being >> run.", but since it is placed after the patterns from the caller it won't >> be triggered. It also has a timeout pattern, but for some reason I don't >> understand it often doesn't trigger. > > The timeout is triggered between successful matches. So each time the > test matches the '-re -wrap ".*"' this counts as a successful match and > the timeout is reset. Thank you for the explanation, it was very helpful. If you don't mind, I will incorporate it in the patch description: > gdb_test_multiple has a pattern to detect that "The program is not being > run.", but since it is placed after the patterns from the caller it won't > be triggered. It also has a timeout pattern but because it is triggered > between successful matches, each time the test matches the '-re -wrap ".*"' > this counts as a successful match and the timeout is reset.
Hello Tom, Hello Bruno, Bruno Larsen <blarsen@redhat.com> writes: > On 01/06/2023 11:15, Tom de Vries wrote: >> On 5/31/23 22:06, Thiago Jung Bauermann via Gdb-patches wrote: >>> This testcase sometimes gets stuck in a loop for hours when running in our >>> CI. The problem is that due to an issue unrelated to reverse debugging the >>> inferior exits early, and because of the overly generic ".*" pattern the >>> testcase keeps sending the "next" command without noticing that the >>> inferior is gone. >>> >>> gdb_test_multiple has a pattern to detect that "The program is not being >>> run.", but since it is placed after the patterns from the caller it won't >>> be triggered. It also has a timeout pattern, but for some reason I don't >>> understand it often doesn't trigger. >>> >>> Since the test binary is compiled with debug information, fix by changing >>> the pattern to match the source code line number that is shown by GDB right >>> after the "step" command. >>> --- >>> gdb/testsuite/gdb.reverse/step-reverse.exp | 4 ++-- >>> 1 file changed, 2 insertions(+), 2 deletions(-) >>> >>> diff --git a/gdb/testsuite/gdb.reverse/step-reverse.exp >>> b/gdb/testsuite/gdb.reverse/step-reverse.exp >>> index 729218d4cb8c..766ca02910af 100644 >>> --- a/gdb/testsuite/gdb.reverse/step-reverse.exp >>> +++ b/gdb/testsuite/gdb.reverse/step-reverse.exp >>> @@ -261,7 +261,7 @@ if { "$step_out" == 1 } { >>> -re -wrap "NEXT OVER THIS RECURSION.*" { >>> set step_out 0 >>> } >>> - -re -wrap ".*" { >>> + -re -wrap "^\[0-9\].*" { >>> send_gdb "next\n" >>> exp_continue >>> } >>> @@ -286,7 +286,7 @@ gdb_test_multiple "next" "step over recursion inside the recursion" >>> { >>> gdb_assert {"$seen_recursive_call" == 1} \ >>> "step over recursion inside the recursion" >>> } >>> - -re -wrap ".*" { >>> + -re -wrap "^\[0-9\].*" { >>> send_gdb "next\n" >>> exp_continue >>> } >> >> Hi, >> >> the fix as is addresses the immediate problem. >> >> Introducing an iteration bound as a safety net is probably a good idea as well, given >> the problem you ran into. >> >> Also, I prefer a different style of fix. I think that this gdb_test_multiple is overly >> complicated by dealing with the prompt in two clauses. >> >> I came up with this as first version which removes the need for any .* usage (though to >> be clear, -wrap "" and -wrap ".*" is the same thing): >> ... >> if { "$step_out" == 1 } { >> gdb_test_multiple "next" "stepping out of recursion" { >> -re "NEXT OVER THIS RECURSION" { >> set step_out 0 >> exp_continue >> } >> -re -wrap "" { >> if { $step_out == 1 } { >> send_gdb "next\n" >> exp_continue >> } >> pass $gdb_test_name >> } >> } >> } >> ... >> and this after adding the iteration bound: >> ... >> if { "$step_out" == 1 } { >> set iterations 0 >> set max_iterations 10 >> gdb_test_multiple "next" "stepping out of recursion" { >> -re "NEXT OVER THIS RECURSION" { >> set step_out 0 >> exp_continue >> } >> -re -wrap "" { >> if { $step_out == 1 } { >> incr iterations >> if { $iterations == $max_iterations } { >> fail $gdb_test_name >> return >> } >> >> send_gdb "next\n" >> exp_continue >> } >> pass $gdb_test_name >> } >> } >> } >> ... >> >> [ The gdb_test_multiple can be even further simplified, by wrapping it in a while that >> takes care of the iteration. ] >> >> This code doesn't actually trigger for me, so I added this: >> ... >> -gdb_test_multiple "next" "reverse next over recursion" { >> +gdb_test_multiple "step" "reverse next over recursion" { >> ... >> >> That gave me the default fail: >> ... >> (gdb) PASS: gdb.reverse/step-reverse.exp: reverse next over call >> step^M >> recursive_callee (val=17) at >> /data/vries/gdb/src/gdb/testsuite/gdb.reverse/step-reverse.c:41^M >> 41 } /* EXIT RECURSIVE FUNCTION */^M >> (gdb) FAIL: gdb.reverse/step-reverse.exp: reverse next over recursion >> ... >> which did not set step_out to 1, so still the code didn't get triggered,so I did: >> ... >> - -re -wrap ".*RECURSIVE CALL.*" { >> + -re -wrap ".*(RECURSIVE CALL|EXIT RECURSIVE FUNCTION).*" { >> ... >> >> ISTM that the test-case contains complicated and hard-to-trigger code, which could be >> fixed by merging the code for the next and step scenario. >> > Some of the changes you introduced are not correct with the original idea of the test. As > a refresher, the problem that is being tested is that GDB would not be able to > reverse-next and skip recursive functions (details: PR 16678). It would skip the first > call then stop, which in this test case would still be 4 calls deep into the recursion > when we wanted to stop on main. > > The reason the test works in 2 clauses is because only the first part is actually testing > the fix (that we can reverse-next without stopping at a recursive call site), the second > case is just resetting the inferior's state into a known position in case the first test > fails, so we dont get the whole test being useless if this feature ever regresses. That's > why the variable I used is called "step_out", it knows if we have to manually step out of > all recursive calls. You could simplify this case into a single test by only emitting a > pass/fail when you leave the loop and asserting that step_out is 0, but I thought it would > be even harder to figure out what the test was doing in the future. > > Another problem with adding a maximum iteration count (and the reason I didn't go for it > originally) is because you have to have enough iterations to step out of 4 function calls, > and I didn't want to hard code that number because changes to the function would not be > immediately obvious while the use case continued to work. Another way to know when the > inferior is at the expected state is to look for the message that we're back at the main > function, so we avoid using ".*" in the match. Thank you both for the quick review and feedback. Regarding adding a maximum iteration count, I agree with Tom that some ceiling is useful to avoid the possibility of the testcase running for hours, as has been happening in my case. Would it be ok if the maximum number of iterations was very big, such as 100 or 1000? That would leave room for future changes to the testcase while providing a bounded run time.
On 01/06/2023 15:27, Thiago Jung Bauermann wrote: > Hello Tom, Hello Bruno, > > Bruno Larsen <blarsen@redhat.com> writes: > >> On 01/06/2023 11:15, Tom de Vries wrote: >>> On 5/31/23 22:06, Thiago Jung Bauermann via Gdb-patches wrote: >>>> This testcase sometimes gets stuck in a loop for hours when running in our >>>> CI. The problem is that due to an issue unrelated to reverse debugging the >>>> inferior exits early, and because of the overly generic ".*" pattern the >>>> testcase keeps sending the "next" command without noticing that the >>>> inferior is gone. >>>> >>>> gdb_test_multiple has a pattern to detect that "The program is not being >>>> run.", but since it is placed after the patterns from the caller it won't >>>> be triggered. It also has a timeout pattern, but for some reason I don't >>>> understand it often doesn't trigger. >>>> >>>> Since the test binary is compiled with debug information, fix by changing >>>> the pattern to match the source code line number that is shown by GDB right >>>> after the "step" command. >>>> --- >>>> gdb/testsuite/gdb.reverse/step-reverse.exp | 4 ++-- >>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/gdb/testsuite/gdb.reverse/step-reverse.exp >>>> b/gdb/testsuite/gdb.reverse/step-reverse.exp >>>> index 729218d4cb8c..766ca02910af 100644 >>>> --- a/gdb/testsuite/gdb.reverse/step-reverse.exp >>>> +++ b/gdb/testsuite/gdb.reverse/step-reverse.exp >>>> @@ -261,7 +261,7 @@ if { "$step_out" == 1 } { >>>> -re -wrap "NEXT OVER THIS RECURSION.*" { >>>> set step_out 0 >>>> } >>>> - -re -wrap ".*" { >>>> + -re -wrap "^\[0-9\].*" { >>>> send_gdb "next\n" >>>> exp_continue >>>> } >>>> @@ -286,7 +286,7 @@ gdb_test_multiple "next" "step over recursion inside the recursion" >>>> { >>>> gdb_assert {"$seen_recursive_call" == 1} \ >>>> "step over recursion inside the recursion" >>>> } >>>> - -re -wrap ".*" { >>>> + -re -wrap "^\[0-9\].*" { >>>> send_gdb "next\n" >>>> exp_continue >>>> } >>> Hi, >>> >>> the fix as is addresses the immediate problem. >>> >>> Introducing an iteration bound as a safety net is probably a good idea as well, given >>> the problem you ran into. >>> >>> Also, I prefer a different style of fix. I think that this gdb_test_multiple is overly >>> complicated by dealing with the prompt in two clauses. >>> >>> I came up with this as first version which removes the need for any .* usage (though to >>> be clear, -wrap "" and -wrap ".*" is the same thing): >>> ... >>> if { "$step_out" == 1 } { >>> gdb_test_multiple "next" "stepping out of recursion" { >>> -re "NEXT OVER THIS RECURSION" { >>> set step_out 0 >>> exp_continue >>> } >>> -re -wrap "" { >>> if { $step_out == 1 } { >>> send_gdb "next\n" >>> exp_continue >>> } >>> pass $gdb_test_name >>> } >>> } >>> } >>> ... >>> and this after adding the iteration bound: >>> ... >>> if { "$step_out" == 1 } { >>> set iterations 0 >>> set max_iterations 10 >>> gdb_test_multiple "next" "stepping out of recursion" { >>> -re "NEXT OVER THIS RECURSION" { >>> set step_out 0 >>> exp_continue >>> } >>> -re -wrap "" { >>> if { $step_out == 1 } { >>> incr iterations >>> if { $iterations == $max_iterations } { >>> fail $gdb_test_name >>> return >>> } >>> >>> send_gdb "next\n" >>> exp_continue >>> } >>> pass $gdb_test_name >>> } >>> } >>> } >>> ... >>> >>> [ The gdb_test_multiple can be even further simplified, by wrapping it in a while that >>> takes care of the iteration. ] >>> >>> This code doesn't actually trigger for me, so I added this: >>> ... >>> -gdb_test_multiple "next" "reverse next over recursion" { >>> +gdb_test_multiple "step" "reverse next over recursion" { >>> ... >>> >>> That gave me the default fail: >>> ... >>> (gdb) PASS: gdb.reverse/step-reverse.exp: reverse next over call >>> step^M >>> recursive_callee (val=17) at >>> /data/vries/gdb/src/gdb/testsuite/gdb.reverse/step-reverse.c:41^M >>> 41 } /* EXIT RECURSIVE FUNCTION */^M >>> (gdb) FAIL: gdb.reverse/step-reverse.exp: reverse next over recursion >>> ... >>> which did not set step_out to 1, so still the code didn't get triggered,so I did: >>> ... >>> - -re -wrap ".*RECURSIVE CALL.*" { >>> + -re -wrap ".*(RECURSIVE CALL|EXIT RECURSIVE FUNCTION).*" { >>> ... >>> >>> ISTM that the test-case contains complicated and hard-to-trigger code, which could be >>> fixed by merging the code for the next and step scenario. >>> >> Some of the changes you introduced are not correct with the original idea of the test. As >> a refresher, the problem that is being tested is that GDB would not be able to >> reverse-next and skip recursive functions (details: PR 16678). It would skip the first >> call then stop, which in this test case would still be 4 calls deep into the recursion >> when we wanted to stop on main. >> >> The reason the test works in 2 clauses is because only the first part is actually testing >> the fix (that we can reverse-next without stopping at a recursive call site), the second >> case is just resetting the inferior's state into a known position in case the first test >> fails, so we dont get the whole test being useless if this feature ever regresses. That's >> why the variable I used is called "step_out", it knows if we have to manually step out of >> all recursive calls. You could simplify this case into a single test by only emitting a >> pass/fail when you leave the loop and asserting that step_out is 0, but I thought it would >> be even harder to figure out what the test was doing in the future. >> >> Another problem with adding a maximum iteration count (and the reason I didn't go for it >> originally) is because you have to have enough iterations to step out of 4 function calls, >> and I didn't want to hard code that number because changes to the function would not be >> immediately obvious while the use case continued to work. Another way to know when the >> inferior is at the expected state is to look for the message that we're back at the main >> function, so we avoid using ".*" in the match. > Thank you both for the quick review and feedback. Regarding adding a > maximum iteration count, I agree with Tom that some ceiling is useful to > avoid the possibility of the testcase running for hours, as has been > happening in my case. Would it be ok if the maximum number of iterations > was very big, such as 100 or 1000? That would leave room for future > changes to the testcase while providing a bounded run time. > I'm not against this idea in itself, but I think having the regex contain "main" so that it picks up the following output: (gdb) reverse-next main () at /home/blarsen/Documents/fsf_build/gdb/testsuite/../../../binutils-gdb/gdb/testsuite/gdb.reverse/step-reverse.c:69 69 recursive_callee (32); /* NEXT OVER THIS RECURSION */ which is when we left recursive_callee. This would make it so gdb_test_multiple picks up the fact that the inferior finished running.
Bruno Larsen <blarsen@redhat.com> writes: > On 01/06/2023 15:27, Thiago Jung Bauermann wrote: >> Hello Tom, Hello Bruno, >> >> Bruno Larsen <blarsen@redhat.com> writes: >> >>> On 01/06/2023 11:15, Tom de Vries wrote: >>>> On 5/31/23 22:06, Thiago Jung Bauermann via Gdb-patches wrote: >>>>> This testcase sometimes gets stuck in a loop for hours when running in our >>>>> CI. The problem is that due to an issue unrelated to reverse debugging the >>>>> inferior exits early, and because of the overly generic ".*" pattern the >>>>> testcase keeps sending the "next" command without noticing that the >>>>> inferior is gone. >>>>> >>>>> gdb_test_multiple has a pattern to detect that "The program is not being >>>>> run.", but since it is placed after the patterns from the caller it won't >>>>> be triggered. It also has a timeout pattern, but for some reason I don't >>>>> understand it often doesn't trigger. >>>>> >>>>> Since the test binary is compiled with debug information, fix by changing >>>>> the pattern to match the source code line number that is shown by GDB right >>>>> after the "step" command. >>>>> --- >>>>> gdb/testsuite/gdb.reverse/step-reverse.exp | 4 ++-- >>>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/gdb/testsuite/gdb.reverse/step-reverse.exp >>>>> b/gdb/testsuite/gdb.reverse/step-reverse.exp >>>>> index 729218d4cb8c..766ca02910af 100644 >>>>> --- a/gdb/testsuite/gdb.reverse/step-reverse.exp >>>>> +++ b/gdb/testsuite/gdb.reverse/step-reverse.exp >>>>> @@ -261,7 +261,7 @@ if { "$step_out" == 1 } { >>>>> -re -wrap "NEXT OVER THIS RECURSION.*" { >>>>> set step_out 0 >>>>> } >>>>> - -re -wrap ".*" { >>>>> + -re -wrap "^\[0-9\].*" { >>>>> send_gdb "next\n" >>>>> exp_continue >>>>> } >>>>> @@ -286,7 +286,7 @@ gdb_test_multiple "next" "step over recursion inside the recursion" >>>>> { >>>>> gdb_assert {"$seen_recursive_call" == 1} \ >>>>> "step over recursion inside the recursion" >>>>> } >>>>> - -re -wrap ".*" { >>>>> + -re -wrap "^\[0-9\].*" { >>>>> send_gdb "next\n" >>>>> exp_continue >>>>> } >>>> Hi, >>>> >>>> the fix as is addresses the immediate problem. >>>> >>>> Introducing an iteration bound as a safety net is probably a good idea as well, given >>>> the problem you ran into. >>>> >>>> Also, I prefer a different style of fix. I think that this gdb_test_multiple is overly >>>> complicated by dealing with the prompt in two clauses. >>>> >>>> I came up with this as first version which removes the need for any .* usage (though to >>>> be clear, -wrap "" and -wrap ".*" is the same thing): >>>> ... >>>> if { "$step_out" == 1 } { >>>> gdb_test_multiple "next" "stepping out of recursion" { >>>> -re "NEXT OVER THIS RECURSION" { >>>> set step_out 0 >>>> exp_continue >>>> } >>>> -re -wrap "" { >>>> if { $step_out == 1 } { >>>> send_gdb "next\n" >>>> exp_continue >>>> } >>>> pass $gdb_test_name >>>> } >>>> } >>>> } >>>> ... >>>> and this after adding the iteration bound: >>>> ... >>>> if { "$step_out" == 1 } { >>>> set iterations 0 >>>> set max_iterations 10 >>>> gdb_test_multiple "next" "stepping out of recursion" { >>>> -re "NEXT OVER THIS RECURSION" { >>>> set step_out 0 >>>> exp_continue >>>> } >>>> -re -wrap "" { >>>> if { $step_out == 1 } { >>>> incr iterations >>>> if { $iterations == $max_iterations } { >>>> fail $gdb_test_name >>>> return >>>> } >>>> >>>> send_gdb "next\n" >>>> exp_continue >>>> } >>>> pass $gdb_test_name >>>> } >>>> } >>>> } >>>> ... >>>> >>>> [ The gdb_test_multiple can be even further simplified, by wrapping it in a while that >>>> takes care of the iteration. ] >>>> >>>> This code doesn't actually trigger for me, so I added this: >>>> ... >>>> -gdb_test_multiple "next" "reverse next over recursion" { >>>> +gdb_test_multiple "step" "reverse next over recursion" { >>>> ... >>>> >>>> That gave me the default fail: >>>> ... >>>> (gdb) PASS: gdb.reverse/step-reverse.exp: reverse next over call >>>> step^M >>>> recursive_callee (val=17) at >>>> /data/vries/gdb/src/gdb/testsuite/gdb.reverse/step-reverse.c:41^M >>>> 41 } /* EXIT RECURSIVE FUNCTION */^M >>>> (gdb) FAIL: gdb.reverse/step-reverse.exp: reverse next over recursion >>>> ... >>>> which did not set step_out to 1, so still the code didn't get triggered,so I did: >>>> ... >>>> - -re -wrap ".*RECURSIVE CALL.*" { >>>> + -re -wrap ".*(RECURSIVE CALL|EXIT RECURSIVE FUNCTION).*" { >>>> ... >>>> >>>> ISTM that the test-case contains complicated and hard-to-trigger code, which could be >>>> fixed by merging the code for the next and step scenario. >>>> >>> Some of the changes you introduced are not correct with the original idea of the test. As >>> a refresher, the problem that is being tested is that GDB would not be able to >>> reverse-next and skip recursive functions (details: PR 16678). It would skip the first >>> call then stop, which in this test case would still be 4 calls deep into the recursion >>> when we wanted to stop on main. >>> >>> The reason the test works in 2 clauses is because only the first part is actually testing >>> the fix (that we can reverse-next without stopping at a recursive call site), the second >>> case is just resetting the inferior's state into a known position in case the first test >>> fails, so we dont get the whole test being useless if this feature ever regresses. That's >>> why the variable I used is called "step_out", it knows if we have to manually step out of >>> all recursive calls. You could simplify this case into a single test by only emitting a >>> pass/fail when you leave the loop and asserting that step_out is 0, but I thought it would >>> be even harder to figure out what the test was doing in the future. >>> >>> Another problem with adding a maximum iteration count (and the reason I didn't go for it >>> originally) is because you have to have enough iterations to step out of 4 function calls, >>> and I didn't want to hard code that number because changes to the function would not be >>> immediately obvious while the use case continued to work. Another way to know when the >>> inferior is at the expected state is to look for the message that we're back at the main >>> function, so we avoid using ".*" in the match. >> Thank you both for the quick review and feedback. Regarding adding a >> maximum iteration count, I agree with Tom that some ceiling is useful to >> avoid the possibility of the testcase running for hours, as has been >> happening in my case. Would it be ok if the maximum number of iterations >> was very big, such as 100 or 1000? That would leave room for future >> changes to the testcase while providing a bounded run time. >> > I'm not against this idea in itself, but I think having the regex contain "main" so that > it picks up the following output: > > (gdb) reverse-next > main () > at > /home/blarsen/Documents/fsf_build/gdb/testsuite/../../../binutils-gdb/gdb/testsuite/gdb.reverse/step-reverse.c:69 > 69 recursive_callee (32); /* NEXT OVER THIS RECURSION */ > > which is when we left recursive_callee. This would make it so gdb_test_multiple picks up > the fact that the inferior finished running. I just sent a v2, with one of the patterns changed to match the main function for the case above. Please let me know if that's what you had in mind. It also adds a maximum iteration count of 1000 (thank you Tom, for providing the code). With the new regexes, I don't hit it because gdb_test_multiple matches the message "The program is not being run." but if I change to the old regex again then I get, as expected: FAIL: gdb.reverse/step-reverse.exp: step over recursion inside the recursion (reached 1000 iterations)
diff --git a/gdb/testsuite/gdb.reverse/step-reverse.exp b/gdb/testsuite/gdb.reverse/step-reverse.exp index 729218d4cb8c..766ca02910af 100644 --- a/gdb/testsuite/gdb.reverse/step-reverse.exp +++ b/gdb/testsuite/gdb.reverse/step-reverse.exp @@ -261,7 +261,7 @@ if { "$step_out" == 1 } { -re -wrap "NEXT OVER THIS RECURSION.*" { set step_out 0 } - -re -wrap ".*" { + -re -wrap "^\[0-9\].*" { send_gdb "next\n" exp_continue } @@ -286,7 +286,7 @@ gdb_test_multiple "next" "step over recursion inside the recursion" { gdb_assert {"$seen_recursive_call" == 1} \ "step over recursion inside the recursion" } - -re -wrap ".*" { + -re -wrap "^\[0-9\].*" { send_gdb "next\n" exp_continue }