[v3,1/2] gdb/record: print frame information when exiting a recursive call

Message ID 20230927101950.1913970-4-blarsen@redhat.com
State New
Headers
Series Improving frame printing with recursive |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm fail Testing failed

Commit Message

Guinevere Larsen Sept. 27, 2023, 10:19 a.m. UTC
  Currently,  when GDB is reverse stepping out of a function into the same
function due to a recursive call, it doesn't print frame information, as
reported by PR record/29178. This happens because when the inferior
leaves the current frame, GDB decides to refresh the step information,
clobbering the original step_frame_id, making it impossible to figure
out later on that the frame has been changed.

This commit changes GDB so that, if we notice we're in this exact
situation, we won't refresh the step information.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29178
---
 gdb/infrun.c                            | 18 ++++++++++
 gdb/testsuite/gdb.reverse/recursion.c   | 44 ++++++++++++++++++++++++
 gdb/testsuite/gdb.reverse/recursion.exp | 45 +++++++++++++++++++++++++
 3 files changed, 107 insertions(+)
 create mode 100644 gdb/testsuite/gdb.reverse/recursion.c
 create mode 100644 gdb/testsuite/gdb.reverse/recursion.exp
  

Comments

Kevin Buettner Sept. 27, 2023, 9:26 p.m. UTC | #1
Hi Gwen,

On Wed, 27 Sep 2023 12:19:51 +0200
Guinevere Larsen <blarsen@redhat.com> wrote:

>  gdb/infrun.c                            | 18 ++++++++++
>  gdb/testsuite/gdb.reverse/recursion.c   | 44 ++++++++++++++++++++++++
>  gdb/testsuite/gdb.reverse/recursion.exp | 45 +++++++++++++++++++++++++
>  3 files changed, 107 insertions(+)
>  create mode 100644 gdb/testsuite/gdb.reverse/recursion.c
>  create mode 100644 gdb/testsuite/gdb.reverse/recursion.exp

I've applied your patch and have tested on Fedora 38, for both
x86_64 and aarch64 (both native).  I see no regressions on aarch64,
but on x86_64, I'm seeing this failure when your patch is applied:

FAIL: gdb.dwarf2/dw2-out-of-range-end-of-seq.exp: END with address 1 eliminated

After studying the log files, this makes absolutely no sense to me
since your patch should not have any affect on the "maint info line-table"
command.  Here are the relevant sections of the log files:

- - - passing run - - -
(gdb) break -qualified main
Breakpoint 1 at 0x40110a: file /ironwood1/sourceware-git/f38-review/bld/../../worktree-review/gdb/testsuite/gdb.dwarf2/main.c, line 1.
(gdb) run 
Starting program: /mesquite2/sourceware-git/f38-review/bld/gdb/testsuite/outputs/gdb.dwarf2/dw2-out-of-range-end-of-seq/dw2-out-of-range-end-of-seq 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, 0x000000000040110a in main () at /ironwood1/sourceware-git/f38-review/bld/../../worktree-review/gdb/testsuite/gdb.dwarf2/main.c:1
1	/* This testcase is part of GDB, the GNU debugger.
(gdb) maint info line-table main.c$
objfile: /mesquite2/sourceware-git/f38-review/bld/gdb/testsuite/outputs/gdb.dwarf2/dw2-out-of-range-end-of-seq/dw2-out-of-range-end-of-seq ((struct objfile *) 0x18e5670)
compunit_symtab: main.c ((struct compunit_symtab *) 0x18e5b90)
symtab: /mesquite2/sourceware-git/worktree-review/gdb/testsuite/gdb.dwarf2/main.c ((struct symtab *) 0x18e5c10)
linetable: ((struct linetable *) 0x18e5e30):
INDEX  LINE   REL-ADDRESS        UNREL-ADDRESS      IS-STMT PROLOGUE-END 
0      1      0x0000000000401106 0x0000000000401106 Y                    
1      END    0x0000000000401111 0x0000000000401111 Y                    

(gdb) PASS: gdb.dwarf2/dw2-out-of-range-end-of-seq.exp: END with address 1 eliminated
testcase /ironwood1/sourceware-git/f38-review/bld/../../worktree-review/gdb/testsuite/gdb.dwarf2/dw2-out-of-range-end-of-seq.exp completed in 1 seconds
- - - end passing run - - -

- - - failing run - - -
(gdb) break -qualified main
Breakpoint 1 at 0x40110a: file /ironwood1/sourceware-git/f38-review/bld/../../worktree-review/gdb/testsuite/gdb.dwarf2/main.c, line 1.
(gdb) run 
Starting program: /mesquite2/sourceware-git/f38-review/bld/gdb/testsuite/outputs/gdb.dwarf2/dw2-out-of-range-end-of-seq/dw2-out-of-range-end-of-seq 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, 0x000000000040110a in main () at /ironwood1/sourceware-git/f38-review/bld/../../worktree-review/gdb/testsuite/gdb.dwarf2/main.c:1
1	/* This testcase is part of GDB, the GNU debugger.
(gdb) maint info line-table main.c$
objfile: /mesquite2/sourceware-git/f38-review/bld/gdb/testsuite/outputs/gdb.dwarf2/dw2-out-of-range-end-of-seq/dw2-out-of-range-end-of-seq ((struct objfile *) 0x24a85b0)
compunit_symtab: main.c ((struct compunit_symtab *) 0x24a8ad0)
symtab: /mesquite2/sourceware-git/worktree-review/gdb/testsuite/gdb.dwarf2/main.c ((struct symtab *) 0x24a8b50)
linetable: ((struct linetable *) 0x24a8d70):
INDEX  LINE   REL-ADDRESS        UNREL-ADDRESS      IS-STMT PROLOGUE-END 
0      1      0x0000000000401106 0x0000000000401106 Y                    
1      END    0x0000000000401111 0x0000000000401111 Y                    

objfile: /usr/lib/debug/usr/lib64/libc.so.6-2.37-4.fc38.x86_64.debug ((struct objfile *) 0x26435b0)
compunit_symtab: <unknown> ((struct compunit_symtab *) 0x2eb2260)
symtab: /usr/src/debug/glibc-2.37-4.fc38.x86_64/intl/textdomain.c ((struct symtab *) 0x2f60430)
linetable: ((struct linetable *) 0x0):
No line table.

objfile: /usr/lib/debug/usr/lib64/libc.so.6-2.37-4.fc38.x86_64.debug ((struct objfile *) 0x26435b0)
compunit_symtab: <unknown> ((struct compunit_symtab *) 0x2eb2260)
symtab: /usr/src/debug/glibc-2.37-4.fc38.x86_64/support/support_test_main.c ((struct symtab *) 0x2f725d0)
linetable: ((struct linetable *) 0x0):
No line table.

(gdb) FAIL: gdb.dwarf2/dw2-out-of-range-end-of-seq.exp: END with address 1 eliminated
testcase /ironwood1/sourceware-git/f38-review/bld/../../worktree-review/gdb/testsuite/gdb.dwarf2/dw2-out-of-range-end-of-seq.exp completed in 1 seconds
- - - end failing run - - -

The difference here is that the failing run is (also) attempting to
list the line table for libc (and twice for some reason).  As noted
earlier, I don't see how your change can affect this area of the code,
but, multiple times, I've done a "git reset --hard" to get rid of your
commit, followed by a rebuild and test, each time seeing the test pass. 
After that, I've applied your patch, rebuilt, and tested again, each
time seeing this test fail.

Are you able to reproduce this problem?

(I'll continue to study it here - there may be something weird going
on with the VM upon which I'm testing...)

Kevin
  
Kevin Buettner Sept. 27, 2023, 11:22 p.m. UTC | #2
On Wed, 27 Sep 2023 14:26:19 -0700
Kevin Buettner wrote:

> (I'll continue to study it here - there may be something weird going
> on with the VM upon which I'm testing...)

Using a number of Fedora releases (and VMs), I tested
gdb.dwarf2/dw2-out-of-range-end-of-seq.exp with GDB builds with Gwen's
patch applied...

F35: pass
F36: pass
F37: pass
F38: fail
F38 (updated): fail
F38 (updated, no -O2 in GDB part of build): fail
F38 (circa May, 2023): fail
F38 (aarch64): pass
F39: pass
F40 (aka rawhide): pass

I used two different x86_64 F38 VMs, but they're related.  The "circa
May" F38 VM was cloned from the main one that I use for much of my
work, but I cloned it to look at a specific problem and
(intentionally) had not updated it since cloning.

I've also removed the debuginfod cache. Even with it deleted, I'm
still seeing the same failure on the x86_64 Fedora 38 VMs.

For all machines which show a pass for dw2-out-of-range-end-of-seq.exp,
I ran the new test using TESTS=gdb.reverse/recursion.exp to make sure
that I was really testing against a build with Gwen's patch applied. 
In each case, I saw 8 PASSes, so I was indeed testing what I had
thought I was.

I'll now do some actual debugging, comparing what happens with and
without Gwen's patch...

Kevin
  
Kevin Buettner Sept. 28, 2023, 1:57 a.m. UTC | #3
On Wed, 27 Sep 2023 16:22:31 -0700
Kevin Buettner <kevinb@redhat.com> wrote:

> I'll now do some actual debugging, comparing what happens with and
> without Gwen's patch...

I think I understand what's going on now.  This part of the
patch...

    diff --git a/gdb/infrun.c b/gdb/infrun.c
    index 4730d290442..3491327422d 100644
    --- a/gdb/infrun.c
    +++ b/gdb/infrun.c
    @@ -6923,6 +6923,11 @@ process_event_stop_test (struct execution_control_state *ecs)
       frame = get_current_frame ();
       gdbarch = get_frame_arch (frame);
     
    +  /* Shorthand to make if statements smaller.  */
    +  struct frame_id original_frame_id
    +    = ecs->event_thread->control.step_frame_id;
    +  struct frame_id curr_frame_id = get_frame_id (get_current_frame ());
    +
       switch (what.main_action)
	 {
	 case BPSTAT_WHAT_SET_LONGJMP_RESUME:

...is causing get_frame_id() to be called in situations where it might
not have been called before.  For the test case in question, it's
causing glibc's symbols to be loaded - they weren't before.  If the
new code is either removed or moved to a location later in the
function, then glibc's symbols aren't loaded.

Is this a problem?

For most (almost all?) debugging scenarios, this will merely cause
some symbols to be loaded somewhat earlier in the debugging session
than they would have been otherwise.  Therefore, I don't think it's a
problem.  However, if someone has a different point of view, I'd like
to hear it...

How should we fix the regresssion?

The problematic command in gdb.dwarf2/dw2-out-of-range-end-of-seq.exp
is:

    maint info line-table main.c$

The 'main.c$' part of it is actually a regular expression.  Due
to the fact that glibc has been loaded earlier than it had been
in the past, this RE is matching main.c, which is what we want,
but also textdomain.c and support_test_main.c, which we don't.
These latter two files are from glibc.  The relevant lines from
the log file are as follows:

    (gdb) maint info line-table main.c$
    objfile: /mesquite2/sourceware-git/f38-review/bld/gdb/testsuite/outputs/gdb.dwarf2/dw2-out-of-range-end-of-seq/dw2-out-of-range-end-of-seq ((struct objfile *) 0x2273d50)
    compunit_symtab: main.c ((struct compunit_symtab *) 0x2267b80)
    symtab: /mesquite2/sourceware-git/worktree-review/gdb/testsuite/gdb.dwarf2/main.c ((struct symtab *) 0x2267c00)
    linetable: ((struct linetable *) 0x2267e20):
    INDEX  LINE   REL-ADDRESS        UNREL-ADDRESS      IS-STMT PROLOGUE-END 
    0      1      0x0000000000401106 0x0000000000401106 Y                    
    1      END    0x0000000000401111 0x0000000000401111 Y                    

    objfile: /usr/lib/debug/usr/lib64/libc.so.6-2.37-5.fc38.x86_64.debug ((struct objfile *) 0x23eecc0)
    compunit_symtab: <unknown> ((struct compunit_symtab *) 0x2c47a50)
    symtab: /usr/src/debug/glibc-2.37-5.fc38.x86_64/intl/textdomain.c ((struct symtab *) 0x2c71530)
    linetable: ((struct linetable *) 0x0):
    No line table.

    objfile: /usr/lib/debug/usr/lib64/libc.so.6-2.37-5.fc38.x86_64.debug ((struct objfile *) 0x23eecc0)
    compunit_symtab: <unknown> ((struct compunit_symtab *) 0x2c47a50)
    symtab: /usr/src/debug/glibc-2.37-5.fc38.x86_64/support/support_test_main.c ((struct symtab *) 0x2c87410)
    linetable: ((struct linetable *) 0x0):
    No line table.

If the command is changed to "maint info line-table \bmain.c$", the
word boundary anchor \b will cause only main.c to be matched.

(Ideally, the .  would also be escaped with '\' so that it would only
match a literal .  (period) and not any character.)

Kevin
  
Guinevere Larsen Oct. 2, 2023, 1:11 p.m. UTC | #4
On 28/09/2023 03:57, Kevin Buettner wrote:
> On Wed, 27 Sep 2023 16:22:31 -0700
> Kevin Buettner <kevinb@redhat.com> wrote:
>
>> I'll now do some actual debugging, comparing what happens with and
>> without Gwen's patch...
> I think I understand what's going on now.  This part of the
> patch...
>
>      diff --git a/gdb/infrun.c b/gdb/infrun.c
>      index 4730d290442..3491327422d 100644
>      --- a/gdb/infrun.c
>      +++ b/gdb/infrun.c
>      @@ -6923,6 +6923,11 @@ process_event_stop_test (struct execution_control_state *ecs)
>         frame = get_current_frame ();
>         gdbarch = get_frame_arch (frame);
>       
>      +  /* Shorthand to make if statements smaller.  */
>      +  struct frame_id original_frame_id
>      +    = ecs->event_thread->control.step_frame_id;
>      +  struct frame_id curr_frame_id = get_frame_id (get_current_frame ());
>      +
>         switch (what.main_action)
> 	 {
> 	 case BPSTAT_WHAT_SET_LONGJMP_RESUME:
>
> ...is causing get_frame_id() to be called in situations where it might
> not have been called before.  For the test case in question, it's
> causing glibc's symbols to be loaded - they weren't before.  If the
> new code is either removed or moved to a location later in the
> function, then glibc's symbols aren't loaded.

Huh! I moved this code further up from v2, but I didn't expect this 
change to make any difference at all. Thanks for spotting it!

This is only that far up because then I could simplify more stuff on 
patch 2.

>
> Is this a problem?
>
> For most (almost all?) debugging scenarios, this will merely cause
> some symbols to be loaded somewhat earlier in the debugging session
> than they would have been otherwise.  Therefore, I don't think it's a
> problem.  However, if someone has a different point of view, I'd like
> to hear it...

I'll give it a week or so before sending a new version with your 
suggestions.

If the ultimate decision is to leave things as they are, it is trivial 
to revert my patch to where the test case doesn't have that regression.
  

Patch

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 4730d290442..3491327422d 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -6923,6 +6923,11 @@  process_event_stop_test (struct execution_control_state *ecs)
   frame = get_current_frame ();
   gdbarch = get_frame_arch (frame);
 
+  /* Shorthand to make if statements smaller.  */
+  struct frame_id original_frame_id
+    = ecs->event_thread->control.step_frame_id;
+  struct frame_id curr_frame_id = get_frame_id (get_current_frame ());
+
   switch (what.main_action)
     {
     case BPSTAT_WHAT_SET_LONGJMP_RESUME:
@@ -7722,6 +7727,19 @@  process_event_stop_test (struct execution_control_state *ecs)
 			       "it's not the start of a statement");
 	}
     }
+  else if (execution_direction == EXEC_REVERSE
+	  && curr_frame_id != original_frame_id
+	  && original_frame_id.code_addr_p && curr_frame_id.code_addr_p
+	  && original_frame_id.code_addr == curr_frame_id.code_addr)
+    {
+      /* If we enter here, we're leaving a recursive function call.  In this
+	 situation, we shouldn't refresh the step information, because if we
+	 do, we'll lose the frame_id of when we started stepping, and this
+	 will make GDB not know we need to print frame information.  */
+      refresh_step_info = false;
+      infrun_debug_printf ("reverse stepping, left a recursive call, don't "
+			   "update step info so we remember we left a frame");
+    }
 
   /* We aren't done stepping.
 
diff --git a/gdb/testsuite/gdb.reverse/recursion.c b/gdb/testsuite/gdb.reverse/recursion.c
new file mode 100644
index 00000000000..5ce1c8dbea0
--- /dev/null
+++ b/gdb/testsuite/gdb.reverse/recursion.c
@@ -0,0 +1,44 @@ 
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2023 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+/* Test GDB's ability to handle recursive functions when executing
+   in reverse.  */
+
+/* The recursive foo call must be the first line of the recursive
+   function, to test that we're not stepping too much and skipping
+   multiple calls when we should skip only one.  */
+int
+foo (int x)
+{
+  if (x) return foo (x-1);
+  return 0;
+}
+
+int
+bar (int x)
+{
+  int r = foo (x);
+  return 2*r;
+}
+
+int
+main ()
+{
+  int i = bar (5);
+  int j = foo (5);
+  return 0;			/* END OF MAIN */
+}
diff --git a/gdb/testsuite/gdb.reverse/recursion.exp b/gdb/testsuite/gdb.reverse/recursion.exp
new file mode 100644
index 00000000000..3fead0e8c8d
--- /dev/null
+++ b/gdb/testsuite/gdb.reverse/recursion.exp
@@ -0,0 +1,45 @@ 
+# Copyright 2008-2023 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+# This file is part of the GDB testsuite.  It tests reverse stepping
+# out of recursive functions.
+
+require supports_reverse
+
+standard_testfile
+
+if { [prepare_for_testing "failed to prepare" $testfile $srcfile] } {
+    return -1
+}
+
+runto_main
+
+if [supports_process_record] {
+    # Activate process record/replay
+    gdb_test_no_output "record" "turn on process record"
+}
+
+set end_of_program [gdb_get_line_number "END OF MAIN" "$srcfile"]
+gdb_breakpoint $end_of_program
+gdb_continue_to_breakpoint ".*$srcfile/$end_of_program.*"
+
+## test if GDB can reverse over a recursive program
+gdb_test "reverse-next" ".*int j = foo.*" "Skipping recursion from outside"
+## setup and next over a recursion for inside a recursive call
+repeat_cmd_until "reverse-step" ".*" ".*foo .x=4.*"
+gdb_test "reverse-next" ".*return foo.*" "Skipping recursion from inside"
+gdb_test "reverse-next" ".*foo .x=5.*" "print frame when stepping out"
+gdb_test "reverse-next" ".*bar .x=5.*" "stepping into a different function"
+gdb_test "reverse-next" "main .. at .*" "stepping back to main"