[PATCHv2] gdb: handle DW_AT_entry_pc pointing at an empty sub-range

Message ID 79756797913a38e36ffd4e244c6bd0553613eb4a.1732815309.git.aburgess@redhat.com
State New
Headers
Series [PATCHv2] gdb: handle DW_AT_entry_pc pointing at an empty sub-range |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm success Test passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Test passed

Commit Message

Andrew Burgess Nov. 28, 2024, 5:44 p.m. UTC
  Bernd,

Here's a v2 with an extended DWARF assembler test that now includes a
line table in some cases.  Specifically, when I test with label foo_6,
the address which is the end of a sub-range, which is also the end of
the entire block range, then I add a line table which indicates that
this address is the transition back from an inline function to the
outer function.

I've tested this with your series merge on top of this patch, and,
with your modified block lookup logic in place, and with the 'addr <
block->end ()' changed to 'addr <= block->end ()' in
dwarf2_addr_in_block_ranges (as well as in block::set_entry_pc) GDB is
able to stop at the foo_6 label.

I think you were OK with the v1 patch being merged, but I'll give you
some time to comment on the updated test.  If you're happy with this
then I'll get this merged.

Thanks,
Andrew

--

The test gdb.cp/step-and-next-inline.exp creates a test binary called
step-and-next-inline-no-header.  This test includes a function
`tree_check` which is inlined 3 times.

When testing with some older versions of gcc (I've tried 8.4.0, 9.3.1)
we see the following DWARF representing one of the inline instances of
tree_check:

 <2><8d9>: Abbrev Number: 38 (DW_TAG_inlined_subroutine)
    <8da>   DW_AT_abstract_origin: <0x9ee>
    <8de>   DW_AT_entry_pc    : 0x401165
    <8e6>   DW_AT_GNU_entry_view: 0
    <8e7>   DW_AT_ranges      : 0x30
    <8eb>   DW_AT_call_file   : 1
    <8ec>   DW_AT_call_line   : 52
    <8ed>   DW_AT_call_column : 10
    <8ee>   DW_AT_sibling     : <0x92d>

 ...

 <1><9ee>: Abbrev Number: 46 (DW_TAG_subprogram)
    <9ef>   DW_AT_external    : 1
    <9ef>   DW_AT_name        : (indirect string, offset: 0xe8): tree_check
    <9f3>   DW_AT_decl_file   : 1
    <9f4>   DW_AT_decl_line   : 38
    <9f5>   DW_AT_decl_column : 1
    <9f6>   DW_AT_linkage_name: (indirect string, offset: 0x2f2): _Z10tree_checkP4treei
    <9fa>   DW_AT_type        : <0x9e8>
    <9fe>   DW_AT_inline      : 3       (declared as inline and inlined)
    <9ff>   DW_AT_sibling     : <0xa22>

 ...

 Contents of the .debug_ranges section:

    Offset   Begin    End
    ...
    00000030 0000000000401165 0000000000401165 (start == end)
    00000030 0000000000401169 0000000000401173
    00000030 0000000000401040 0000000000401045
    00000030 <End of list>
    ...

Notice that one of the sub-ranges of tree-check is empty, this is the
line marked 'start == end'.  As the end address is the first address
after the range, this range cover absolutely no code.

But notice too that the DW_AT_entry_pc for the inline instance points
at this empty range.

Further, notice that despite the ordering of the sub-ranges, the empty
range is actually in the middle of the region defined by the lowest
address to the highest address.  The ordering is not a problem, the
DWARF spec doesn't require that ranges be in any particular order.

However, this empty range is causing issues with GDB newly acquire
DW_AT_entry_pc support.

GDB already rejects, and has done for a long time, empty sub-ranges,
after all, the DWARF spec is clear that such a range covers no code.

The recent DW_AT_entry_pc patch also had GDB reject an entry-pc which
was outside of the low/high bounds of a block.

But in this case, the entry-pc value is within the bounds of a block,
it's just not within any useful sub-range.  As a consequence, GDB is
storing the entry-pc value, and making use of it, but when GDB stops,
and tries to work out which block the inferior is in, it fails to spot
that the inferior is within tree_check, and instead reports the
function into which tree_check was inlined.

I've tested with newer versions of gcc (12.2.0 and 14.2.0) and with
these versions gcc is still generating the empty sub-range, but now
this empty sub-range is no longer the entry point.  Here's the
corresponding ranges table from gcc 14.2.0:

  Contents of the .debug_rnglists section:

   Table at Offset: 0:
    Length:          0x56
    DWARF version:   5
    Address size:    8
    Segment size:    0
    Offset entries:  0
      Offset   Begin    End
      ...
      00000021 0000000000401165 000000000040116f
      0000002b 0000000000401040 (base address)
      00000034 0000000000401040 0000000000401040  (start == end)
      00000037 0000000000401041 0000000000401046
      0000003a <End of list>
      ...

The DW_AT_entry_pc is 0x401165, but this is not the empty sub-range,
as a result, when GDB stops at the entry-pc, GDB will correctly spot
that the inferior is in the tree_check function.

The fix I propose here is, instead of rejecting entry-pc values that
are outside the block's low/high range, instead reject entry-pc values
that are not inside any of the block's sub-ranges.

Now, GDB will ignore the prescribed entry-pc, and will instead select
a suitable default entry-pc based on either the block's low-pc value,
or the first address of the first range.

I have extended the gdb.cp/step-and-next-inline.exp test to check this
case, but this does depend on the compiler version being used (newer
compilers will always pass, even without the fix).

So I have also added a DWARF assembler test to cover this case.

Reviewed-By: Kevin Buettner <kevinb@redhat.com>
---
 gdb/dwarf2/read.c                             |  24 +-
 gdb/testsuite/gdb.cp/step-and-next-inline.exp |  26 ++
 .../gdb.dwarf2/dw2-unexpected-entry-pc.c      |  57 ++++
 .../gdb.dwarf2/dw2-unexpected-entry-pc.exp    | 250 ++++++++++++++++++
 4 files changed, 356 insertions(+), 1 deletion(-)
 create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-unexpected-entry-pc.c
 create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-unexpected-entry-pc.exp


base-commit: dfb65386a5c3ec70f3f684b1f4a36da179aa023b
  

Comments

Bernd Edlinger Nov. 29, 2024, 2:19 p.m. UTC | #1
Hi Andrew,

On 11/28/24 18:44, Andrew Burgess wrote:
> Bernd,
> 
> Here's a v2 with an extended DWARF assembler test that now includes a
> line table in some cases.  Specifically, when I test with label foo_6,
> the address which is the end of a sub-range, which is also the end of
> the entire block range, then I add a line table which indicates that
> this address is the transition back from an inline function to the
> outer function.
> 

Respect!


> I've tested this with your series merge on top of this patch, and,
> with your modified block lookup logic in place, and with the 'addr <
> block->end ()' changed to 'addr <= block->end ()' in
> dwarf2_addr_in_block_ranges (as well as in block::set_entry_pc) GDB is
> able to stop at the foo_6 label.
> 
> I think you were OK with the v1 patch being merged, but I'll give you
> some time to comment on the updated test.  If you're happy with this
> then I'll get this merged.
> 

Yes, I am OK with the v1 patch to be merged now.

> Thanks,
> Andrew
> 

P.S: As I was pretty busy the last days, I have not been able to
give the requested data earlier, so here I have the line table of the
step-and-next-inline-no-header, compiled with gcc-9.4.0:

$ readelf -w step-and-next-inline-no-header
[...]
 <2><8d6>: Abbrev Number: 38 (DW_TAG_inlined_subroutine)
    <8d7>   DW_AT_abstract_origin: <0x9c7>
    <8db>   DW_AT_entry_pc    : 0x1189
    <8e3>   DW_AT_GNU_entry_view: 0
    <8e4>   DW_AT_ranges      : 0x30
[...]
    00000030 0000000000001189 0000000000001189 (start == end)
    00000030 000000000000118d 0000000000001197
    00000030 0000000000001060 0000000000001065
    00000030 <End of list>
[...]

so the intersting end pc values are 0x1189 0x1197 and 0x1065.

$ readelf -wL step-and-next-inline-no-header
File name                        Line number    Starting address    View    Stmt
[...]
step-and-next-inline.cc                   54              0x1184       1
step-and-next-inline.cc                   38              0x1189               x  -->+
step-and-next-inline.cc                   40              0x1189       1       x     |
step-and-next-inline.cc                   50              0x1189       2          <--+
step-and-next-inline.cc                   40              0x118d        
step-and-next-inline.cc                   40              0x118f        
step-and-next-inline.cc                   42              0x1197               x  -->+
step-and-next-inline.cc                   43              0x1197       1       x     |
step-and-next-inline.cc                   43              0x1197       2             |
step-and-next-inline.cc                   52              0x1197       3          <--+
step-and-next-inline.cc                   52              0x119a        
[...]
step-and-next-inline.cc                   41              0x1060       1
step-and-next-inline.cc                   41              0x1065        
step-and-next-inline.cc                    -              0x1065        

So we have most of the time one or two statement lines,
followed by one or more non-statement lines at the end_pc, these
statement lines are considered to be weak.  When we have a breakpoint
there it should show the program stepping out of the inline to the
calling function.

Then we have the third line, which is a non-statement line followed
by an end marker at the same pc, we usually ignore those, because they
are unreachable, in this case after the abort(), and could easily be
the beginning of the next function.  If we happen to have a breakpoint
at this pc value, we should never consider any sub-range from the
function before.  And a breakpoint there should show that we just
entered the function.


Thanks
Bernd.


> --
> 
> The test gdb.cp/step-and-next-inline.exp creates a test binary called
> step-and-next-inline-no-header.  This test includes a function
> `tree_check` which is inlined 3 times.
> 
> When testing with some older versions of gcc (I've tried 8.4.0, 9.3.1)
> we see the following DWARF representing one of the inline instances of
> tree_check:
> 
>  <2><8d9>: Abbrev Number: 38 (DW_TAG_inlined_subroutine)
>     <8da>   DW_AT_abstract_origin: <0x9ee>
>     <8de>   DW_AT_entry_pc    : 0x401165
>     <8e6>   DW_AT_GNU_entry_view: 0
>     <8e7>   DW_AT_ranges      : 0x30
>     <8eb>   DW_AT_call_file   : 1
>     <8ec>   DW_AT_call_line   : 52
>     <8ed>   DW_AT_call_column : 10
>     <8ee>   DW_AT_sibling     : <0x92d>
> 
>  ...
> 
>  <1><9ee>: Abbrev Number: 46 (DW_TAG_subprogram)
>     <9ef>   DW_AT_external    : 1
>     <9ef>   DW_AT_name        : (indirect string, offset: 0xe8): tree_check
>     <9f3>   DW_AT_decl_file   : 1
>     <9f4>   DW_AT_decl_line   : 38
>     <9f5>   DW_AT_decl_column : 1
>     <9f6>   DW_AT_linkage_name: (indirect string, offset: 0x2f2): _Z10tree_checkP4treei
>     <9fa>   DW_AT_type        : <0x9e8>
>     <9fe>   DW_AT_inline      : 3       (declared as inline and inlined)
>     <9ff>   DW_AT_sibling     : <0xa22>
> 
>  ...
> 
>  Contents of the .debug_ranges section:
> 
>     Offset   Begin    End
>     ...
>     00000030 0000000000401165 0000000000401165 (start == end)
>     00000030 0000000000401169 0000000000401173
>     00000030 0000000000401040 0000000000401045
>     00000030 <End of list>
>     ...
> 
> Notice that one of the sub-ranges of tree-check is empty, this is the
> line marked 'start == end'.  As the end address is the first address
> after the range, this range cover absolutely no code.
> 
> But notice too that the DW_AT_entry_pc for the inline instance points
> at this empty range.
> 
> Further, notice that despite the ordering of the sub-ranges, the empty
> range is actually in the middle of the region defined by the lowest
> address to the highest address.  The ordering is not a problem, the
> DWARF spec doesn't require that ranges be in any particular order.
> 
> However, this empty range is causing issues with GDB newly acquire
> DW_AT_entry_pc support.
> 
> GDB already rejects, and has done for a long time, empty sub-ranges,
> after all, the DWARF spec is clear that such a range covers no code.
> 
> The recent DW_AT_entry_pc patch also had GDB reject an entry-pc which
> was outside of the low/high bounds of a block.
> 
> But in this case, the entry-pc value is within the bounds of a block,
> it's just not within any useful sub-range.  As a consequence, GDB is
> storing the entry-pc value, and making use of it, but when GDB stops,
> and tries to work out which block the inferior is in, it fails to spot
> that the inferior is within tree_check, and instead reports the
> function into which tree_check was inlined.
> 
> I've tested with newer versions of gcc (12.2.0 and 14.2.0) and with
> these versions gcc is still generating the empty sub-range, but now
> this empty sub-range is no longer the entry point.  Here's the
> corresponding ranges table from gcc 14.2.0:
> 
>   Contents of the .debug_rnglists section:
> 
>    Table at Offset: 0:
>     Length:          0x56
>     DWARF version:   5
>     Address size:    8
>     Segment size:    0
>     Offset entries:  0
>       Offset   Begin    End
>       ...
>       00000021 0000000000401165 000000000040116f
>       0000002b 0000000000401040 (base address)
>       00000034 0000000000401040 0000000000401040  (start == end)
>       00000037 0000000000401041 0000000000401046
>       0000003a <End of list>
>       ...
> 
> The DW_AT_entry_pc is 0x401165, but this is not the empty sub-range,
> as a result, when GDB stops at the entry-pc, GDB will correctly spot
> that the inferior is in the tree_check function.
> 
> The fix I propose here is, instead of rejecting entry-pc values that
> are outside the block's low/high range, instead reject entry-pc values
> that are not inside any of the block's sub-ranges.
> 
> Now, GDB will ignore the prescribed entry-pc, and will instead select
> a suitable default entry-pc based on either the block's low-pc value,
> or the first address of the first range.
> 
> I have extended the gdb.cp/step-and-next-inline.exp test to check this
> case, but this does depend on the compiler version being used (newer
> compilers will always pass, even without the fix).
> 
> So I have also added a DWARF assembler test to cover this case.
> 
> Reviewed-By: Kevin Buettner <kevinb@redhat.com>
> ---
>  gdb/dwarf2/read.c                             |  24 +-
>  gdb/testsuite/gdb.cp/step-and-next-inline.exp |  26 ++
>  .../gdb.dwarf2/dw2-unexpected-entry-pc.c      |  57 ++++
>  .../gdb.dwarf2/dw2-unexpected-entry-pc.exp    | 250 ++++++++++++++++++
>  4 files changed, 356 insertions(+), 1 deletion(-)
>  create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-unexpected-entry-pc.c
>  create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-unexpected-entry-pc.exp
> 
> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
> index 5a284be1f90..995abf1405d 100644
> --- a/gdb/dwarf2/read.c
> +++ b/gdb/dwarf2/read.c
> @@ -11343,6 +11343,28 @@ dwarf2_die_base_address (struct die_info *die, struct block *block,
>    return {};
>  }
>  
> +/* Return true if ADDR is within any of the ranges covered by BLOCK.  If
> +   there are no sub-ranges then just check against the block's start and
> +   end addresses, otherwise, check each sub-range covered by the block.  */
> +
> +static bool
> +dwarf2_addr_in_block_ranges (CORE_ADDR addr, struct block *block)
> +{
> +  if (block->ranges ().size () == 0)
> +    return addr >= block->start () && addr < block->end ();
> +
> +  /* Check if ADDR is within any of the block's sub-ranges.  */
> +  for (const blockrange &br : block->ranges ())
> +    {
> +      if (addr >= br.start () && addr < br.end ())
> +	return true;
> +    }
> +
> +  /* ADDR is not within any of the block's sub-ranges.  */
> +  return false;
> +}
> +
> +
>  /* Set the entry PC for BLOCK which represents DIE from CU.  Relies on the
>     range information (if present) already having been read from DIE and
>     stored into BLOCK.  */
> @@ -11403,7 +11425,7 @@ dwarf2_record_block_entry_pc (struct die_info *die, struct block *block,
>  
>  	 To avoid this, ignore entry-pc values that are outside the block's
>  	 range, GDB will then select a suitable default entry-pc.  */
> -      if (entry_pc >= block->start () && entry_pc < block->end ())
> +      if (dwarf2_addr_in_block_ranges (entry_pc, block))
>  	block->set_entry_pc (entry_pc);
>        else
>  	complaint (_("in %s, DIE %s, DW_AT_entry_pc (%s) outside "
> diff --git a/gdb/testsuite/gdb.cp/step-and-next-inline.exp b/gdb/testsuite/gdb.cp/step-and-next-inline.exp
> index af1719dc53a..e16c2cca82d 100644
> --- a/gdb/testsuite/gdb.cp/step-and-next-inline.exp
> +++ b/gdb/testsuite/gdb.cp/step-and-next-inline.exp
> @@ -55,6 +55,32 @@ proc do_test { use_header } {
>  
>      set main_location [gdb_get_line_number "Beginning of main" $srcfile]
>  
> +    if {![runto_main]} {
> +	return
> +    }
> +
> +    gdb_breakpoint tree_check
> +
> +    # Check that GDB can correctly stop in `tree_check`.  On some
> +    # targets. gcc will use DW_AT_ranges to represent the addresses of
> +    # tree_check, and in some cases, will create an empty sub-range
> +    # for some of the tree_check code.  To really confuse things, gcc
> +    # will then set the DW_AT_entry_pc to point at the address of the
> +    # empty sub-range.
> +    #
> +    # The result of this is that GDB would stop at the DW_AT_entry_pc,
> +    # but then GDB would fail to realise that this address was inside
> +    # tree_check.
> +    for { set i 1 } { $i < 4 } { incr i } {
> +	gdb_test "continue" \
> +	    [multi_line \
> +		 "Breakpoint $::decimal\\.$i, (?:$::hex in )?tree_check \\(\[^\r\n\]+\\) at \[^\r\n\]+/$hdrfile:$::decimal" \
> +		 "$::decimal\\s+\[^\r\n\]+"] \
> +	    "stop at tree_check, $i"
> +    }
> +
> +    clean_restart $executable
> +
>      if ![runto $main_location qualified] {
>  	return
>      }
> diff --git a/gdb/testsuite/gdb.dwarf2/dw2-unexpected-entry-pc.c b/gdb/testsuite/gdb.dwarf2/dw2-unexpected-entry-pc.c
> new file mode 100644
> index 00000000000..cf43727f1fb
> --- /dev/null
> +++ b/gdb/testsuite/gdb.dwarf2/dw2-unexpected-entry-pc.c
> @@ -0,0 +1,57 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2024 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +volatile int global_var = 0;
> +
> +void
> +foo (void)	/* foo decl line */
> +{
> +  /* This label is used to find the start of 'foo' when generating the
> +     debug information.  Place nothing before it.  */
> +  asm ("foo_label: .globl foo_label");
> +  ++global_var;
> +
> +  asm ("foo_0: .globl foo_0");
> +  ++global_var;		/* bar call line */
> +
> +  asm ("foo_1: .globl foo_1");
> +  ++global_var;
> +
> +  asm ("foo_2: .globl foo_2");
> +  ++global_var;
> +
> +  asm ("foo_3: .globl foo_3");
> +  ++global_var;
> +
> +  asm ("foo_4: .globl foo_4");
> +  ++global_var;
> +
> +  asm ("foo_5: .globl foo_5");
> +  ++global_var;
> +
> +  asm ("foo_6: .globl foo_6");
> +  ++global_var;
> +
> +  asm ("foo_7: .globl foo_7");
> +}
> +
> +int
> +main (void)
> +{
> +  asm ("main_label: .globl main_label");
> +  foo ();
> +}
> diff --git a/gdb/testsuite/gdb.dwarf2/dw2-unexpected-entry-pc.exp b/gdb/testsuite/gdb.dwarf2/dw2-unexpected-entry-pc.exp
> new file mode 100644
> index 00000000000..69e1ce67cb9
> --- /dev/null
> +++ b/gdb/testsuite/gdb.dwarf2/dw2-unexpected-entry-pc.exp
> @@ -0,0 +1,250 @@
> +# Copyright 2024 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# Create an inline function which uses DW_AT_ranges and which has a
> +# DW_AT_entry_pc.
> +#
> +# Within the function's ranges, create an empty sub-range, many
> +# versions of gcc (8.x to at least 14.x) do this, and point the
> +# DW_AT_entry_pc at this empty sub-range (at last 8.x to 9.x did
> +# this).
> +#
> +# Now place a breakpoint on the inline function and run to the
> +# breakpoint, check that GDB reports we are inside the inline
> +# function.
> +#
> +# At one point GDB would use the entry-pc value as the breakpoint
> +# location even though that address is not actually associated with
> +# the inline function.  Now GDB will reject the entry-pc value and
> +# select a suitable default entry-pc value instead, one which is
> +# associated with the inline function.
> +
> +load_lib dwarf.exp
> +
> +require dwarf2_support
> +
> +standard_testfile
> +
> +# This compiles the source file and starts and stops GDB, so run it
> +# before calling prepare_for_testing otherwise GDB will have exited.
> +get_func_info foo
> +
> +if { [prepare_for_testing "failed to prepare" ${testfile} \
> +	  [list ${srcfile}]] } {
> +    return
> +}
> +
> +if ![runto_main] {
> +    return
> +}
> +
> +# Some label addresses, needed to match against the output later.
> +foreach foo {foo_1 foo_2 foo_3 foo_4 foo_5 foo_6} {
> +    set $foo [get_hexadecimal_valueof "&$foo" "UNKNOWN" \
> +		  "get address for $foo label"]
> +}
> +
> +# Some line numbers needed in the generated DWARF.
> +set foo_decl_line [gdb_get_line_number "foo decl line"]
> +set bar_call_line [gdb_get_line_number "bar call line"]
> +
> +if [is_ilp32_target] {
> +    set ptr_type "data4"
> +} else {
> +    set ptr_type "data8"
> +}
> +
> +# Setup the fake DWARF (see comment at top of this file for more
> +# details).  Use DWARF_VERSION (either 4 or 5) to select which type of
> +# ranges are created.  Compile the source and generated DWARF and run
> +# the test.
> +#
> +# The ENTRY_LABEL is the label to use as the entry-pc value.  The
> +# useful choices are 'foo_3', this label is for an empty sub-range,
> +# 'foo_4', this label is within the blocks low/high addresses, but is
> +# not in any sub-range for the block at all, or 'foo_6', this label is
> +# the end address of a non-empty sub-range, and is also the end
> +# address for the whole block.
> +#
> +# The 'foo_4' case is not something that has been seen generated by
> +# any compiler, but it doesn't hurt to test.
> +#
> +# When WITH_LINE_TABLE is true a small snippet of line table will be
> +# generated which covers some parts of the inlined function.  This
> +# makes most sense when being tested with the 'foo_6' label, as that
> +# label is all about handling the end of the inline function case.
> +
> +proc run_test { entry_label dwarf_version with_line_table } {
> +    set dw_testname "${::testfile}-${dwarf_version}-${entry_label}"
> +
> +    if { $with_line_table } {
> +	set dw_testname ${dw_testname}-lt
> +    }
> +
> +    set asm_file [standard_output_file "${dw_testname}.S"]
> +    Dwarf::assemble $asm_file {
> +	upvar dwarf_version dwarf_version
> +	upvar entry_label entry_label
> +
> +	declare_labels lines_table inline_func ranges_label
> +
> +	cu { version $dwarf_version } {
> +	    compile_unit {
> +		{producer "gcc"}
> +		{language @DW_LANG_C}
> +		{name $::srcfile}
> +		{comp_dir /tmp}
> +		{stmt_list $lines_table DW_FORM_sec_offset}
> +		{low_pc 0 addr}
> +	    } {
> +		inline_func: subprogram {
> +		    {name bar}
> +		    {inline @DW_INL_declared_inlined}
> +		}
> +		subprogram {
> +		    {name foo}
> +		    {decl_file 1 data1}
> +		    {decl_line $::foo_decl_line data1}
> +		    {decl_column 1 data1}
> +		    {low_pc $::foo_start addr}
> +		    {high_pc $::foo_len $::ptr_type}
> +		    {external 1 flag}
> +		} {
> +		    inlined_subroutine {
> +			{abstract_origin %$inline_func}
> +			{call_file 1 data1}
> +			{call_line $::bar_call_line data1}
> +			{entry_pc $entry_label addr}
> +			{ranges ${ranges_label} DW_FORM_sec_offset}
> +		    }
> +		}
> +	    }
> +	}
> +
> +	lines {version 2} lines_table {
> +	    include_dir "$::srcdir/$::subdir"
> +	    file_name "$::srcfile" 1
> +
> +	    upvar with_line_table with_line_table
> +
> +	    if {$with_line_table} {
> +		program {
> +		    DW_LNE_set_address foo_label
> +		    line [expr $::bar_call_line - 2]
> +		    DW_LNS_copy
> +
> +		    DW_LNE_set_address foo_0
> +		    line [expr $::bar_call_line - 1]
> +		    DW_LNS_copy
> +
> +		    DW_LNE_set_address foo_1
> +		    line 1
> +		    DW_LNS_copy
> +
> +		    DW_LNE_set_address foo_2
> +		    line 2
> +		    DW_LNS_copy
> +
> +		    DW_LNE_set_address foo_6
> +		    line 10
> +		    DW_LNS_copy
> +
> +		    DW_LNE_set_address foo_6
> +		    line 10
> +		    DW_LNS_negate_stmt
> +		    DW_LNS_copy
> +
> +		    DW_LNE_set_address foo_6
> +		    line $::bar_call_line
> +		    DW_LNS_copy
> +
> +		    DW_LNE_set_address "$::foo_start + $::foo_len"
> +		    DW_LNE_end_sequence
> +		}
> +	    }
> +	}
> +
> +	if { $dwarf_version == 5 } {
> +	    rnglists {} {
> +		table {} {
> +		    ranges_label: list_ {
> +			start_end foo_3 foo_3
> +			start_end foo_1 foo_2
> +			start_end foo_5 foo_6
> +		    }
> +		}
> +	    }
> +	} else {
> +	    ranges { } {
> +		ranges_label: sequence {
> +		    range foo_3 foo_3
> +		    range foo_1 foo_2
> +		    range foo_5 foo_6
> +		}
> +	    }
> +	}
> +    }
> +
> +    if {[prepare_for_testing "failed to prepare" "${dw_testname}" \
> +	     [list $::srcfile $asm_file] {nodebug}]} {
> +	return false
> +    }
> +
> +    if ![runto_main] {
> +	return false
> +    }
> +
> +    # Place a breakpoint on `bar` and run to the breakpoint.  Use
> +    # gdb_test as we want full pattern matching against the stop
> +    # location.
> +    #
> +    # When we have a line table GDB will find a line for the
> +    # breakpoint location, so the output will be different.
> +    if { $with_line_table } {
> +	set re \
> +	    [multi_line \
> +		 "Breakpoint $::decimal, bar \\(\\) at \[^\r\n\]+/$::srcfile:1" \
> +		 "1\\s+\[^\r\n\]+"]
> +    } else {
> +	set re "Breakpoint $::decimal, $::hex in bar \\(\\)"
> +    }
> +    gdb_breakpoint bar
> +    gdb_test "continue" $re
> +
> +    # Inspect the block structure of `bar` at this location.  We are
> +    # expecting that the empty range (that contained the entry-pc) has
> +    # been removed from the block, and that the entry-pc has its
> +    # default value.
> +    gdb_test "maint info blocks" \
> +	[multi_line \
> +	     "\\\[\\(block \\*\\) $::hex\\\] $::foo_1\\.\\.$::foo_6" \
> +	     "  entry pc: $::foo_1" \
> +	     "  inline function: bar" \
> +	     "  symbol count: $::decimal" \
> +	     "  address ranges:" \
> +	     "    $::foo_1\\.\\.$::foo_2" \
> +	     "    $::foo_5\\.\\.$::foo_6"]
> +}
> +
> +foreach_with_prefix dwarf_version { 4 5 } {
> +    # Test various labels without any line table present.
> +    foreach_with_prefix entry_label { foo_3 foo_4 foo_2 foo_6 } {
> +	run_test $entry_label $dwarf_version false
> +    }
> +
> +    # Now test what happens if we use the end address of the block,
> +    # but also supply a line table.  Does GDB do anything different?
> +    run_test foo_6 $dwarf_version true
> +}
> 
> base-commit: dfb65386a5c3ec70f3f684b1f4a36da179aa023b
  
Andrew Burgess Dec. 2, 2024, 10:53 a.m. UTC | #2
Bernd Edlinger <bernd.edlinger@hotmail.de> writes:

> Hi Andrew,
>
> On 11/28/24 18:44, Andrew Burgess wrote:
>> Bernd,
>> 
>> Here's a v2 with an extended DWARF assembler test that now includes a
>> line table in some cases.  Specifically, when I test with label foo_6,
>> the address which is the end of a sub-range, which is also the end of
>> the entire block range, then I add a line table which indicates that
>> this address is the transition back from an inline function to the
>> outer function.
>> 
>
> Respect!
>
>
>> I've tested this with your series merge on top of this patch, and,
>> with your modified block lookup logic in place, and with the 'addr <
>> block->end ()' changed to 'addr <= block->end ()' in
>> dwarf2_addr_in_block_ranges (as well as in block::set_entry_pc) GDB is
>> able to stop at the foo_6 label.
>> 
>> I think you were OK with the v1 patch being merged, but I'll give you
>> some time to comment on the updated test.  If you're happy with this
>> then I'll get this merged.
>> 
>
> Yes, I am OK with the v1 patch to be merged now.

Great.  I pushed this patch.

>
>> Thanks,
>> Andrew
>> 
>
> P.S: As I was pretty busy the last days, I have not been able to
> give the requested data earlier, so here I have the line table of the
> step-and-next-inline-no-header, compiled with gcc-9.4.0:
>
> $ readelf -w step-and-next-inline-no-header
> [...]
>  <2><8d6>: Abbrev Number: 38 (DW_TAG_inlined_subroutine)
>     <8d7>   DW_AT_abstract_origin: <0x9c7>
>     <8db>   DW_AT_entry_pc    : 0x1189
>     <8e3>   DW_AT_GNU_entry_view: 0
>     <8e4>   DW_AT_ranges      : 0x30
> [...]
>     00000030 0000000000001189 0000000000001189 (start == end)
>     00000030 000000000000118d 0000000000001197
>     00000030 0000000000001060 0000000000001065
>     00000030 <End of list>
> [...]
>
> so the intersting end pc values are 0x1189 0x1197 and 0x1065.
>
> $ readelf -wL step-and-next-inline-no-header
> File name                        Line number    Starting address    View    Stmt
> [...]
> step-and-next-inline.cc                   54              0x1184       1
> step-and-next-inline.cc                   38              0x1189               x  -->+
> step-and-next-inline.cc                   40              0x1189       1       x     |
> step-and-next-inline.cc                   50              0x1189       2          <--+
> step-and-next-inline.cc                   40              0x118d        
> step-and-next-inline.cc                   40              0x118f        
> step-and-next-inline.cc                   42              0x1197               x  -->+
> step-and-next-inline.cc                   43              0x1197       1       x     |
> step-and-next-inline.cc                   43              0x1197       2             |
> step-and-next-inline.cc                   52              0x1197       3          <--+
> step-and-next-inline.cc                   52              0x119a        
> [...]
> step-and-next-inline.cc                   41              0x1060       1
> step-and-next-inline.cc                   41              0x1065        
> step-and-next-inline.cc                    -              0x1065        
>
> So we have most of the time one or two statement lines,
> followed by one or more non-statement lines at the end_pc, these
> statement lines are considered to be weak.  When we have a breakpoint
> there it should show the program stepping out of the inline to the
> calling function.
>
> Then we have the third line, which is a non-statement line followed
> by an end marker at the same pc, we usually ignore those, because they
> are unreachable, in this case after the abort(), and could easily be
> the beginning of the next function.  If we happen to have a breakpoint
> at this pc value, we should never consider any sub-range from the
> function before.  And a breakpoint there should show that we just
> entered the function.

Thanks for the details, and the analysis.

Thanks,
Andrew
  

Patch

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 5a284be1f90..995abf1405d 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -11343,6 +11343,28 @@  dwarf2_die_base_address (struct die_info *die, struct block *block,
   return {};
 }
 
+/* Return true if ADDR is within any of the ranges covered by BLOCK.  If
+   there are no sub-ranges then just check against the block's start and
+   end addresses, otherwise, check each sub-range covered by the block.  */
+
+static bool
+dwarf2_addr_in_block_ranges (CORE_ADDR addr, struct block *block)
+{
+  if (block->ranges ().size () == 0)
+    return addr >= block->start () && addr < block->end ();
+
+  /* Check if ADDR is within any of the block's sub-ranges.  */
+  for (const blockrange &br : block->ranges ())
+    {
+      if (addr >= br.start () && addr < br.end ())
+	return true;
+    }
+
+  /* ADDR is not within any of the block's sub-ranges.  */
+  return false;
+}
+
+
 /* Set the entry PC for BLOCK which represents DIE from CU.  Relies on the
    range information (if present) already having been read from DIE and
    stored into BLOCK.  */
@@ -11403,7 +11425,7 @@  dwarf2_record_block_entry_pc (struct die_info *die, struct block *block,
 
 	 To avoid this, ignore entry-pc values that are outside the block's
 	 range, GDB will then select a suitable default entry-pc.  */
-      if (entry_pc >= block->start () && entry_pc < block->end ())
+      if (dwarf2_addr_in_block_ranges (entry_pc, block))
 	block->set_entry_pc (entry_pc);
       else
 	complaint (_("in %s, DIE %s, DW_AT_entry_pc (%s) outside "
diff --git a/gdb/testsuite/gdb.cp/step-and-next-inline.exp b/gdb/testsuite/gdb.cp/step-and-next-inline.exp
index af1719dc53a..e16c2cca82d 100644
--- a/gdb/testsuite/gdb.cp/step-and-next-inline.exp
+++ b/gdb/testsuite/gdb.cp/step-and-next-inline.exp
@@ -55,6 +55,32 @@  proc do_test { use_header } {
 
     set main_location [gdb_get_line_number "Beginning of main" $srcfile]
 
+    if {![runto_main]} {
+	return
+    }
+
+    gdb_breakpoint tree_check
+
+    # Check that GDB can correctly stop in `tree_check`.  On some
+    # targets. gcc will use DW_AT_ranges to represent the addresses of
+    # tree_check, and in some cases, will create an empty sub-range
+    # for some of the tree_check code.  To really confuse things, gcc
+    # will then set the DW_AT_entry_pc to point at the address of the
+    # empty sub-range.
+    #
+    # The result of this is that GDB would stop at the DW_AT_entry_pc,
+    # but then GDB would fail to realise that this address was inside
+    # tree_check.
+    for { set i 1 } { $i < 4 } { incr i } {
+	gdb_test "continue" \
+	    [multi_line \
+		 "Breakpoint $::decimal\\.$i, (?:$::hex in )?tree_check \\(\[^\r\n\]+\\) at \[^\r\n\]+/$hdrfile:$::decimal" \
+		 "$::decimal\\s+\[^\r\n\]+"] \
+	    "stop at tree_check, $i"
+    }
+
+    clean_restart $executable
+
     if ![runto $main_location qualified] {
 	return
     }
diff --git a/gdb/testsuite/gdb.dwarf2/dw2-unexpected-entry-pc.c b/gdb/testsuite/gdb.dwarf2/dw2-unexpected-entry-pc.c
new file mode 100644
index 00000000000..cf43727f1fb
--- /dev/null
+++ b/gdb/testsuite/gdb.dwarf2/dw2-unexpected-entry-pc.c
@@ -0,0 +1,57 @@ 
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+volatile int global_var = 0;
+
+void
+foo (void)	/* foo decl line */
+{
+  /* This label is used to find the start of 'foo' when generating the
+     debug information.  Place nothing before it.  */
+  asm ("foo_label: .globl foo_label");
+  ++global_var;
+
+  asm ("foo_0: .globl foo_0");
+  ++global_var;		/* bar call line */
+
+  asm ("foo_1: .globl foo_1");
+  ++global_var;
+
+  asm ("foo_2: .globl foo_2");
+  ++global_var;
+
+  asm ("foo_3: .globl foo_3");
+  ++global_var;
+
+  asm ("foo_4: .globl foo_4");
+  ++global_var;
+
+  asm ("foo_5: .globl foo_5");
+  ++global_var;
+
+  asm ("foo_6: .globl foo_6");
+  ++global_var;
+
+  asm ("foo_7: .globl foo_7");
+}
+
+int
+main (void)
+{
+  asm ("main_label: .globl main_label");
+  foo ();
+}
diff --git a/gdb/testsuite/gdb.dwarf2/dw2-unexpected-entry-pc.exp b/gdb/testsuite/gdb.dwarf2/dw2-unexpected-entry-pc.exp
new file mode 100644
index 00000000000..69e1ce67cb9
--- /dev/null
+++ b/gdb/testsuite/gdb.dwarf2/dw2-unexpected-entry-pc.exp
@@ -0,0 +1,250 @@ 
+# Copyright 2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Create an inline function which uses DW_AT_ranges and which has a
+# DW_AT_entry_pc.
+#
+# Within the function's ranges, create an empty sub-range, many
+# versions of gcc (8.x to at least 14.x) do this, and point the
+# DW_AT_entry_pc at this empty sub-range (at last 8.x to 9.x did
+# this).
+#
+# Now place a breakpoint on the inline function and run to the
+# breakpoint, check that GDB reports we are inside the inline
+# function.
+#
+# At one point GDB would use the entry-pc value as the breakpoint
+# location even though that address is not actually associated with
+# the inline function.  Now GDB will reject the entry-pc value and
+# select a suitable default entry-pc value instead, one which is
+# associated with the inline function.
+
+load_lib dwarf.exp
+
+require dwarf2_support
+
+standard_testfile
+
+# This compiles the source file and starts and stops GDB, so run it
+# before calling prepare_for_testing otherwise GDB will have exited.
+get_func_info foo
+
+if { [prepare_for_testing "failed to prepare" ${testfile} \
+	  [list ${srcfile}]] } {
+    return
+}
+
+if ![runto_main] {
+    return
+}
+
+# Some label addresses, needed to match against the output later.
+foreach foo {foo_1 foo_2 foo_3 foo_4 foo_5 foo_6} {
+    set $foo [get_hexadecimal_valueof "&$foo" "UNKNOWN" \
+		  "get address for $foo label"]
+}
+
+# Some line numbers needed in the generated DWARF.
+set foo_decl_line [gdb_get_line_number "foo decl line"]
+set bar_call_line [gdb_get_line_number "bar call line"]
+
+if [is_ilp32_target] {
+    set ptr_type "data4"
+} else {
+    set ptr_type "data8"
+}
+
+# Setup the fake DWARF (see comment at top of this file for more
+# details).  Use DWARF_VERSION (either 4 or 5) to select which type of
+# ranges are created.  Compile the source and generated DWARF and run
+# the test.
+#
+# The ENTRY_LABEL is the label to use as the entry-pc value.  The
+# useful choices are 'foo_3', this label is for an empty sub-range,
+# 'foo_4', this label is within the blocks low/high addresses, but is
+# not in any sub-range for the block at all, or 'foo_6', this label is
+# the end address of a non-empty sub-range, and is also the end
+# address for the whole block.
+#
+# The 'foo_4' case is not something that has been seen generated by
+# any compiler, but it doesn't hurt to test.
+#
+# When WITH_LINE_TABLE is true a small snippet of line table will be
+# generated which covers some parts of the inlined function.  This
+# makes most sense when being tested with the 'foo_6' label, as that
+# label is all about handling the end of the inline function case.
+
+proc run_test { entry_label dwarf_version with_line_table } {
+    set dw_testname "${::testfile}-${dwarf_version}-${entry_label}"
+
+    if { $with_line_table } {
+	set dw_testname ${dw_testname}-lt
+    }
+
+    set asm_file [standard_output_file "${dw_testname}.S"]
+    Dwarf::assemble $asm_file {
+	upvar dwarf_version dwarf_version
+	upvar entry_label entry_label
+
+	declare_labels lines_table inline_func ranges_label
+
+	cu { version $dwarf_version } {
+	    compile_unit {
+		{producer "gcc"}
+		{language @DW_LANG_C}
+		{name $::srcfile}
+		{comp_dir /tmp}
+		{stmt_list $lines_table DW_FORM_sec_offset}
+		{low_pc 0 addr}
+	    } {
+		inline_func: subprogram {
+		    {name bar}
+		    {inline @DW_INL_declared_inlined}
+		}
+		subprogram {
+		    {name foo}
+		    {decl_file 1 data1}
+		    {decl_line $::foo_decl_line data1}
+		    {decl_column 1 data1}
+		    {low_pc $::foo_start addr}
+		    {high_pc $::foo_len $::ptr_type}
+		    {external 1 flag}
+		} {
+		    inlined_subroutine {
+			{abstract_origin %$inline_func}
+			{call_file 1 data1}
+			{call_line $::bar_call_line data1}
+			{entry_pc $entry_label addr}
+			{ranges ${ranges_label} DW_FORM_sec_offset}
+		    }
+		}
+	    }
+	}
+
+	lines {version 2} lines_table {
+	    include_dir "$::srcdir/$::subdir"
+	    file_name "$::srcfile" 1
+
+	    upvar with_line_table with_line_table
+
+	    if {$with_line_table} {
+		program {
+		    DW_LNE_set_address foo_label
+		    line [expr $::bar_call_line - 2]
+		    DW_LNS_copy
+
+		    DW_LNE_set_address foo_0
+		    line [expr $::bar_call_line - 1]
+		    DW_LNS_copy
+
+		    DW_LNE_set_address foo_1
+		    line 1
+		    DW_LNS_copy
+
+		    DW_LNE_set_address foo_2
+		    line 2
+		    DW_LNS_copy
+
+		    DW_LNE_set_address foo_6
+		    line 10
+		    DW_LNS_copy
+
+		    DW_LNE_set_address foo_6
+		    line 10
+		    DW_LNS_negate_stmt
+		    DW_LNS_copy
+
+		    DW_LNE_set_address foo_6
+		    line $::bar_call_line
+		    DW_LNS_copy
+
+		    DW_LNE_set_address "$::foo_start + $::foo_len"
+		    DW_LNE_end_sequence
+		}
+	    }
+	}
+
+	if { $dwarf_version == 5 } {
+	    rnglists {} {
+		table {} {
+		    ranges_label: list_ {
+			start_end foo_3 foo_3
+			start_end foo_1 foo_2
+			start_end foo_5 foo_6
+		    }
+		}
+	    }
+	} else {
+	    ranges { } {
+		ranges_label: sequence {
+		    range foo_3 foo_3
+		    range foo_1 foo_2
+		    range foo_5 foo_6
+		}
+	    }
+	}
+    }
+
+    if {[prepare_for_testing "failed to prepare" "${dw_testname}" \
+	     [list $::srcfile $asm_file] {nodebug}]} {
+	return false
+    }
+
+    if ![runto_main] {
+	return false
+    }
+
+    # Place a breakpoint on `bar` and run to the breakpoint.  Use
+    # gdb_test as we want full pattern matching against the stop
+    # location.
+    #
+    # When we have a line table GDB will find a line for the
+    # breakpoint location, so the output will be different.
+    if { $with_line_table } {
+	set re \
+	    [multi_line \
+		 "Breakpoint $::decimal, bar \\(\\) at \[^\r\n\]+/$::srcfile:1" \
+		 "1\\s+\[^\r\n\]+"]
+    } else {
+	set re "Breakpoint $::decimal, $::hex in bar \\(\\)"
+    }
+    gdb_breakpoint bar
+    gdb_test "continue" $re
+
+    # Inspect the block structure of `bar` at this location.  We are
+    # expecting that the empty range (that contained the entry-pc) has
+    # been removed from the block, and that the entry-pc has its
+    # default value.
+    gdb_test "maint info blocks" \
+	[multi_line \
+	     "\\\[\\(block \\*\\) $::hex\\\] $::foo_1\\.\\.$::foo_6" \
+	     "  entry pc: $::foo_1" \
+	     "  inline function: bar" \
+	     "  symbol count: $::decimal" \
+	     "  address ranges:" \
+	     "    $::foo_1\\.\\.$::foo_2" \
+	     "    $::foo_5\\.\\.$::foo_6"]
+}
+
+foreach_with_prefix dwarf_version { 4 5 } {
+    # Test various labels without any line table present.
+    foreach_with_prefix entry_label { foo_3 foo_4 foo_2 foo_6 } {
+	run_test $entry_label $dwarf_version false
+    }
+
+    # Now test what happens if we use the end address of the block,
+    # but also supply a line table.  Does GDB do anything different?
+    run_test foo_6 $dwarf_version true
+}