gdb: handle DW_AT_entry_pc pointing at an empty sub-range

Message ID 34cfe440ffd0e53843bfaf92494d29a6951fa9fd.1732114887.git.aburgess@redhat.com
State New
Headers
Series gdb: handle DW_AT_entry_pc pointing at an empty sub-range |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm success Test passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Test passed

Commit Message

Andrew Burgess Nov. 20, 2024, 3:01 p.m. UTC
  The test gdb.cp/step-and-next-inline.exp creates a test binary called
step-and-next-inline-no-header.  This test includes a function
`tree_check` which is inlined 3 times.

When testing with some older versions of gcc (I've tried 8.4.0, 9.3.1)
we see the following DWARF representing one of the inline instances of
tree_check:

 <2><8d9>: Abbrev Number: 38 (DW_TAG_inlined_subroutine)
    <8da>   DW_AT_abstract_origin: <0x9ee>
    <8de>   DW_AT_entry_pc    : 0x401165
    <8e6>   DW_AT_GNU_entry_view: 0
    <8e7>   DW_AT_ranges      : 0x30
    <8eb>   DW_AT_call_file   : 1
    <8ec>   DW_AT_call_line   : 52
    <8ed>   DW_AT_call_column : 10
    <8ee>   DW_AT_sibling     : <0x92d>

 ...

 <1><9ee>: Abbrev Number: 46 (DW_TAG_subprogram)
    <9ef>   DW_AT_external    : 1
    <9ef>   DW_AT_name        : (indirect string, offset: 0xe8): tree_check
    <9f3>   DW_AT_decl_file   : 1
    <9f4>   DW_AT_decl_line   : 38
    <9f5>   DW_AT_decl_column : 1
    <9f6>   DW_AT_linkage_name: (indirect string, offset: 0x2f2): _Z10tree_checkP4treei
    <9fa>   DW_AT_type        : <0x9e8>
    <9fe>   DW_AT_inline      : 3       (declared as inline and inlined)
    <9ff>   DW_AT_sibling     : <0xa22>

 ...

 Contents of the .debug_ranges section:

    Offset   Begin    End
    ...
    00000030 0000000000401165 0000000000401165 (start == end)
    00000030 0000000000401169 0000000000401173
    00000030 0000000000401040 0000000000401045
    00000030 <End of list>
    ...

Notice that one of the sub-ranges of tree-check is empty, this is the
line marked 'start == end'.  As the end address is the first address
after the range, this range cover absolutely no code.

But notice too that the DW_AT_entry_pc for the inline instance points
at this empty range.

Further, notice that despite the ordering of the sub-ranges, the empty
range is actually in the middle of the region defined by the lowest
address to the highest address.  The ordering is not a problem, the
DWARF spec doesn't require that ranges be in any particular order.

However, this empty range is causing issues with GDB newly acquire
DW_AT_entry_pc support.

GDB already rejects, and has done for a long time, empty sub-ranges,
after all, the DWARF spec is clear that such a range covers no code.

The recent DW_AT_entry_pc patch also had GDB reject an entry-pc which
was outside of the low/high bounds of a block.

But in this case, the entry-pc value is within the bounds of a block,
it's just not within any useful sub-range.  As a consequence, GDB is
storing the entry-pc value, and making use of it, but when GDB stops,
and tries to work out which block the inferior is in, it fails to spot
that the inferior is within tree_check, and instead reports the
function into which tree_check was inlined.

I've tested with newer versions of gcc (12.2.0 and 14.2.0) and with
these versions gcc is still generating the empty sub-range, but now
this empty sub-range is no longer the entry point.  Here's the
corresponding ranges table from gcc 14.2.0:

  Contents of the .debug_rnglists section:

   Table at Offset: 0:
    Length:          0x56
    DWARF version:   5
    Address size:    8
    Segment size:    0
    Offset entries:  0
      Offset   Begin    End
      ...
      00000021 0000000000401165 000000000040116f
      0000002b 0000000000401040 (base address)
      00000034 0000000000401040 0000000000401040  (start == end)
      00000037 0000000000401041 0000000000401046
      0000003a <End of list>
      ...

The DW_AT_entry_pc is 0x401165, but this is not the empty sub-range,
as a result, when GDB stops at the entry-pc, GDB will correctly spot
that the inferior is in the tree_check function.

The fix I propose here is, instead of rejecting entry-pc values that
are outside the block's low/high range, instead reject entry-pc values
that are not inside any of the block's sub-ranges.

Now, on the older versions of GDB, GDB will ignore the prescribed
entry-pc, and will instead select a suitable default entry-pc based on
either the block's low-pc value, or the first address of the first
range.

I have extended the gdb.cp/step-and-next-inline.exp test to check this
case, but this does depend on the compiler version being used (newer
compilers will always pass, even without the fix).

So I have also added a DWARF assembler test to cover this case.
---
 gdb/dwarf2/read.c                             |  24 ++-
 gdb/testsuite/gdb.cp/step-and-next-inline.exp |  26 +++
 .../gdb.dwarf2/dw2-entry-pc-in-empty-range.c  |  57 ++++++
 .../dw2-entry-pc-in-empty-range.exp           | 186 ++++++++++++++++++
 4 files changed, 292 insertions(+), 1 deletion(-)
 create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-entry-pc-in-empty-range.c
 create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-entry-pc-in-empty-range.exp


base-commit: 82eff6743b77908a502b4cf9030acc93caf69e74
  

Comments

Kevin Buettner Nov. 20, 2024, 9 p.m. UTC | #1
On Wed, 20 Nov 2024 15:01:49 +0000
Andrew Burgess <aburgess@redhat.com> wrote:

[...]
> Now, on the older versions of GDB, GDB will ignore the prescribed

I assume that you actually mean "older versions of GCC"?

> entry-pc, and will instead select a suitable default entry-pc based on
> either the block's low-pc value, or the first address of the first
> range.
> 
> I have extended the gdb.cp/step-and-next-inline.exp test to check this
> case, but this does depend on the compiler version being used (newer
> compilers will always pass, even without the fix).
> 
> So I have also added a DWARF assembler test to cover this case.

Thanks for the detailed explanation.

The patches LGTM.

Reviewed-by: Kevin Buettner <kevinb@redhat.com>

(If there isn't additional discussion within a week or so, please
consider the above to be Approved-by.)
  
Bernd Edlinger Nov. 21, 2024, 1:21 p.m. UTC | #2
Hmm, sorry, but I think this goes in the wrong direction.

On 11/20/24 16:01, Andrew Burgess wrote:
> The test gdb.cp/step-and-next-inline.exp creates a test binary called
> step-and-next-inline-no-header.  This test includes a function
> `tree_check` which is inlined 3 times.
> 
> When testing with some older versions of gcc (I've tried 8.4.0, 9.3.1)
> we see the following DWARF representing one of the inline instances of
> tree_check:
> 
>  <2><8d9>: Abbrev Number: 38 (DW_TAG_inlined_subroutine)
>     <8da>   DW_AT_abstract_origin: <0x9ee>
>     <8de>   DW_AT_entry_pc    : 0x401165
>     <8e6>   DW_AT_GNU_entry_view: 0
>     <8e7>   DW_AT_ranges      : 0x30
>     <8eb>   DW_AT_call_file   : 1
>     <8ec>   DW_AT_call_line   : 52
>     <8ed>   DW_AT_call_column : 10
>     <8ee>   DW_AT_sibling     : <0x92d>
> 
>  ...
> 
>  <1><9ee>: Abbrev Number: 46 (DW_TAG_subprogram)
>     <9ef>   DW_AT_external    : 1
>     <9ef>   DW_AT_name        : (indirect string, offset: 0xe8): tree_check
>     <9f3>   DW_AT_decl_file   : 1
>     <9f4>   DW_AT_decl_line   : 38
>     <9f5>   DW_AT_decl_column : 1
>     <9f6>   DW_AT_linkage_name: (indirect string, offset: 0x2f2): _Z10tree_checkP4treei
>     <9fa>   DW_AT_type        : <0x9e8>
>     <9fe>   DW_AT_inline      : 3       (declared as inline and inlined)
>     <9ff>   DW_AT_sibling     : <0xa22>
> 
>  ...
> 
>  Contents of the .debug_ranges section:
> 
>     Offset   Begin    End
>     ...
>     00000030 0000000000401165 0000000000401165 (start == end)
>     00000030 0000000000401169 0000000000401173
>     00000030 0000000000401040 0000000000401045
>     00000030 <End of list>
>     ...
> 
> Notice that one of the sub-ranges of tree-check is empty, this is the
> line marked 'start == end'.  As the end address is the first address
> after the range, this range cover absolutely no code.
> 
> But notice too that the DW_AT_entry_pc for the inline instance points
> at this empty range.
> 
> Further, notice that despite the ordering of the sub-ranges, the empty
> range is actually in the middle of the region defined by the lowest
> address to the highest address.  The ordering is not a problem, the
> DWARF spec doesn't require that ranges be in any particular order.
> 
> However, this empty range is causing issues with GDB newly acquire
> DW_AT_entry_pc support.
> 
> GDB already rejects, and has done for a long time, empty sub-ranges,
> after all, the DWARF spec is clear that such a range covers no code.
> 
> The recent DW_AT_entry_pc patch also had GDB reject an entry-pc which
> was outside of the low/high bounds of a block.
> 
> But in this case, the entry-pc value is within the bounds of a block,
> it's just not within any useful sub-range.  As a consequence, GDB is
> storing the entry-pc value, and making use of it, but when GDB stops,
> and tries to work out which block the inferior is in, it fails to spot
> that the inferior is within tree_check, and instead reports the
> function into which tree_check was inlined.
> 
> I've tested with newer versions of gcc (12.2.0 and 14.2.0) and with
> these versions gcc is still generating the empty sub-range, but now
> this empty sub-range is no longer the entry point.  Here's the
> corresponding ranges table from gcc 14.2.0:
> 

Yeah, maybe not in this test case, but that is not true in general,
a quick check with gcc-15 shows that there still a number of such
empty range table entries in the gdb executable itself.

Note that ignoring these entry_pc values is completely wrong,
and my patch series handles exactly these empty subranges, by not
ignoring them in the dwarf reader, and the debug experience is
completely normal when this happens.

Furthermore, I think that having a break point at these PC values has
some benefit, because it is the earliest point in time, when all
the input parameter values of the inline function are available,
and can be inspected by gdb.

I think now it is time to consider merging the rest of my
patch.


Bernd.
  
Andrew Burgess Nov. 22, 2024, 4:53 p.m. UTC | #3
Bernd Edlinger <bernd.edlinger@hotmail.de> writes:

> Hmm, sorry, but I think this goes in the wrong direction.
>
> On 11/20/24 16:01, Andrew Burgess wrote:
>> The test gdb.cp/step-and-next-inline.exp creates a test binary called
>> step-and-next-inline-no-header.  This test includes a function
>> `tree_check` which is inlined 3 times.
>> 
>> When testing with some older versions of gcc (I've tried 8.4.0, 9.3.1)
>> we see the following DWARF representing one of the inline instances of
>> tree_check:
>> 
>>  <2><8d9>: Abbrev Number: 38 (DW_TAG_inlined_subroutine)
>>     <8da>   DW_AT_abstract_origin: <0x9ee>
>>     <8de>   DW_AT_entry_pc    : 0x401165
>>     <8e6>   DW_AT_GNU_entry_view: 0
>>     <8e7>   DW_AT_ranges      : 0x30
>>     <8eb>   DW_AT_call_file   : 1
>>     <8ec>   DW_AT_call_line   : 52
>>     <8ed>   DW_AT_call_column : 10
>>     <8ee>   DW_AT_sibling     : <0x92d>
>> 
>>  ...
>> 
>>  <1><9ee>: Abbrev Number: 46 (DW_TAG_subprogram)
>>     <9ef>   DW_AT_external    : 1
>>     <9ef>   DW_AT_name        : (indirect string, offset: 0xe8): tree_check
>>     <9f3>   DW_AT_decl_file   : 1
>>     <9f4>   DW_AT_decl_line   : 38
>>     <9f5>   DW_AT_decl_column : 1
>>     <9f6>   DW_AT_linkage_name: (indirect string, offset: 0x2f2): _Z10tree_checkP4treei
>>     <9fa>   DW_AT_type        : <0x9e8>
>>     <9fe>   DW_AT_inline      : 3       (declared as inline and inlined)
>>     <9ff>   DW_AT_sibling     : <0xa22>
>> 
>>  ...
>> 
>>  Contents of the .debug_ranges section:
>> 
>>     Offset   Begin    End
>>     ...
>>     00000030 0000000000401165 0000000000401165 (start == end)
>>     00000030 0000000000401169 0000000000401173
>>     00000030 0000000000401040 0000000000401045
>>     00000030 <End of list>
>>     ...
>> 
>> Notice that one of the sub-ranges of tree-check is empty, this is the
>> line marked 'start == end'.  As the end address is the first address
>> after the range, this range cover absolutely no code.
>> 
>> But notice too that the DW_AT_entry_pc for the inline instance points
>> at this empty range.
>> 
>> Further, notice that despite the ordering of the sub-ranges, the empty
>> range is actually in the middle of the region defined by the lowest
>> address to the highest address.  The ordering is not a problem, the
>> DWARF spec doesn't require that ranges be in any particular order.
>> 
>> However, this empty range is causing issues with GDB newly acquire
>> DW_AT_entry_pc support.
>> 
>> GDB already rejects, and has done for a long time, empty sub-ranges,
>> after all, the DWARF spec is clear that such a range covers no code.
>> 
>> The recent DW_AT_entry_pc patch also had GDB reject an entry-pc which
>> was outside of the low/high bounds of a block.
>> 
>> But in this case, the entry-pc value is within the bounds of a block,
>> it's just not within any useful sub-range.  As a consequence, GDB is
>> storing the entry-pc value, and making use of it, but when GDB stops,
>> and tries to work out which block the inferior is in, it fails to spot
>> that the inferior is within tree_check, and instead reports the
>> function into which tree_check was inlined.
>> 
>> I've tested with newer versions of gcc (12.2.0 and 14.2.0) and with
>> these versions gcc is still generating the empty sub-range, but now
>> this empty sub-range is no longer the entry point.  Here's the
>> corresponding ranges table from gcc 14.2.0:
>> 
>
> Yeah, maybe not in this test case, but that is not true in general,
> a quick check with gcc-15 shows that there still a number of such
> empty range table entries in the gdb executable itself.
>
> Note that ignoring these entry_pc values is completely wrong,
> and my patch series handles exactly these empty subranges, by not
> ignoring them in the dwarf reader, and the debug experience is
> completely normal when this happens.
>
> Furthermore, I think that having a break point at these PC values has
> some benefit, because it is the earliest point in time, when all
> the input parameter values of the inline function are available,
> and can be inspected by gdb.

I hear and understand your frustration.  I'm absolutely not ignoring the
work you've done.  But I see this as a journey of small steps to get
where you want to be.

I agree with you 100% that dealing better with the empty sub-ranges is
the right way to go, and indeed, I have just finished pulling the empty
sub-range work from your series, and I plan to post those patches
hopefully over the weekend, I'm just running final tests now.

But, this patch does more than just deal with the case of entry-pc
pointing at the empty sub-range.  I believe that having a sanity check
that the entry-pc is within any sub-range is the right thing for GDB to
do, regardless of what we ultimately end up doing with empty ranges.

I'm hoping I can convince you that fixing this first is not going to
prevent us merging your empty sub-range handling code next.

> I think now it is time to consider merging the rest of my
> patch.

My top priority between now and year end is to merge either your
patches, or equivalent functionality, into GDB.  But, as you can tell
from what I've posted, I think your series covers a lot of fixes which
should be broken into separate commits.

I think the work you have done is absolutely amazing, and makes huge
improvements to GDB's handling of optimised code, I just don't want to
rush things, but I'm certain we will get there.

Thanks,
Andrew
  
Bernd Edlinger Nov. 22, 2024, 10:57 p.m. UTC | #4
On 11/22/24 17:53, Andrew Burgess wrote:
> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
> 
>> Hmm, sorry, but I think this goes in the wrong direction.
>>
>> On 11/20/24 16:01, Andrew Burgess wrote:
>>> The test gdb.cp/step-and-next-inline.exp creates a test binary called
>>> step-and-next-inline-no-header.  This test includes a function
>>> `tree_check` which is inlined 3 times.
>>>
>>> When testing with some older versions of gcc (I've tried 8.4.0, 9.3.1)
>>> we see the following DWARF representing one of the inline instances of
>>> tree_check:
>>>
>>>  <2><8d9>: Abbrev Number: 38 (DW_TAG_inlined_subroutine)
>>>     <8da>   DW_AT_abstract_origin: <0x9ee>
>>>     <8de>   DW_AT_entry_pc    : 0x401165
>>>     <8e6>   DW_AT_GNU_entry_view: 0
>>>     <8e7>   DW_AT_ranges      : 0x30
>>>     <8eb>   DW_AT_call_file   : 1
>>>     <8ec>   DW_AT_call_line   : 52
>>>     <8ed>   DW_AT_call_column : 10
>>>     <8ee>   DW_AT_sibling     : <0x92d>
>>>
>>>  ...
>>>
>>>  <1><9ee>: Abbrev Number: 46 (DW_TAG_subprogram)
>>>     <9ef>   DW_AT_external    : 1
>>>     <9ef>   DW_AT_name        : (indirect string, offset: 0xe8): tree_check
>>>     <9f3>   DW_AT_decl_file   : 1
>>>     <9f4>   DW_AT_decl_line   : 38
>>>     <9f5>   DW_AT_decl_column : 1
>>>     <9f6>   DW_AT_linkage_name: (indirect string, offset: 0x2f2): _Z10tree_checkP4treei
>>>     <9fa>   DW_AT_type        : <0x9e8>
>>>     <9fe>   DW_AT_inline      : 3       (declared as inline and inlined)
>>>     <9ff>   DW_AT_sibling     : <0xa22>
>>>
>>>  ...
>>>
>>>  Contents of the .debug_ranges section:
>>>
>>>     Offset   Begin    End
>>>     ...
>>>     00000030 0000000000401165 0000000000401165 (start == end)
>>>     00000030 0000000000401169 0000000000401173
>>>     00000030 0000000000401040 0000000000401045
>>>     00000030 <End of list>
>>>     ...
>>>
>>> Notice that one of the sub-ranges of tree-check is empty, this is the
>>> line marked 'start == end'.  As the end address is the first address
>>> after the range, this range cover absolutely no code.
>>>
>>> But notice too that the DW_AT_entry_pc for the inline instance points
>>> at this empty range.
>>>
>>> Further, notice that despite the ordering of the sub-ranges, the empty
>>> range is actually in the middle of the region defined by the lowest
>>> address to the highest address.  The ordering is not a problem, the
>>> DWARF spec doesn't require that ranges be in any particular order.
>>>
>>> However, this empty range is causing issues with GDB newly acquire
>>> DW_AT_entry_pc support.
>>>
>>> GDB already rejects, and has done for a long time, empty sub-ranges,
>>> after all, the DWARF spec is clear that such a range covers no code.
>>>
>>> The recent DW_AT_entry_pc patch also had GDB reject an entry-pc which
>>> was outside of the low/high bounds of a block.
>>>
>>> But in this case, the entry-pc value is within the bounds of a block,
>>> it's just not within any useful sub-range.  As a consequence, GDB is
>>> storing the entry-pc value, and making use of it, but when GDB stops,
>>> and tries to work out which block the inferior is in, it fails to spot
>>> that the inferior is within tree_check, and instead reports the
>>> function into which tree_check was inlined.
>>>
>>> I've tested with newer versions of gcc (12.2.0 and 14.2.0) and with
>>> these versions gcc is still generating the empty sub-range, but now
>>> this empty sub-range is no longer the entry point.  Here's the
>>> corresponding ranges table from gcc 14.2.0:
>>>
>>
>> Yeah, maybe not in this test case, but that is not true in general,
>> a quick check with gcc-15 shows that there still a number of such
>> empty range table entries in the gdb executable itself.
>>
>> Note that ignoring these entry_pc values is completely wrong,
>> and my patch series handles exactly these empty subranges, by not
>> ignoring them in the dwarf reader, and the debug experience is
>> completely normal when this happens.
>>
>> Furthermore, I think that having a break point at these PC values has
>> some benefit, because it is the earliest point in time, when all
>> the input parameter values of the inline function are available,
>> and can be inspected by gdb.
> 
> I hear and understand your frustration.  I'm absolutely not ignoring the
> work you've done.  But I see this as a journey of small steps to get
> where you want to be.
> 
> I agree with you 100% that dealing better with the empty sub-ranges is
> the right way to go, and indeed, I have just finished pulling the empty
> sub-range work from your series, and I plan to post those patches
> hopefully over the weekend, I'm just running final tests now.
> 
> But, this patch does more than just deal with the case of entry-pc
> pointing at the empty sub-range.  I believe that having a sanity check
> that the entry-pc is within any sub-range is the right thing for GDB to
> do, regardless of what we ultimately end up doing with empty ranges.
> 
> I'm hoping I can convince you that fixing this first is not going to
> prevent us merging your empty sub-range handling code next.
> 
>> I think now it is time to consider merging the rest of my
>> patch.
> 
> My top priority between now and year end is to merge either your
> patches, or equivalent functionality, into GDB.  But, as you can tell
> from what I've posted, I think your series covers a lot of fixes which
> should be broken into separate commits.
> 
> I think the work you have done is absolutely amazing, and makes huge
> improvements to GDB's handling of optimised code, I just don't want to
> rush things, but I'm certain we will get there.
> 
> Thanks,
> Andrew
> 

Okay, I just wanted to point out that in my opinion the debug info which
points at the end of a sub-range is not incorrect, just maybe on a border
line, where the dwarf spec is unclear.  So you should not say:
"after all, the DWARF spec is clear that such a range covers no code."

But there are obviously not only cases where the entry_pc points at
an empty sub-range, but also in very rare cases the entry_pc points at
the end of a non-empty sub-range.
So could you please change the check in dwarf2_addr_in_block_ranges
from addr >= start && addr < end to addr >= start && addr <= end.


Thanks
Bernd.
  
Andrew Burgess Nov. 25, 2024, 2:30 p.m. UTC | #5
Bernd Edlinger <bernd.edlinger@hotmail.de> writes:

> On 11/22/24 17:53, Andrew Burgess wrote:
>> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
>> 
>>> Hmm, sorry, but I think this goes in the wrong direction.
>>>
>>> On 11/20/24 16:01, Andrew Burgess wrote:
>>>> The test gdb.cp/step-and-next-inline.exp creates a test binary called
>>>> step-and-next-inline-no-header.  This test includes a function
>>>> `tree_check` which is inlined 3 times.
>>>>
>>>> When testing with some older versions of gcc (I've tried 8.4.0, 9.3.1)
>>>> we see the following DWARF representing one of the inline instances of
>>>> tree_check:
>>>>
>>>>  <2><8d9>: Abbrev Number: 38 (DW_TAG_inlined_subroutine)
>>>>     <8da>   DW_AT_abstract_origin: <0x9ee>
>>>>     <8de>   DW_AT_entry_pc    : 0x401165
>>>>     <8e6>   DW_AT_GNU_entry_view: 0
>>>>     <8e7>   DW_AT_ranges      : 0x30
>>>>     <8eb>   DW_AT_call_file   : 1
>>>>     <8ec>   DW_AT_call_line   : 52
>>>>     <8ed>   DW_AT_call_column : 10
>>>>     <8ee>   DW_AT_sibling     : <0x92d>
>>>>
>>>>  ...
>>>>
>>>>  <1><9ee>: Abbrev Number: 46 (DW_TAG_subprogram)
>>>>     <9ef>   DW_AT_external    : 1
>>>>     <9ef>   DW_AT_name        : (indirect string, offset: 0xe8): tree_check
>>>>     <9f3>   DW_AT_decl_file   : 1
>>>>     <9f4>   DW_AT_decl_line   : 38
>>>>     <9f5>   DW_AT_decl_column : 1
>>>>     <9f6>   DW_AT_linkage_name: (indirect string, offset: 0x2f2): _Z10tree_checkP4treei
>>>>     <9fa>   DW_AT_type        : <0x9e8>
>>>>     <9fe>   DW_AT_inline      : 3       (declared as inline and inlined)
>>>>     <9ff>   DW_AT_sibling     : <0xa22>
>>>>
>>>>  ...
>>>>
>>>>  Contents of the .debug_ranges section:
>>>>
>>>>     Offset   Begin    End
>>>>     ...
>>>>     00000030 0000000000401165 0000000000401165 (start == end)
>>>>     00000030 0000000000401169 0000000000401173
>>>>     00000030 0000000000401040 0000000000401045
>>>>     00000030 <End of list>
>>>>     ...
>>>>
>>>> Notice that one of the sub-ranges of tree-check is empty, this is the
>>>> line marked 'start == end'.  As the end address is the first address
>>>> after the range, this range cover absolutely no code.
>>>>
>>>> But notice too that the DW_AT_entry_pc for the inline instance points
>>>> at this empty range.
>>>>
>>>> Further, notice that despite the ordering of the sub-ranges, the empty
>>>> range is actually in the middle of the region defined by the lowest
>>>> address to the highest address.  The ordering is not a problem, the
>>>> DWARF spec doesn't require that ranges be in any particular order.
>>>>
>>>> However, this empty range is causing issues with GDB newly acquire
>>>> DW_AT_entry_pc support.
>>>>
>>>> GDB already rejects, and has done for a long time, empty sub-ranges,
>>>> after all, the DWARF spec is clear that such a range covers no code.
>>>>
>>>> The recent DW_AT_entry_pc patch also had GDB reject an entry-pc which
>>>> was outside of the low/high bounds of a block.
>>>>
>>>> But in this case, the entry-pc value is within the bounds of a block,
>>>> it's just not within any useful sub-range.  As a consequence, GDB is
>>>> storing the entry-pc value, and making use of it, but when GDB stops,
>>>> and tries to work out which block the inferior is in, it fails to spot
>>>> that the inferior is within tree_check, and instead reports the
>>>> function into which tree_check was inlined.
>>>>
>>>> I've tested with newer versions of gcc (12.2.0 and 14.2.0) and with
>>>> these versions gcc is still generating the empty sub-range, but now
>>>> this empty sub-range is no longer the entry point.  Here's the
>>>> corresponding ranges table from gcc 14.2.0:
>>>>
>>>
>>> Yeah, maybe not in this test case, but that is not true in general,
>>> a quick check with gcc-15 shows that there still a number of such
>>> empty range table entries in the gdb executable itself.
>>>
>>> Note that ignoring these entry_pc values is completely wrong,
>>> and my patch series handles exactly these empty subranges, by not
>>> ignoring them in the dwarf reader, and the debug experience is
>>> completely normal when this happens.
>>>
>>> Furthermore, I think that having a break point at these PC values has
>>> some benefit, because it is the earliest point in time, when all
>>> the input parameter values of the inline function are available,
>>> and can be inspected by gdb.
>> 
>> I hear and understand your frustration.  I'm absolutely not ignoring the
>> work you've done.  But I see this as a journey of small steps to get
>> where you want to be.
>> 
>> I agree with you 100% that dealing better with the empty sub-ranges is
>> the right way to go, and indeed, I have just finished pulling the empty
>> sub-range work from your series, and I plan to post those patches
>> hopefully over the weekend, I'm just running final tests now.
>> 
>> But, this patch does more than just deal with the case of entry-pc
>> pointing at the empty sub-range.  I believe that having a sanity check
>> that the entry-pc is within any sub-range is the right thing for GDB to
>> do, regardless of what we ultimately end up doing with empty ranges.
>> 
>> I'm hoping I can convince you that fixing this first is not going to
>> prevent us merging your empty sub-range handling code next.
>> 
>>> I think now it is time to consider merging the rest of my
>>> patch.
>> 
>> My top priority between now and year end is to merge either your
>> patches, or equivalent functionality, into GDB.  But, as you can tell
>> from what I've posted, I think your series covers a lot of fixes which
>> should be broken into separate commits.
>> 
>> I think the work you have done is absolutely amazing, and makes huge
>> improvements to GDB's handling of optimised code, I just don't want to
>> rush things, but I'm certain we will get there.
>> 
>> Thanks,
>> Andrew
>> 
>
> Okay, I just wanted to point out that in my opinion the debug info which
> points at the end of a sub-range is not incorrect, just maybe on a border
> line, where the dwarf spec is unclear.  So you should not say:
> "after all, the DWARF spec is clear that such a range covers no code."
>
> But there are obviously not only cases where the entry_pc points at
> an empty sub-range, but also in very rare cases the entry_pc points at
> the end of a non-empty sub-range.
> So could you please change the check in dwarf2_addr_in_block_ranges
> from addr >= start && addr < end to addr >= start && addr <= end.

Could you expand on why you believe that the DWARF spec is unclear in
this regard.  I came to my conclusion based on this text within the
DWARF-5 specification, section 2.17.3 Non-Contiguous Address Ranges:

  Bounded range. This kind of entry defines an address range that is
  included in the range list. The starting address is the lowest address
  of the address range. The ending address is the address of the first
  location past the highest address of the address range

This seems pretty clear (to me) that the end address is not part of the
region covered by a range.

Additionally, if we start to accept 'addr == end' then this is going to
cause problems elsewhere.  GDB will place a b/p at the 'end' address,
but when GDB then performs block lookup, GDB will not return the block
we expect, and so GDB will not report the inferior as having stopped in
the scope that the user expects.

Thanks,
Andrew
  
Bernd Edlinger Nov. 26, 2024, 12:35 p.m. UTC | #6
On 11/25/24 15:30, Andrew Burgess wrote:
> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
> 
>> Okay, I just wanted to point out that in my opinion the debug info which
>> points at the end of a sub-range is not incorrect, just maybe on a border
>> line, where the dwarf spec is unclear.  So you should not say:
>> "after all, the DWARF spec is clear that such a range covers no code."
>>
>> But there are obviously not only cases where the entry_pc points at
>> an empty sub-range, but also in very rare cases the entry_pc points at
>> the end of a non-empty sub-range.
>> So could you please change the check in dwarf2_addr_in_block_ranges
>> from addr >= start && addr < end to addr >= start && addr <= end.
> 
> Could you expand on why you believe that the DWARF spec is unclear in
> this regard.  I came to my conclusion based on this text within the
> DWARF-5 specification, section 2.17.3 Non-Contiguous Address Ranges:
> 
>   Bounded range. This kind of entry defines an address range that is
>   included in the range list. The starting address is the lowest address
>   of the address range. The ending address is the address of the first
>   location past the highest address of the address range
> 
> This seems pretty clear (to me) that the end address is not part of the
> region covered by a range.
> 

Yes, but on the other hand, when we look at line table entries, each has a
PC and a VIEW number, and even the DW_AT_entry_pc has a DW_AT_GNU_entry_view,
just the range list does not have a view number, and that is inconsistent
with the concept of location views.

Consider as a simple example an inline function:

int f(int x)
{
 x++;
 return x;
}

it will most likely just be compiled into one "inc eax" or similar,
and of course you may want to set a break point on the return statement,
to inspect 'x' after the increment, but that will be on 'pc == end' !

But if the location view number would not be missing from the rnglist
it would be obvious whether the corresponding view number is still within
subroutine and not outside.  So in my opinion it is a defect in the
specification that it does not reflect this use case.


> Additionally, if we start to accept 'addr == end' then this is going to
> cause problems elsewhere.  GDB will place a b/p at the 'end' address,
> but when GDB then performs block lookup, GDB will not return the block
> we expect, and so GDB will not report the inferior as having stopped in
> the scope that the user expects.
> 

No, because this is exactly what the core of my patch does, admittedly
I also modified the block lookup code a bit, to handle that case.
So I strongly disagree here: we have to accept 'addr == end' and other
corner cases, otherwise my patch won't work in the end, regardless of in
how many small bug-fixes it can be split up, because it depends exactly
on not ignoring any information while parsing the debug info.


Thanks
Bernd,
  
Andrew Burgess Nov. 26, 2024, 5:48 p.m. UTC | #7
Bernd Edlinger <bernd.edlinger@hotmail.de> writes:

> On 11/25/24 15:30, Andrew Burgess wrote:
>> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
>> 
>>> Okay, I just wanted to point out that in my opinion the debug info which
>>> points at the end of a sub-range is not incorrect, just maybe on a border
>>> line, where the dwarf spec is unclear.  So you should not say:
>>> "after all, the DWARF spec is clear that such a range covers no code."
>>>
>>> But there are obviously not only cases where the entry_pc points at
>>> an empty sub-range, but also in very rare cases the entry_pc points at
>>> the end of a non-empty sub-range.
>>> So could you please change the check in dwarf2_addr_in_block_ranges
>>> from addr >= start && addr < end to addr >= start && addr <= end.
>> 
>> Could you expand on why you believe that the DWARF spec is unclear in
>> this regard.  I came to my conclusion based on this text within the
>> DWARF-5 specification, section 2.17.3 Non-Contiguous Address Ranges:
>> 
>>   Bounded range. This kind of entry defines an address range that is
>>   included in the range list. The starting address is the lowest address
>>   of the address range. The ending address is the address of the first
>>   location past the highest address of the address range
>> 
>> This seems pretty clear (to me) that the end address is not part of the
>> region covered by a range.
>> 
>
> Yes, but on the other hand, when we look at line table entries, each has a
> PC and a VIEW number, and even the DW_AT_entry_pc has a DW_AT_GNU_entry_view,
> just the range list does not have a view number, and that is inconsistent
> with the concept of location views.
>
> Consider as a simple example an inline function:
>
> int f(int x)
> {
>  x++;
>  return x;
> }
>
> it will most likely just be compiled into one "inc eax" or similar,
> and of course you may want to set a break point on the return statement,
> to inspect 'x' after the increment, but that will be on 'pc == end' !
>
> But if the location view number would not be missing from the rnglist
> it would be obvious whether the corresponding view number is still within
> subroutine and not outside.  So in my opinion it is a defect in the
> specification that it does not reflect this use case.

You make an interesting argument that the specification is deficient.
But I'm not sure how this helps with this discussion.  I would like to
avoid derailing this conversation with discussion of missing DWARF
features.

>
>> Additionally, if we start to accept 'addr == end' then this is going to
>> cause problems elsewhere.  GDB will place a b/p at the 'end' address,
>> but when GDB then performs block lookup, GDB will not return the block
>> we expect, and so GDB will not report the inferior as having stopped in
>> the scope that the user expects.
>> 
>
> No, because this is exactly what the core of my patch does, admittedly
> I also modified the block lookup code a bit, to handle that case.
> So I strongly disagree here: we have to accept 'addr == end' and other
> corner cases, otherwise my patch won't work in the end, regardless of in
> how many small bug-fixes it can be split up, because it depends exactly
> on not ignoring any information while parsing the debug info.

But accepting 'addr == end' only works if you also change the block
lookup mechanism, which isn't part of this patch.  This patch is based
on the state of block lookup as it exists today.

I've included a patch below which applies on top of this patch (i.e. the
one this thread is about), it changes the check to accept 'addr == end'
as you suggest.  It also updates the test so that an inline function
(bar) has DW_AT_entry_pc point at the 'end' address of a non-empty
sub-range.

Here's a GDB session with that patch applied:

  (gdb) b bar
  Breakpoint 1 at 0x401137
  (gdb) r
  Starting program: /tmp/gdb/testsuite/outputs/gdb.dwarf2/dw2-entry-pc-in-empty-range/dw2-entry-pc-in-empty-range-4 
  
  Breakpoint 1, 0x0000000000401137 in foo ()
  (gdb) maintenance info blocks 
  Blocks at 0x401137:
    from objfile: [(objfile *) 0x3b11720] /tmp/gdb/testsuite/outputs/gdb.dwarf2/dw2-entry-pc-in-empty-range/dw2-entry-pc-in-empty-range-4
  
  [(block *) 0x357e680] 0x401106..0x401185
    entry pc: 0x401106
    is global block
    symbol count: 1
    is contiguous
  [(block *) 0x357e630] 0x401106..0x401185
    entry pc: 0x401106
    is static block
    is contiguous
  [(block *) 0x357e5e0] 0x401106..0x401185
    entry pc: 0x401106
    function: foo
    is contiguous
  (gdb) 

As you can see, GDB stops at an address which doesn't then resolve to
the bar block.

If/when the block lookup code is changed as you propose then this
restriction (addr < end) can be relaxed.  But it doesn't make sense to
merge the relaxed restriction, without the block lookup changes.

And no, I don't see that as a reason to merge all of the changes at
once.  I think splitting the original large change into many small steps
is the correct approach.  And sometimes that will mean that we check
something in, and then revise it in a later commit.  That's not a
problem with this approach, it's an advantage of this approach.  It
makes it clearer how we got to the final destination.  And each step is
smaller, and easier to review.  This is the preferred approach for GDB
patches.

I feel that, as the author of the original large change, you're looking
ahead and you're frustrated that this code isn't inline with how you
feel the code should finally look.  But just because I hope we can check
this code in first, doesn't mean that I will prevent this code being
changed later on.  As GDB evolves (e.g. if the block lookup code
changes) then I'm happy for this code to evolve with it.

To (I hope) offer you some confidence, I have, on my machine, created a
branch with this patch (without the 'addr == end' change), followed by
the next two patches I plan to post (once this is merged), and then, on
top of that, I have rebased your original series.  This includes all of
your original tests completely unmodified.

I have tested this merge with gcc versions 14.2.0, 13.3.0, 12.2.0,
11.5.0, 10.5.0, 9.5.0, 9.3.1, 8.4.0, 8.1.0, and in all cases, all of
your original tests pass.  I'd rather not post this merged branch just
yet, as I'm worried that this might derail review of this patch even
more, I really don't want to start discussing the next patches before
they are even posted, but if it's the only way to move this patch
forward then I could share the branch.  I do plan to make this unified
branch available when I post the next two patches I'd like to upstream,
as I think it will actually help at that point.

My hope is that you will be willing to accept this change on the
understanding that this code might need to be modified in the future.

For my part I also accept that this code might need to change in the
future.

Thanks,
Andrew

---

### Patch to show 'addr == end' doesn't work (yet) ###

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index c178b13d96d..60daf032036 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -11351,7 +11351,7 @@ dwarf2_addr_in_block_ranges (CORE_ADDR addr, struct block *block)
   /* Check if ADDR is within any of the block's sub-ranges.  */
   for (const blockrange &br : block->ranges ())
     {
-      if (addr >= br.start () && addr < br.end ())
+      if (addr >= br.start () && addr <= br.end ())
 	return true;
     }
 
diff --git a/gdb/testsuite/gdb.dwarf2/dw2-entry-pc-in-empty-range.exp b/gdb/testsuite/gdb.dwarf2/dw2-entry-pc-in-empty-range.exp
index 79b1783b2ec..9e4fb781a8d 100644
--- a/gdb/testsuite/gdb.dwarf2/dw2-entry-pc-in-empty-range.exp
+++ b/gdb/testsuite/gdb.dwarf2/dw2-entry-pc-in-empty-range.exp
@@ -179,8 +179,8 @@ proc run_test { entry_label dwarf_version } {
 	     "    $::foo_5\\.\\.$::foo_6"]
 }
 
-foreach_with_prefix entry_label { foo_3 foo_4 } {
-    foreach_with_prefix dwarf_version { 4 5 } {
+foreach_with_prefix entry_label { foo_2 } {
+    foreach_with_prefix dwarf_version { 4 } {
 	run_test $entry_label $dwarf_version
     }
 }
  
Bernd Edlinger Nov. 27, 2024, 8:12 p.m. UTC | #8
On 11/26/24 18:48, Andrew Burgess wrote:
> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
> 
>> On 11/25/24 15:30, Andrew Burgess wrote:
>>> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
>>>
>>>> Okay, I just wanted to point out that in my opinion the debug info which
>>>> points at the end of a sub-range is not incorrect, just maybe on a border
>>>> line, where the dwarf spec is unclear.  So you should not say:
>>>> "after all, the DWARF spec is clear that such a range covers no code."
>>>>
>>>> But there are obviously not only cases where the entry_pc points at
>>>> an empty sub-range, but also in very rare cases the entry_pc points at
>>>> the end of a non-empty sub-range.
>>>> So could you please change the check in dwarf2_addr_in_block_ranges
>>>> from addr >= start && addr < end to addr >= start && addr <= end.
>>>
>>> Could you expand on why you believe that the DWARF spec is unclear in
>>> this regard.  I came to my conclusion based on this text within the
>>> DWARF-5 specification, section 2.17.3 Non-Contiguous Address Ranges:
>>>
>>>   Bounded range. This kind of entry defines an address range that is
>>>   included in the range list. The starting address is the lowest address
>>>   of the address range. The ending address is the address of the first
>>>   location past the highest address of the address range
>>>
>>> This seems pretty clear (to me) that the end address is not part of the
>>> region covered by a range.
>>>
>>
>> Yes, but on the other hand, when we look at line table entries, each has a
>> PC and a VIEW number, and even the DW_AT_entry_pc has a DW_AT_GNU_entry_view,
>> just the range list does not have a view number, and that is inconsistent
>> with the concept of location views.
>>
>> Consider as a simple example an inline function:
>>
>> int f(int x)
>> {
>>  x++;
>>  return x;
>> }
>>
>> it will most likely just be compiled into one "inc eax" or similar,
>> and of course you may want to set a break point on the return statement,
>> to inspect 'x' after the increment, but that will be on 'pc == end' !
>>
>> But if the location view number would not be missing from the rnglist
>> it would be obvious whether the corresponding view number is still within
>> subroutine and not outside.  So in my opinion it is a defect in the
>> specification that it does not reflect this use case.
> 
> You make an interesting argument that the specification is deficient.
> But I'm not sure how this helps with this discussion.  I would like to
> avoid derailing this conversation with discussion of missing DWARF
> features.
> 
>>
>>> Additionally, if we start to accept 'addr == end' then this is going to
>>> cause problems elsewhere.  GDB will place a b/p at the 'end' address,
>>> but when GDB then performs block lookup, GDB will not return the block
>>> we expect, and so GDB will not report the inferior as having stopped in
>>> the scope that the user expects.
>>>
>>
>> No, because this is exactly what the core of my patch does, admittedly
>> I also modified the block lookup code a bit, to handle that case.
>> So I strongly disagree here: we have to accept 'addr == end' and other
>> corner cases, otherwise my patch won't work in the end, regardless of in
>> how many small bug-fixes it can be split up, because it depends exactly
>> on not ignoring any information while parsing the debug info.
> 
> But accepting 'addr == end' only works if you also change the block
> lookup mechanism, which isn't part of this patch.  This patch is based
> on the state of block lookup as it exists today.
> 
> I've included a patch below which applies on top of this patch (i.e. the
> one this thread is about), it changes the check to accept 'addr == end'
> as you suggest.  It also updates the test so that an inline function
> (bar) has DW_AT_entry_pc point at the 'end' address of a non-empty
> sub-range.
> 
> Here's a GDB session with that patch applied:
> 
>   (gdb) b bar
>   Breakpoint 1 at 0x401137
>   (gdb) r
>   Starting program: /tmp/gdb/testsuite/outputs/gdb.dwarf2/dw2-entry-pc-in-empty-range/dw2-entry-pc-in-empty-range-4 
>   
>   Breakpoint 1, 0x0000000000401137 in foo ()
>   (gdb) maintenance info blocks 
>   Blocks at 0x401137:
>     from objfile: [(objfile *) 0x3b11720] /tmp/gdb/testsuite/outputs/gdb.dwarf2/dw2-entry-pc-in-empty-range/dw2-entry-pc-in-empty-range-4
>   
>   [(block *) 0x357e680] 0x401106..0x401185
>     entry pc: 0x401106
>     is global block
>     symbol count: 1
>     is contiguous
>   [(block *) 0x357e630] 0x401106..0x401185
>     entry pc: 0x401106
>     is static block
>     is contiguous
>   [(block *) 0x357e5e0] 0x401106..0x401185
>     entry pc: 0x401106
>     function: foo
>     is contiguous
>   (gdb) 
> 
> As you can see, GDB stops at an address which doesn't then resolve to
> the bar block.
> 

Ah, okay, thanks for the test case I looked into it and tried it out with
a gdb built with my patch.

A breakpoint an an empty subrange works as expected, but indeed a breakpoint
at the end of a non-empty subrange does behave as you described.
But with real code examples a break point at the end of a non-empty subrange
does work, and that is because my "heuristic" which does the magic, uses
weak line-table entries near the end of a subrange to determine what to do
here.  So because the test case does not have a line table at all, the
test case is not realistic in that aspect, and does not tell you what happens
here in a real world example.

So a patch that allows for start <= addr && addr <= end should be applied,
probably with a line table that looks more like a real line table especially
at the end of a subrange, then the test should probably work out of the box.
But it is okay to do that after the rest of the series is applied.



> If/when the block lookup code is changed as you propose then this
> restriction (addr < end) can be relaxed.  But it doesn't make sense to
> merge the relaxed restriction, without the block lookup changes.
> 
> And no, I don't see that as a reason to merge all of the changes at
> once.  I think splitting the original large change into many small steps
> is the correct approach.  And sometimes that will mean that we check
> something in, and then revise it in a later commit.  That's not a
> problem with this approach, it's an advantage of this approach.  It
> makes it clearer how we got to the final destination.  And each step is
> smaller, and easier to review.  This is the preferred approach for GDB
> patches.
> 
> I feel that, as the author of the original large change, you're looking
> ahead and you're frustrated that this code isn't inline with how you
> feel the code should finally look.  But just because I hope we can check
> this code in first, doesn't mean that I will prevent this code being
> changed later on.  As GDB evolves (e.g. if the block lookup code
> changes) then I'm happy for this code to evolve with it.
> 
> To (I hope) offer you some confidence, I have, on my machine, created a
> branch with this patch (without the 'addr == end' change), followed by
> the next two patches I plan to post (once this is merged), and then, on
> top of that, I have rebased your original series.  This includes all of
> your original tests completely unmodified.
> 
> I have tested this merge with gcc versions 14.2.0, 13.3.0, 12.2.0,
> 11.5.0, 10.5.0, 9.5.0, 9.3.1, 8.4.0, 8.1.0, and in all cases, all of
> your original tests pass.  I'd rather not post this merged branch just
> yet, as I'm worried that this might derail review of this patch even
> more, I really don't want to start discussing the next patches before
> they are even posted, but if it's the only way to move this patch
> forward then I could share the branch.  I do plan to make this unified
> branch available when I post the next two patches I'd like to upstream,
> as I think it will actually help at that point.
> 
> My hope is that you will be willing to accept this change on the
> understanding that this code might need to be modified in the future.
> 
> For my part I also accept that this code might need to change in the
> future.
> 


Well okay, please go ahead.


Thanks
Bernd.

> Thanks,
> Andrew
> 
> ---
> 
> ### Patch to show 'addr == end' doesn't work (yet) ###
> 
> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
> index c178b13d96d..60daf032036 100644
> --- a/gdb/dwarf2/read.c
> +++ b/gdb/dwarf2/read.c
> @@ -11351,7 +11351,7 @@ dwarf2_addr_in_block_ranges (CORE_ADDR addr, struct block *block)
>    /* Check if ADDR is within any of the block's sub-ranges.  */
>    for (const blockrange &br : block->ranges ())
>      {
> -      if (addr >= br.start () && addr < br.end ())
> +      if (addr >= br.start () && addr <= br.end ())
>  	return true;
>      }
>  
> diff --git a/gdb/testsuite/gdb.dwarf2/dw2-entry-pc-in-empty-range.exp b/gdb/testsuite/gdb.dwarf2/dw2-entry-pc-in-empty-range.exp
> index 79b1783b2ec..9e4fb781a8d 100644
> --- a/gdb/testsuite/gdb.dwarf2/dw2-entry-pc-in-empty-range.exp
> +++ b/gdb/testsuite/gdb.dwarf2/dw2-entry-pc-in-empty-range.exp
> @@ -179,8 +179,8 @@ proc run_test { entry_label dwarf_version } {
>  	     "    $::foo_5\\.\\.$::foo_6"]
>  }
>  
> -foreach_with_prefix entry_label { foo_3 foo_4 } {
> -    foreach_with_prefix dwarf_version { 4 5 } {
> +foreach_with_prefix entry_label { foo_2 } {
> +    foreach_with_prefix dwarf_version { 4 } {
>  	run_test $entry_label $dwarf_version
>      }
>  }
>
  
Andrew Burgess Nov. 28, 2024, 10:10 a.m. UTC | #9
Bernd Edlinger <bernd.edlinger@hotmail.de> writes:

> On 11/26/24 18:48, Andrew Burgess wrote:
>> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
>> 
>>> On 11/25/24 15:30, Andrew Burgess wrote:
>>>> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
>>>>
>>>>> Okay, I just wanted to point out that in my opinion the debug info which
>>>>> points at the end of a sub-range is not incorrect, just maybe on a border
>>>>> line, where the dwarf spec is unclear.  So you should not say:
>>>>> "after all, the DWARF spec is clear that such a range covers no code."
>>>>>
>>>>> But there are obviously not only cases where the entry_pc points at
>>>>> an empty sub-range, but also in very rare cases the entry_pc points at
>>>>> the end of a non-empty sub-range.
>>>>> So could you please change the check in dwarf2_addr_in_block_ranges
>>>>> from addr >= start && addr < end to addr >= start && addr <= end.
>>>>
>>>> Could you expand on why you believe that the DWARF spec is unclear in
>>>> this regard.  I came to my conclusion based on this text within the
>>>> DWARF-5 specification, section 2.17.3 Non-Contiguous Address Ranges:
>>>>
>>>>   Bounded range. This kind of entry defines an address range that is
>>>>   included in the range list. The starting address is the lowest address
>>>>   of the address range. The ending address is the address of the first
>>>>   location past the highest address of the address range
>>>>
>>>> This seems pretty clear (to me) that the end address is not part of the
>>>> region covered by a range.
>>>>
>>>
>>> Yes, but on the other hand, when we look at line table entries, each has a
>>> PC and a VIEW number, and even the DW_AT_entry_pc has a DW_AT_GNU_entry_view,
>>> just the range list does not have a view number, and that is inconsistent
>>> with the concept of location views.
>>>
>>> Consider as a simple example an inline function:
>>>
>>> int f(int x)
>>> {
>>>  x++;
>>>  return x;
>>> }
>>>
>>> it will most likely just be compiled into one "inc eax" or similar,
>>> and of course you may want to set a break point on the return statement,
>>> to inspect 'x' after the increment, but that will be on 'pc == end' !
>>>
>>> But if the location view number would not be missing from the rnglist
>>> it would be obvious whether the corresponding view number is still within
>>> subroutine and not outside.  So in my opinion it is a defect in the
>>> specification that it does not reflect this use case.
>> 
>> You make an interesting argument that the specification is deficient.
>> But I'm not sure how this helps with this discussion.  I would like to
>> avoid derailing this conversation with discussion of missing DWARF
>> features.
>> 
>>>
>>>> Additionally, if we start to accept 'addr == end' then this is going to
>>>> cause problems elsewhere.  GDB will place a b/p at the 'end' address,
>>>> but when GDB then performs block lookup, GDB will not return the block
>>>> we expect, and so GDB will not report the inferior as having stopped in
>>>> the scope that the user expects.
>>>>
>>>
>>> No, because this is exactly what the core of my patch does, admittedly
>>> I also modified the block lookup code a bit, to handle that case.
>>> So I strongly disagree here: we have to accept 'addr == end' and other
>>> corner cases, otherwise my patch won't work in the end, regardless of in
>>> how many small bug-fixes it can be split up, because it depends exactly
>>> on not ignoring any information while parsing the debug info.
>> 
>> But accepting 'addr == end' only works if you also change the block
>> lookup mechanism, which isn't part of this patch.  This patch is based
>> on the state of block lookup as it exists today.
>> 
>> I've included a patch below which applies on top of this patch (i.e. the
>> one this thread is about), it changes the check to accept 'addr == end'
>> as you suggest.  It also updates the test so that an inline function
>> (bar) has DW_AT_entry_pc point at the 'end' address of a non-empty
>> sub-range.
>> 
>> Here's a GDB session with that patch applied:
>> 
>>   (gdb) b bar
>>   Breakpoint 1 at 0x401137
>>   (gdb) r
>>   Starting program: /tmp/gdb/testsuite/outputs/gdb.dwarf2/dw2-entry-pc-in-empty-range/dw2-entry-pc-in-empty-range-4 
>>   
>>   Breakpoint 1, 0x0000000000401137 in foo ()
>>   (gdb) maintenance info blocks 
>>   Blocks at 0x401137:
>>     from objfile: [(objfile *) 0x3b11720] /tmp/gdb/testsuite/outputs/gdb.dwarf2/dw2-entry-pc-in-empty-range/dw2-entry-pc-in-empty-range-4
>>   
>>   [(block *) 0x357e680] 0x401106..0x401185
>>     entry pc: 0x401106
>>     is global block
>>     symbol count: 1
>>     is contiguous
>>   [(block *) 0x357e630] 0x401106..0x401185
>>     entry pc: 0x401106
>>     is static block
>>     is contiguous
>>   [(block *) 0x357e5e0] 0x401106..0x401185
>>     entry pc: 0x401106
>>     function: foo
>>     is contiguous
>>   (gdb) 
>> 
>> As you can see, GDB stops at an address which doesn't then resolve to
>> the bar block.
>> 
>
> Ah, okay, thanks for the test case I looked into it and tried it out with
> a gdb built with my patch.
>
> A breakpoint an an empty subrange works as expected, but indeed a breakpoint
> at the end of a non-empty subrange does behave as you described.
> But with real code examples a break point at the end of a non-empty subrange
> does work, and that is because my "heuristic" which does the magic, uses
> weak line-table entries near the end of a subrange to determine what to do
> here.  So because the test case does not have a line table at all, the
> test case is not realistic in that aspect, and does not tell you what happens
> here in a real world example.
>
> So a patch that allows for start <= addr && addr <= end should be applied,
> probably with a line table that looks more like a real line table especially
> at the end of a subrange, then the test should probably work out of the box.
> But it is okay to do that after the rest of the series is applied.

Could you include the details for what you'd like the line table to look
like please, I guess from the test program you were trying.  I'd like to
try and get the updated test right first time!  Ideally I guess I'd need
the `maint info blocks` output showing the inline function block, and
the `maint info line-table` output for the line table that covers the
inline and its containing function.

I'll get starting using my own demo programs, but I'd really like to
meet your expectations on this.

Thanks,
Andrew
  

Patch

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 5f0b0d4e5d6..c178b13d96d 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -11338,6 +11338,28 @@  dwarf2_die_base_address (struct die_info *die, struct block *block,
   return {};
 }
 
+/* Return true if ADDR is within any of the ranges covered by BLOCK.  If
+   there are no sub-ranges then just check against the block's start and
+   end addresses, otherwise, check each sub-range covered by the block.  */
+
+static bool
+dwarf2_addr_in_block_ranges (CORE_ADDR addr, struct block *block)
+{
+  if (block->ranges ().size () == 0)
+    return addr >= block->start () && addr < block->end ();
+
+  /* Check if ADDR is within any of the block's sub-ranges.  */
+  for (const blockrange &br : block->ranges ())
+    {
+      if (addr >= br.start () && addr < br.end ())
+	return true;
+    }
+
+  /* ADDR is not within any of the block's sub-ranges.  */
+  return false;
+}
+
+
 /* Set the entry PC for BLOCK which represents DIE from CU.  Relies on the
    range information (if present) already having been read from DIE and
    stored into BLOCK.  */
@@ -11398,7 +11420,7 @@  dwarf2_record_block_entry_pc (struct die_info *die, struct block *block,
 
 	 To avoid this, ignore entry-pc values that are outside the block's
 	 range, GDB will then select a suitable default entry-pc.  */
-      if (entry_pc >= block->start () && entry_pc < block->end ())
+      if (dwarf2_addr_in_block_ranges (entry_pc, block))
 	block->set_entry_pc (entry_pc);
       else
 	complaint (_("in %s, DIE %s, DW_AT_entry_pc (%s) outside "
diff --git a/gdb/testsuite/gdb.cp/step-and-next-inline.exp b/gdb/testsuite/gdb.cp/step-and-next-inline.exp
index af1719dc53a..e16c2cca82d 100644
--- a/gdb/testsuite/gdb.cp/step-and-next-inline.exp
+++ b/gdb/testsuite/gdb.cp/step-and-next-inline.exp
@@ -55,6 +55,32 @@  proc do_test { use_header } {
 
     set main_location [gdb_get_line_number "Beginning of main" $srcfile]
 
+    if {![runto_main]} {
+	return
+    }
+
+    gdb_breakpoint tree_check
+
+    # Check that GDB can correctly stop in `tree_check`.  On some
+    # targets. gcc will use DW_AT_ranges to represent the addresses of
+    # tree_check, and in some cases, will create an empty sub-range
+    # for some of the tree_check code.  To really confuse things, gcc
+    # will then set the DW_AT_entry_pc to point at the address of the
+    # empty sub-range.
+    #
+    # The result of this is that GDB would stop at the DW_AT_entry_pc,
+    # but then GDB would fail to realise that this address was inside
+    # tree_check.
+    for { set i 1 } { $i < 4 } { incr i } {
+	gdb_test "continue" \
+	    [multi_line \
+		 "Breakpoint $::decimal\\.$i, (?:$::hex in )?tree_check \\(\[^\r\n\]+\\) at \[^\r\n\]+/$hdrfile:$::decimal" \
+		 "$::decimal\\s+\[^\r\n\]+"] \
+	    "stop at tree_check, $i"
+    }
+
+    clean_restart $executable
+
     if ![runto $main_location qualified] {
 	return
     }
diff --git a/gdb/testsuite/gdb.dwarf2/dw2-entry-pc-in-empty-range.c b/gdb/testsuite/gdb.dwarf2/dw2-entry-pc-in-empty-range.c
new file mode 100644
index 00000000000..cf43727f1fb
--- /dev/null
+++ b/gdb/testsuite/gdb.dwarf2/dw2-entry-pc-in-empty-range.c
@@ -0,0 +1,57 @@ 
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+volatile int global_var = 0;
+
+void
+foo (void)	/* foo decl line */
+{
+  /* This label is used to find the start of 'foo' when generating the
+     debug information.  Place nothing before it.  */
+  asm ("foo_label: .globl foo_label");
+  ++global_var;
+
+  asm ("foo_0: .globl foo_0");
+  ++global_var;		/* bar call line */
+
+  asm ("foo_1: .globl foo_1");
+  ++global_var;
+
+  asm ("foo_2: .globl foo_2");
+  ++global_var;
+
+  asm ("foo_3: .globl foo_3");
+  ++global_var;
+
+  asm ("foo_4: .globl foo_4");
+  ++global_var;
+
+  asm ("foo_5: .globl foo_5");
+  ++global_var;
+
+  asm ("foo_6: .globl foo_6");
+  ++global_var;
+
+  asm ("foo_7: .globl foo_7");
+}
+
+int
+main (void)
+{
+  asm ("main_label: .globl main_label");
+  foo ();
+}
diff --git a/gdb/testsuite/gdb.dwarf2/dw2-entry-pc-in-empty-range.exp b/gdb/testsuite/gdb.dwarf2/dw2-entry-pc-in-empty-range.exp
new file mode 100644
index 00000000000..79b1783b2ec
--- /dev/null
+++ b/gdb/testsuite/gdb.dwarf2/dw2-entry-pc-in-empty-range.exp
@@ -0,0 +1,186 @@ 
+# Copyright 2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Create an inline function which uses DW_AT_ranges and which has a
+# DW_AT_entry_pc.
+#
+# Within the function's ranges, create an empty sub-range, many
+# versions of gcc (8.x to at least 14.x) do this, and point the
+# DW_AT_entry_pc at this empty sub-range (at last 8.x to 9.x did
+# this).
+#
+# Now place a breakpoint on the inline function and run to the
+# breakpoint, check that GDB reports we are inside the inline
+# function.
+#
+# At one point GDB would use the entry-pc value as the breakpoint
+# location even though that address is not actually associated with
+# the inline function.  Now GDB will reject the entry-pc value and
+# select a suitable default entry-pc value instead, one which is
+# associated with the inline function.
+
+load_lib dwarf.exp
+
+require dwarf2_support
+
+standard_testfile
+
+# This compiles the source file and starts and stops GDB, so run it
+# before calling prepare_for_testing otherwise GDB will have exited.
+get_func_info foo
+
+if { [prepare_for_testing "failed to prepare" ${testfile} \
+	  [list ${srcfile}]] } {
+    return
+}
+
+if ![runto_main] {
+    return
+}
+
+# Some label addresses, needed to match against the output later.
+foreach foo {foo_1 foo_2 foo_3 foo_4 foo_5 foo_6} {
+    set $foo [get_hexadecimal_valueof "&$foo" "UNKNOWN" \
+		  "get address for $foo label"]
+}
+
+# Some line numbers needed in the generated DWARF.
+set foo_decl_line [gdb_get_line_number "foo decl line"]
+set bar_call_line [gdb_get_line_number "bar call line"]
+
+if [is_ilp32_target] {
+    set ptr_type "data4"
+} else {
+    set ptr_type "data8"
+}
+
+# Setup the fake DWARF (see comment at top of this file for more
+# details).  Use DWARF_VERSION (either 4 or 5) to select which type of
+# ranges are created.  Compile the source and generated DWARF and run
+# the test.
+#
+# The ENTRY_LABEL is the label to use as the entry-pc value.  The
+# useful choices are 'foo_3', this label is for an empty sub-range, or
+# 'foo_4', this label is within the blocks low/high addresses, but is
+# not in any sub-range for the block at all.
+#
+# The 'foo_4' case is not something that has been seen generated by
+# any compiler, but it doesn't hurt to test.
+
+proc run_test { entry_label dwarf_version } {
+    set dw_testname "${::testfile}-${dwarf_version}"
+
+    set asm_file [standard_output_file "${dw_testname}.S"]
+    Dwarf::assemble $asm_file {
+	upvar dwarf_version dwarf_version
+	upvar entry_label entry_label
+
+	declare_labels lines_table inline_func ranges_label
+
+	cu { version $dwarf_version } {
+	    compile_unit {
+		{producer "gcc"}
+		{language @DW_LANG_C}
+		{name $::srcfile}
+		{comp_dir /tmp}
+		{stmt_list $lines_table DW_FORM_sec_offset}
+		{low_pc 0 addr}
+	    } {
+		inline_func: subprogram {
+		    {name bar}
+		    {inline @DW_INL_declared_inlined}
+		}
+		subprogram {
+		    {name foo}
+		    {decl_file 1 data1}
+		    {decl_line $::foo_decl_line data1}
+		    {decl_column 1 data1}
+		    {low_pc $::foo_start addr}
+		    {high_pc $::foo_len $::ptr_type}
+		    {external 1 flag}
+		} {
+		    inlined_subroutine {
+			{abstract_origin %$inline_func}
+			{call_file 1 data1}
+			{call_line $::bar_call_line data1}
+			{entry_pc $entry_label addr}
+			{ranges ${ranges_label} DW_FORM_sec_offset}
+		    }
+		}
+	    }
+	}
+
+	lines {version 2} lines_table {
+	    include_dir "$::srcdir/$::subdir"
+	    file_name "$::srcfile" 1
+	}
+
+	if { $dwarf_version == 5 } {
+	    rnglists {} {
+		table {} {
+		    ranges_label: list_ {
+			start_end foo_3 foo_3
+			start_end foo_1 foo_2
+			start_end foo_5 foo_6
+		    }
+		}
+	    }
+	} else {
+	    ranges { } {
+		ranges_label: sequence {
+		    range foo_3 foo_3
+		    range foo_1 foo_2
+		    range foo_5 foo_6
+		}
+	    }
+	}
+    }
+
+    if {[prepare_for_testing "failed to prepare" "${dw_testname}" \
+	     [list $::srcfile $asm_file] {nodebug}]} {
+	return false
+    }
+
+    if ![runto_main] {
+	return false
+    }
+
+    # Place a breakpoint on `bar` and run to the breakpoint.  Use
+    # gdb_test as we want full pattern matching against the stop
+    # location.
+    gdb_breakpoint bar
+    gdb_test "continue" \
+	"Breakpoint $::decimal, $::hex in bar \\(\\)"
+
+    # Inspect the block structure of `bar` at this location.  We are
+    # expecting that the empty range (that contained the entry-pc) has
+    # been removed from the block, and that the entry-pc has its
+    # default value.
+    gdb_test "maint info blocks" \
+	[multi_line \
+	     "\\\[\\(block \\*\\) $::hex\\\] $::foo_1\\.\\.$::foo_6" \
+	     "  entry pc: $::foo_1" \
+	     "  inline function: bar" \
+	     "  symbol count: $::decimal" \
+	     "  address ranges:" \
+	     "    $::foo_1\\.\\.$::foo_2" \
+	     "    $::foo_5\\.\\.$::foo_6"]
+}
+
+foreach_with_prefix entry_label { foo_3 foo_4 } {
+    foreach_with_prefix dwarf_version { 4 5 } {
+	run_test $entry_label $dwarf_version
+    }
+}