[v2] dwarf_getaranges: Build aranges list from CUs instead of .debug_aranges

Message ID 20231211231853.116254-1-amerey@redhat.com
State Superseded
Headers
Series [v2] dwarf_getaranges: Build aranges list from CUs instead of .debug_aranges |

Commit Message

Aaron Merey Dec. 11, 2023, 11:18 p.m. UTC
  No longer use .debug_aranges to build the aranges list since it could be
absent or incomplete.

Instead build the aranges list by iterating over each CU and recording
each address range.

https://sourceware.org/bugzilla/show_bug.cgi?id=22288
https://sourceware.org/bugzilla/show_bug.cgi?id=30948

Signed-off-by: Aaron Merey <amerey@redhat.com>
---

v2 adds a test for generating aranges from a binary with no
.debug_aranges.

This patch's method of building the aranges list is slower than simply
reading .debug_aranges.  On my machine, running eu-stack on a 2.9G
firefox core file takes about 8.7 seconds with this patch applied,
compared to about 3.3 seconds without this patch.

Ideally we could assume that .debug_aranges is complete if it is present
and build the aranges list via CU iteration only when .debug_aranges
is absent.  This would let us save time on gcc-compiled binaries, which
include complete .debug_aranges by default.

However the DWARF spec appears to permit partially complete
.debug_aranges [1].  We could improve performance by starting with a
potentially incomplete list built from .debug_aranges.  If a lookup
fails then search the CUs for missing aranges and add to the list
when found.

This approach would complicate the dwarf_get_aranges interface.  The
list it initially provides could no longer be assumed to be complete.
The number of elements in the list could change during calls to
dwarf_getarange{info, _addr}.  This would invalidate the naranges value
set by dwarf_getaranges.  The current API doesn't include a way to
communicate to the caller when narages changes and by how much.

Due to these complications I think it's better to simply ignore
.debug_aranges altogether and build the aranges table via CU iteration,
as is done in this patch.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=22288#c5

 libdw/dwarf_getaranges.c | 215 ++++++++++-----------------------------
 tests/run-get-aranges.sh |  48 +++++++++
 2 files changed, 100 insertions(+), 163 deletions(-)
  

Comments

Mark Wielaard Feb. 13, 2024, 1:28 p.m. UTC | #1
Hi Aaron,

On Mon, 2023-12-11 at 18:18 -0500, Aaron Merey wrote:
> No longer use .debug_aranges to build the aranges list since it could be
> absent or incomplete.
> 
> Instead build the aranges list by iterating over each CU and recording
> each address range.
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=22288
> https://sourceware.org/bugzilla/show_bug.cgi?id=30948
> 
> Signed-off-by: Aaron Merey <amerey@redhat.com>
> ---
> 
> v2 adds a test for generating aranges from a binary with no
> .debug_aranges.
> 
> This patch's method of building the aranges list is slower than simply
> reading .debug_aranges.  On my machine, running eu-stack on a 2.9G
> firefox core file takes about 8.7 seconds with this patch applied,
> compared to about 3.3 seconds without this patch.

That is significant. 2.5 times slower.
Did you check with perf or some other profiler where exactly the extra
time goes. Does the new method find more aranges (and so produces
"better" backtraces)?

> Ideally we could assume that .debug_aranges is complete if it is present
> and build the aranges list via CU iteration only when .debug_aranges
> is absent.  This would let us save time on gcc-compiled binaries, which
> include complete .debug_aranges by default.

Right. This why the question is if the firefox case sees more/less
aranges. If I remember correctly it is build with gcc and rustc, and
rustc might not produce .debug_aranges.

> However the DWARF spec appears to permit partially complete
> .debug_aranges [1].  We could improve performance by starting with a
> potentially incomplete list built from .debug_aranges.  If a lookup
> fails then search the CUs for missing aranges and add to the list
> when found.
> 
> This approach would complicate the dwarf_get_aranges interface.  The
> list it initially provides could no longer be assumed to be complete.
> The number of elements in the list could change during calls to
> dwarf_getarange{info, _addr}.  This would invalidate the naranges value
> set by dwarf_getaranges.  The current API doesn't include a way to
> communicate to the caller when narages changes and by how much.
> 
> Due to these complications I think it's better to simply ignore
> .debug_aranges altogether and build the aranges table via CU iteration,
> as is done in this patch.

Might it be an idea to leave dwarf_getaranges as it is and introduce a
new (internal) function to get "dynamic" ranges? It looks like what
programs (like eu-stack and eu-addr2line) really use is dwarf_addrdie
and dwfl_module_addrdie. These are currently build on dwarf_getaranges,
but could maybe use a new interface?

> [1] https://sourceware.org/bugzilla/show_bug.cgi?id=22288#c5

So this comment says that "parsing CUs lightly (just enough to get
their CU ranges) should be fairly cheap", but it seems that isn't
really true. Or at least parsing .debug_aranges is a lot (2.5 times)
faster (measured in seconds).

It would be good to better understand why this is.

Cheers,

Mark
  
Aaron Merey Feb. 20, 2024, 4:20 a.m. UTC | #2
Hi Mark,

On Tue, Feb 13, 2024 at 8:28 AM Mark Wielaard <mark@klomp.org> wrote:
>
> > This patch's method of building the aranges list is slower than simply
> > reading .debug_aranges.  On my machine, running eu-stack on a 2.9G
> > firefox core file takes about 8.7 seconds with this patch applied,
> > compared to about 3.3 seconds without this patch.
>
> That is significant. 2.5 times slower.
> Did you check with perf or some other profiler where exactly the extra
> time goes. Does the new method find more aranges (and so produces
> "better" backtraces)?

I took another look at the performance and realized I made a silly
mistake when I originally tested this.  My build that was 2.5x slower
was compiled with -O0 but I tested it against an -O2 build.  Oops!

With the optimization level set to -O2 in all cases, the runtime of
'eu-stack -s' on the original 2.9G firefox core file is only about
9% slower: 3.6 seconds with the patch applied compared to 3.3
seconds without the patch.

As for the number of aranges found, there is a difference for libxul.so:
250435 with the patch compared to 254832 without.  So 4397 fewer aranges
are found when using the new CU iteration method.  I'll dig into this and
see if there is a problem or if it's just due to some redundancy in
libxul's .debug_aranges.  FWIW there was no change to the aranges counts
for the other modules searched during this eu-stack firefox corefile test.

>
> > Ideally we could assume that .debug_aranges is complete if it is present
> > and build the aranges list via CU iteration only when .debug_aranges
> > is absent.  This would let us save time on gcc-compiled binaries, which
> > include complete .debug_aranges by default.
>
> Right. This why the question is if the firefox case sees more/less
> aranges. If I remember correctly it is build with gcc and rustc, and
> rustc might not produce .debug_aranges.
>
> > However the DWARF spec appears to permit partially complete
> > .debug_aranges [1].  We could improve performance by starting with a
> > potentially incomplete list built from .debug_aranges.  If a lookup
> > fails then search the CUs for missing aranges and add to the list
> > when found.
> >
> > This approach would complicate the dwarf_get_aranges interface.  The
> > list it initially provides could no longer be assumed to be complete.
> > The number of elements in the list could change during calls to
> > dwarf_getarange{info, _addr}.  This would invalidate the naranges value
> > set by dwarf_getaranges.  The current API doesn't include a way to
> > communicate to the caller when narages changes and by how much.
> >
> > Due to these complications I think it's better to simply ignore
> > .debug_aranges altogether and build the aranges table via CU iteration,
> > as is done in this patch.
>
> Might it be an idea to leave dwarf_getaranges as it is and introduce a
> new (internal) function to get "dynamic" ranges? It looks like what
> programs (like eu-stack and eu-addr2line) really use is dwarf_addrdie
> and dwfl_module_addrdie. These are currently build on dwarf_getaranges,
> but could maybe use a new interface?

IMO this depends on what users expect from dwarf_getaranges.  Do they
want the exact contents of .debug_aranges (whether or not it's complete)
or should dwarf_getaranges go beyond .debug_aranges to ensure the most
complete results?

The comment for dwarf_getaranges in libdw.h simply reads "Return list
address ranges".  Since there's no mention of .debug_aranges specifically,
I think it's fair if dwarf_getaranges does whatever it can to ensure
comprehensive results.  In which case dwarf_getaranges should probably
dynamically generate aranges.

Aaron
  
Mark Wielaard Feb. 20, 2024, 10:23 p.m. UTC | #3
Hi Aaron,

We already discussed on irc, but just for the record.

On Mon, Feb 19, 2024 at 11:20:13PM -0500, Aaron Merey wrote:
> On Tue, Feb 13, 2024 at 8:28 AM Mark Wielaard <mark@klomp.org> wrote:
> >
> > > This patch's method of building the aranges list is slower than simply
> > > reading .debug_aranges.  On my machine, running eu-stack on a 2.9G
> > > firefox core file takes about 8.7 seconds with this patch applied,
> > > compared to about 3.3 seconds without this patch.
> >
> > That is significant. 2.5 times slower.
> > Did you check with perf or some other profiler where exactly the extra
> > time goes. Does the new method find more aranges (and so produces
> > "better" backtraces)?
> 
> I took another look at the performance and realized I made a silly
> mistake when I originally tested this.  My build that was 2.5x slower
> was compiled with -O0 but I tested it against an -O2 build.  Oops!
> 
> With the optimization level set to -O2 in all cases, the runtime of
> 'eu-stack -s' on the original 2.9G firefox core file is only about
> 9% slower: 3.6 seconds with the patch applied compared to 3.3
> seconds without the patch.

OK, still a slowdown, but 9% is much more reasonable given we are
doing more work now. Good.

> As for the number of aranges found, there is a difference for libxul.so:
> 250435 with the patch compared to 254832 without.  So 4397 fewer aranges
> are found when using the new CU iteration method.  I'll dig into this and
> see if there is a problem or if it's just due to some redundancy in
> libxul's .debug_aranges.  FWIW there was no change to the aranges counts
> for the other modules searched during this eu-stack firefox corefile test.

A quick way to see where the differences are is using
eu-readelf --debug-dump=decodedaranges before/after your patch.

This is opposite to what I expected. I had expected there to be more,
instead of less ranges. The difference is less than 2%. But still
interesting to know what/why.

Were there any differences in the backtraces? If not, then those
ranges might not actually have been mapping to code.

> > Might it be an idea to leave dwarf_getaranges as it is and introduce a
> > new (internal) function to get "dynamic" ranges? It looks like what
> > programs (like eu-stack and eu-addr2line) really use is dwarf_addrdie
> > and dwfl_module_addrdie. These are currently build on dwarf_getaranges,
> > but could maybe use a new interface?
> 
> IMO this depends on what users expect from dwarf_getaranges.  Do they
> want the exact contents of .debug_aranges (whether or not it's complete)
> or should dwarf_getaranges go beyond .debug_aranges to ensure the most
> complete results?
> 
> The comment for dwarf_getaranges in libdw.h simply reads "Return list
> address ranges".  Since there's no mention of .debug_aranges specifically,
> I think it's fair if dwarf_getaranges does whatever it can to ensure
> comprehensive results.  In which case dwarf_getaranges should probably
> dynamically generate aranges.

You might be right that no user really cares. But as seen in the
eu-readelf code, it might also be that people expected it to map to
the ranges from .debug_aranges.

So I would be happier if we just kept the dwarf_getaranges code as
is. And just change the code in dwarf_addrdie and dwfl_module_addrdie.

We could then also introduce a new public function, dwarf_getdieranges
(?) that does the new thing. But it doesn't have to be public on the
first try as long as dwarf_addrdie and dwfl_module_addrdie work. (We
might want to change the interface of dwarf_getdieranges so it can be
"lazy" for example.)

Cheers,

Mark
  
Aaron Merey Feb. 22, 2024, 3:19 a.m. UTC | #4
Hi Mark,

On Tue, Feb 20, 2024 at 5:23 PM Mark Wielaard <mark@klomp.org> wrote:
>
> > As for the number of aranges found, there is a difference for libxul.so:
> > 250435 with the patch compared to 254832 without.  So 4397 fewer aranges
> > are found when using the new CU iteration method.  I'll dig into this and
> > see if there is a problem or if it's just due to some redundancy in
> > libxul's .debug_aranges.  FWIW there was no change to the aranges counts
> > for the other modules searched during this eu-stack firefox corefile test.
>
> A quick way to see where the differences are is using
> eu-readelf --debug-dump=decodedaranges before/after your patch.
>
> This is opposite to what I expected. I had expected there to be more,
> instead of less ranges. The difference is less than 2%. But still
> interesting to know what/why.
>
> Were there any differences in the backtraces? If not, then those
> ranges might not actually have been mapping to code.

The backtraces were identical.  I took a closer look at this and the
difference is due to clang including DW_AT_location addresses in
.debug_aranges.  This patch just looks for DW_AT_{high,low}_pc and
DW_AT_ranges (via dwarf_ranges) when generating the aranges list,
so these DW_AT_location addresses aren't included.

According to David Blaikie [1], "GCC does not include globals in
debug_aranges, so they probably can't be relied upon unfortunately
(unfortunate that Clang pays the cost when it emits them, but consumers
can't rely on them)".

Since consumers already can't assume that .debug_aranges includes
DW_AT_location addresses, I think it's reasonable for us to not include
them when dynamically generating aranges.  Especially since there could
be a noticeable performance cost to doing so.

A separate issue I saw is that some entries in .debug_aranges with
a decoded start address of "000000000000000000" wouldn't always
have matching entries in .debug_ranges.  This resulted in some dynamic
aranges not matching their corresponding .debug_aranges entry.

For example, the following decoded arange is in libxul's .debug_aranges:

    start: 000000000000000000, length:    13, CU DIE offset: 95189496

In the .debug_info for this CU, there is a corresponding subprogram with
low_pc 0 and high_pc 13.

However in the .debug_ranges entry for this CU, there is no range starting
at address 0 with size of at least 13.  The closest range list entry is:

    range 1, 1
    +0x0000000000000001 <__ehdr_start+0x1>..
    +000000000000000000 <libxul.so>

When we dynamically generate aranges this results in the following
decoded arange:

    start: 0x0000000000000001, length:     0, CU DIE offset: 95189496

So the start address is off by 1 and the length doesn't match the
.debug_aranges entry.  In some similar cases there wouldn't even be
a corresponding "range 1, 1" entry in .debug_ranges.  I'm not yet sure
what's going on here.

>
> > > Might it be an idea to leave dwarf_getaranges as it is and introduce a
> > > new (internal) function to get "dynamic" ranges? It looks like what
> > > programs (like eu-stack and eu-addr2line) really use is dwarf_addrdie
> > > and dwfl_module_addrdie. These are currently build on dwarf_getaranges,
> > > but could maybe use a new interface?
> >
> > IMO this depends on what users expect from dwarf_getaranges.  Do they
> > want the exact contents of .debug_aranges (whether or not it's complete)
> > or should dwarf_getaranges go beyond .debug_aranges to ensure the most
> > complete results?
> >
> > The comment for dwarf_getaranges in libdw.h simply reads "Return list
> > address ranges".  Since there's no mention of .debug_aranges specifically,
> > I think it's fair if dwarf_getaranges does whatever it can to ensure
> > comprehensive results.  In which case dwarf_getaranges should probably
> > dynamically generate aranges.
>
> You might be right that no user really cares. But as seen in the
> eu-readelf code, it might also be that people expected it to map to
> the ranges from .debug_aranges.
>
> So I would be happier if we just kept the dwarf_getaranges code as
> is. And just change the code in dwarf_addrdie and dwfl_module_addrdie.
>
> We could then also introduce a new public function, dwarf_getdieranges
> (?) that does the new thing. But it doesn't have to be public on the
> first try as long as dwarf_addrdie and dwfl_module_addrdie work. (We
> might want to change the interface of dwarf_getdieranges so it can be
> "lazy" for example.)

Ok this approach seems like the most flexible.  Users can have both
.debug_aranges and CU-based aranges plus we don't have to change the
semantics of dwarf_getaranges.  I'll submit a revised patch for this.

Aaron

[1] https://reviews.llvm.org/D123538#inline-1188522
  
Frank Ch. Eigler Feb. 22, 2024, 3:35 p.m. UTC | #5
Hi -

> > We could then also introduce a new public function, dwarf_getdieranges
> > (?) that does the new thing. But it doesn't have to be public on the
> > first try as long as dwarf_addrdie and dwfl_module_addrdie work. (We
> > might want to change the interface of dwarf_getdieranges so it can be
> > "lazy" for example.)
> 
> Ok this approach seems like the most flexible.  Users can have both
> .debug_aranges and CU-based aranges plus we don't have to change the
> semantics of dwarf_getaranges.  [...]

Another option is to generate the .debug_aranges automatically, but
only if it is absent from the input dwarf file.  (An eu-readelf user
can tell if it was synthetic or not, if it matters.)

- FChE
  
Aaron Merey Feb. 22, 2024, 5:27 p.m. UTC | #6
Hi Frank,

On Thu, Feb 22, 2024 at 10:35 AM Frank Ch. Eigler <fche@redhat.com> wrote:
>
> > > We could then also introduce a new public function, dwarf_getdieranges
> > > (?) that does the new thing. But it doesn't have to be public on the
> > > first try as long as dwarf_addrdie and dwfl_module_addrdie work. (We
> > > might want to change the interface of dwarf_getdieranges so it can be
> > > "lazy" for example.)
> >
> > Ok this approach seems like the most flexible.  Users can have both
> > .debug_aranges and CU-based aranges plus we don't have to change the
> > semantics of dwarf_getaranges.  [...]
>
> Another option is to generate the .debug_aranges automatically, but
> only if it is absent from the input dwarf file.  (An eu-readelf user
> can tell if it was synthetic or not, if it matters.)

I'd like to be able to do this but .debug_aranges might be incomplete
even if the section is present in the input file.  This would require
us to parse CUs anyways in order to verify the section's completeness.

I'll defer to David Blaikie [1] again: "consumers shouldn't rely on the
presence of .debug_aranges, or them being complete (DWARF doesn't
require/guarantee this - and Clang doesn't emit aranges by default, for
instance). Each contribution to .debug_aranges says which CU it covers
and any CUs not covered should be parsed/assumed to contain things
covering other addresses."

Aaron

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=22288#c5
  

Patch

diff --git a/libdw/dwarf_getaranges.c b/libdw/dwarf_getaranges.c
index 27439d37..8676f93b 100644
--- a/libdw/dwarf_getaranges.c
+++ b/libdw/dwarf_getaranges.c
@@ -33,7 +33,6 @@ 
 #endif
 
 #include <stdlib.h>
-#include <assert.h>
 #include "libdwP.h"
 #include <dwarf.h>
 
@@ -68,174 +67,51 @@  dwarf_getaranges (Dwarf *dbg, Dwarf_Aranges **aranges, size_t *naranges)
       return 0;
     }
 
-  if (dbg->sectiondata[IDX_debug_aranges] == NULL)
-    {
-      /* No such section.  */
-      *aranges = NULL;
-      if (naranges != NULL)
-	*naranges = 0;
-      return 0;
-    }
-
-  if (dbg->sectiondata[IDX_debug_aranges]->d_buf == NULL)
-    return -1;
-
   struct arangelist *arangelist = NULL;
   unsigned int narangelist = 0;
 
-  const unsigned char *readp = dbg->sectiondata[IDX_debug_aranges]->d_buf;
-  const unsigned char *readendp
-    = readp + dbg->sectiondata[IDX_debug_aranges]->d_size;
-
-  while (readp < readendp)
+  Dwarf_CU *cu = NULL;
+  while (INTUSE(dwarf_get_units) (dbg, cu, &cu, NULL, NULL, NULL, NULL) == 0)
     {
-      const unsigned char *hdrstart = readp;
-
-      /* Each entry starts with a header:
-
-	 1. A 4-byte or 12-byte length containing the length of the
-	 set of entries for this compilation unit, not including the
-	 length field itself. [...]
-
-	 2. A 2-byte version identifier containing the value 2 for
-	 DWARF Version 2.1.
-
-	 3. A 4-byte or 8-byte offset into the .debug_info section. [...]
-
-	 4. A 1-byte unsigned integer containing the size in bytes of
-	 an address (or the offset portion of an address for segmented
-	 addressing) on the target system.
-
-	 5. A 1-byte unsigned integer containing the size in bytes of
-	 a segment descriptor on the target system.  */
-      if (unlikely (readp + 4 > readendp))
-	goto invalid;
-
-      Dwarf_Word length = read_4ubyte_unaligned_inc (dbg, readp);
-      unsigned int length_bytes = 4;
-      if (length == DWARF3_LENGTH_64_BIT)
-	{
-	  if (unlikely (readp + 8 > readendp))
-	    goto invalid;
-
-	  length = read_8ubyte_unaligned_inc (dbg, readp);
-	  length_bytes = 8;
-	}
-      else if (unlikely (length >= DWARF3_LENGTH_MIN_ESCAPE_CODE
-			 && length <= DWARF3_LENGTH_MAX_ESCAPE_CODE))
-	goto invalid;
-
-      const unsigned char *endp = readp + length;
-      if (unlikely (endp > readendp))
-	goto invalid;
-
-      if (unlikely (readp + 2 > readendp))
-	goto invalid;
-
-      unsigned int version = read_2ubyte_unaligned_inc (dbg, readp);
-      if (version != 2)
-	{
-	invalid:
-	  __libdw_seterrno (DWARF_E_INVALID_DWARF);
-	fail:
-	  while (arangelist != NULL)
-	    {
-	      struct arangelist *next = arangelist->next;
-	      free (arangelist);
-	      arangelist = next;
-	    }
-	  return -1;
-	}
-
-      Dwarf_Word offset = 0;
-      if (__libdw_read_offset_inc (dbg,
-				   IDX_debug_aranges, &readp,
-				   length_bytes, &offset, IDX_debug_info, 4))
-	goto fail;
-
-      /* Next up two bytes for address and segment size.  */
-      if (readp + 2 > readendp)
-	goto invalid;
-
-      unsigned int address_size = *readp++;
-      if (unlikely (address_size != 4 && address_size != 8))
-	goto invalid;
-
-      /* We don't actually support segment selectors.  */
-      unsigned int segment_size = *readp++;
-      if (segment_size != 0)
-	goto invalid;
-
-      /* Round the address to the next multiple of 2*address_size.  */
-      readp += ((2 * address_size - ((readp - hdrstart) % (2 * address_size)))
-		% (2 * address_size));
-
-      while (1)
-	{
-	  Dwarf_Word range_address;
-	  Dwarf_Word range_length;
-
-	  if (__libdw_read_address_inc (dbg, IDX_debug_aranges, &readp,
-					address_size, &range_address))
-	    goto fail;
-
-	  if (readp + address_size > readendp)
-	    goto invalid;
-
-	  if (address_size == 4)
-	    range_length = read_4ubyte_unaligned_inc (dbg, readp);
-	  else
-	    range_length = read_8ubyte_unaligned_inc (dbg, readp);
-
-	  /* Two zero values mark the end.  But in some cases (bugs)
-	     there might be such entries in the middle of the table.
-	     Ignore and continue, we'll check the actual length of
-	     the table to see if we are really at the end.  */
-	  if (range_address == 0 && range_length == 0)
-	    {
-	      if (readp >= endp)
-		break;
-	      else
-		continue;
-	    }
-
-	  /* We don't use alloca for these temporary structures because
-	     the total number of them can be quite large.  */
-	  struct arangelist *new_arange = malloc (sizeof *new_arange);
-	  if (unlikely (new_arange == NULL))
-	    {
-	      __libdw_seterrno (DWARF_E_NOMEM);
-	      goto fail;
-	    }
-
-	  new_arange->arange.addr = range_address;
-	  new_arange->arange.length = range_length;
-
-	  /* We store the actual CU DIE offset, not the CU header offset.  */
-	  Dwarf_CU *cu = __libdw_findcu (dbg, offset, false);
-	  if (unlikely (cu == NULL))
-	    {
-	      /* We haven't gotten a chance to link in the new_arange
-		 into the arangelist, don't leak it.  */
-	      free (new_arange);
-	      goto fail;
-	    }
-	  new_arange->arange.offset = __libdw_first_die_off_from_cu (cu);
-
-	  new_arange->next = arangelist;
-	  arangelist = new_arange;
-	  ++narangelist;
-
-	  /* Sanity-check the data.  */
-	  if (unlikely (new_arange->arange.offset
-			>= dbg->sectiondata[IDX_debug_info]->d_size))
-	    goto invalid;
-	}
+      Dwarf_Addr base;
+      Dwarf_Addr low;
+      Dwarf_Addr high;
+
+      Dwarf_Die cudie = CUDIE (cu);
+
+      /* Skip CUs that only contain type information.  */
+      if (!INTUSE(dwarf_hasattr) (&cudie, DW_AT_low_pc)
+	  && !INTUSE(dwarf_hasattr) (&cudie, DW_AT_ranges))
+	continue;
+
+      ptrdiff_t offset = 0;
+      while ((offset = INTUSE(dwarf_ranges) (&cudie, offset,
+					     &base, &low, &high)) > 0)
+        {
+          if (offset == -1)
+            goto fail;
+
+          struct arangelist *new_arange = malloc (sizeof *new_arange);
+          if (unlikely (new_arange == NULL))
+            {
+              __libdw_seterrno (DWARF_E_NOMEM);
+              goto fail;
+            }
+
+          new_arange->arange.addr = low;
+          new_arange->arange.length = (Dwarf_Word) (high - low);
+          new_arange->arange.offset = __libdw_first_die_off_from_cu (cu);
+
+          new_arange->next = arangelist;
+          arangelist = new_arange;
+          ++narangelist;
+        }
     }
 
   if (narangelist == 0)
     {
-      assert (arangelist == NULL);
+      if (arangelist != NULL)
+	goto fail;
       if (naranges != NULL)
 	*naranges = 0;
       *aranges = NULL;
@@ -250,7 +126,7 @@  dwarf_getaranges (Dwarf *dbg, Dwarf_Aranges **aranges, size_t *naranges)
   /* First use the buffer for the pointers, and sort the entries.
      We'll write the pointers in the end of the buffer, and then
      copy into the buffer from the beginning so the overlap works.  */
-  assert (sizeof (Dwarf_Arange) >= sizeof (Dwarf_Arange *));
+  eu_static_assert (sizeof (Dwarf_Arange) >= sizeof (Dwarf_Arange *));
   struct arangelist **sortaranges
     = (buf + sizeof (Dwarf_Aranges)
        + ((sizeof (Dwarf_Arange) - sizeof sortaranges[0]) * narangelist));
@@ -264,7 +140,11 @@  dwarf_getaranges (Dwarf *dbg, Dwarf_Aranges **aranges, size_t *naranges)
       sortaranges[i] = arangelist;
       arangelist = arangelist->next;
     }
-  assert (arangelist == NULL);
+
+  /* Something went wrong if narangelist is less then the actual length
+     of arangelist. */
+  if (arangelist != NULL)
+    goto fail;
 
   /* Sort by ascending address.  */
   qsort (sortaranges, narangelist, sizeof sortaranges[0], &compare_aranges);
@@ -286,5 +166,14 @@  dwarf_getaranges (Dwarf *dbg, Dwarf_Aranges **aranges, size_t *naranges)
     }
 
   return 0;
+
+fail:
+  while (arangelist != NULL)
+    {
+      struct arangelist *next = arangelist->next;
+      free (arangelist);
+      arangelist = next;
+    }
+  return -1;
 }
 INTDEF(dwarf_getaranges)
diff --git a/tests/run-get-aranges.sh b/tests/run-get-aranges.sh
index 62cae5f8..195a4c2e 100755
--- a/tests/run-get-aranges.sh
+++ b/tests/run-get-aranges.sh
@@ -65,4 +65,52 @@  CU name: "f.c"
 CU name: "m.c"
 EOF
 
+# Test whether aranges can be acquired even with no .debug_aranges
+objcopy --remove-section .debug_aranges testfile
+
+testrun_compare ${abs_builddir}/get-aranges testfile testfile2 <<\EOF
+0x804842b: not in range
+CU name: "m.c"
+CU name: "m.c"
+CU name: "m.c"
+0x804845a: not in range
+0x804845b: not in range
+CU name: "b.c"
+CU name: "b.c"
+CU name: "b.c"
+0x8048466: not in range
+0x8048467: not in range
+CU name: "f.c"
+CU name: "f.c"
+CU name: "f.c"
+0x8048472: not in range
+ [ 0] start: 0x804842c, length: 46, cu: 11
+CU name: "m.c"
+ [ 1] start: 0x804845c, length: 10, cu: 202
+CU name: "b.c"
+ [ 2] start: 0x8048468, length: 10, cu: 5628
+CU name: "f.c"
+0x804842b: not in range
+0x804842c: not in range
+0x804843c: not in range
+0x8048459: not in range
+0x804845a: not in range
+0x804845b: not in range
+0x804845c: not in range
+0x8048460: not in range
+0x8048465: not in range
+0x8048466: not in range
+0x8048467: not in range
+0x8048468: not in range
+0x8048470: not in range
+0x8048471: not in range
+0x8048472: not in range
+ [ 0] start: 0x10000470, length: 32, cu: 11
+CU name: "b.c"
+ [ 1] start: 0x10000490, length: 32, cu: 2429
+CU name: "f.c"
+ [ 2] start: 0x100004b0, length: 100, cu: 2532
+CU name: "m.c"
+EOF
+
 exit 0