[dwarf2read] Fix crash when loading dwp files: calculate num_sections based on actual section indices, not just the number of sections.

Message ID 20190223005323.188851-1-rupprecht@google.com
State New, archived
Headers

Commit Message

Terekhov, Mikhail via Gdb-patches Feb. 23, 2019, 12:53 a.m. UTC
  When loading dwp files, we create an array of elf sections indexed by the section index in the dwp file. The size of this array is calculated by section_count + 1 (the +1 handling the null section). However, when loading the bfd file, strtab/symtab sections are not added to the list, nor do they increment section_count, so section_count is actually smaller than the number of sections.

This happens to work when using GNU dwp, which lays out .debug section first, with sections like .shstrtab coming at the end. Other tools, like llvm-dwp, put .strtab first, and gdb crashes when loading those dwp files.

For instance, with the current state of gdb, loading a file like this:
$ readelf -SW <file.dwp>
[ 0] <empty>
[ 1] .debug_foo PROGBITS ...
[ 2] .strtab    STRTAB ...

... results in section_count = 2 (.debug is the only thing placed into bfd->sections, so section_count + 1 == 2), and sectp->this_idx = 1 when mapping over .debug_foo in dwarf2_locate_common_dwp_sections, which passes the assertion that 1 < 2.

However, using a dwp file produced by llvm-dwp:
$ readelf -SW <file.dwp>
[ 0] <empty>
[ 1] .strtab    STRTAB ...
[ 2] .debug_foo PROGBITS ...
... results in section_count = 2 (.debug is the only thing placed into bfd->sections, so section_count + 1 == 2), and sectp->this_idx = 2 when mapping over .debug_foo in dwarf2_locate_common_dwp_sections, which fails the assertion that 2 < 2.

This patch changes the calculation of section_count to look at the actual highest this_idx value we see instead of assuming that all strtab/symtab sections come at the end.
---
 gdb/ChangeLog    |  6 ++++++
 gdb/dwarf2read.c | 17 +++++++++++++++--
 2 files changed, 21 insertions(+), 2 deletions(-)
  

Comments

Simon Marchi Feb. 24, 2019, 4:01 a.m. UTC | #1
On 2019-02-22 19:53, Jordan Rupprecht via gdb-patches wrote:
> When loading dwp files, we create an array of elf sections indexed by
> the section index in the dwp file. The size of this array is
> calculated by section_count + 1 (the +1 handling the null section).
> However, when loading the bfd file, strtab/symtab sections are not
> added to the list, nor do they increment section_count, so
> section_count is actually smaller than the number of sections.

Just wondering, is this the expected behavior of BFD, to not make the 
strtab section count as a section (as far as bfd_count_sections is 
concerned)?  If so, why?

Otherwise can we just elf_numsections instead of bfd_count_sections?  
Since we index the array by ELF section index, using the number of ELF 
sections seems appropriate, it should always match.  We wouldn't need 
the +1 then.

> This happens to work when using GNU dwp, which lays out .debug section
> first, with sections like .shstrtab coming at the end. Other tools,
> like llvm-dwp, put .strtab first, and gdb crashes when loading those
> dwp files.
> 
> For instance, with the current state of gdb, loading a file like this:
> $ readelf -SW <file.dwp>
> [ 0] <empty>
> [ 1] .debug_foo PROGBITS ...
> [ 2] .strtab    STRTAB ...
> 
> ... results in section_count = 2 (.debug is the only thing placed into
> bfd->sections, so section_count + 1 == 2), and sectp->this_idx = 1
> when mapping over .debug_foo in dwarf2_locate_common_dwp_sections,
> which passes the assertion that 1 < 2.
> 
> However, using a dwp file produced by llvm-dwp:
> $ readelf -SW <file.dwp>
> [ 0] <empty>
> [ 1] .strtab    STRTAB ...
> [ 2] .debug_foo PROGBITS ...
> ... results in section_count = 2 (.debug is the only thing placed into
> bfd->sections, so section_count + 1 == 2), and sectp->this_idx = 2
> when mapping over .debug_foo in dwarf2_locate_common_dwp_sections,
> which fails the assertion that 2 < 2.
> 
> This patch changes the calculation of section_count to look at the
> actual highest this_idx value we see instead of assuming that all
> strtab/symtab sections come at the end.

Thanks for the detailed explanation.

Here are just some formatting comments.

> ---
>  gdb/ChangeLog    |  6 ++++++
>  gdb/dwarf2read.c | 17 +++++++++++++++--
>  2 files changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/gdb/ChangeLog b/gdb/ChangeLog
> index 6a551dfa64..c789b34890 100644
> --- a/gdb/ChangeLog
> +++ b/gdb/ChangeLog
> @@ -1,3 +1,9 @@
> +2019-02-22  Jordan Rupprecht  <rupprecht@google.com>
> +
> +	* dwarf2read.c (get_largest_section_index): new function

Capital letter and period at the end ("New function.").

> +	(open_and_init_dwp_file): Call get_largest_section_index to
> +	calculate dwp_file->num_sections.
> +
>  2019-02-22  Simon Marchi  <simon.marchi@polymtl.ca>
> 
>  	* MAINTAINERS: Update my email address.
> diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
> index 98f46e0416..07a9d4ea5d 100644
> --- a/gdb/dwarf2read.c
> +++ b/gdb/dwarf2read.c
> @@ -13182,6 +13182,18 @@ open_dwp_file (struct dwarf2_per_objfile
> *dwarf2_per_objfile,
>    return NULL;
>  }
> 
> +/* Return the largest section index */

Period at the end, followed by two spaces:

/* Return the largest section index.  */

> +static int
> +get_largest_section_index(bfd *bfd)

Space before the parentheses:

get_largest_section_index (bfd *bfd)

> +{
> +  int max_sec_idx = 0;
> +  asection *sectp;

You can declare sectp in the for loop header.

> +  for (sectp = bfd->sections; sectp != NULL; sectp = sectp->next)
> +    max_sec_idx = std::max (max_sec_idx, elf_section_data 
> (sectp)->this_idx);
> +
> +  return max_sec_idx;
> +}
> +
>  /* Initialize the use of the DWP file for the current objfile.
>     By convention the name of the DWP file is ${objfile}.dwp.
>     The result is NULL if it can't be found.  */
> @@ -13230,8 +13242,9 @@ open_and_init_dwp_file (struct
> dwarf2_per_objfile *dwarf2_per_objfile)
>    std::unique_ptr<struct dwp_file> dwp_file
>      (new struct dwp_file (name, std::move (dbfd)));
> 
> -  /* +1: section 0 is unused */
> -  dwp_file->num_sections = bfd_count_sections (dwp_file->dbfd) + 1;
> +  /* +1: so that dwp_file->elf_sections[max_idx] is valid */

Period and two spaces at the end.

> +  dwp_file->num_sections =
> +    get_largest_section_index (dwp_file->dbfd.get()) + 1;

Normally, the = should be on the second line when breaking assignments 
(despite the counter example just below).  But in this case, it can fit 
on a single line I believe.

>    dwp_file->elf_sections =
>      OBSTACK_CALLOC (&objfile->objfile_obstack,
>  		    dwp_file->num_sections, asection *);

Thanks,

Simon
  
Terekhov, Mikhail via Gdb-patches Feb. 25, 2019, 8:17 p.m. UTC | #2
On Sat, Feb 23, 2019 at 8:01 PM Simon Marchi <simon.marchi@polymtl.ca> wrote:
>
> On 2019-02-22 19:53, Jordan Rupprecht via gdb-patches wrote:
> > When loading dwp files, we create an array of elf sections indexed by
> > the section index in the dwp file. The size of this array is
> > calculated by section_count + 1 (the +1 handling the null section).
> > However, when loading the bfd file, strtab/symtab sections are not
> > added to the list, nor do they increment section_count, so
> > section_count is actually smaller than the number of sections.
>
> Just wondering, is this the expected behavior of BFD, to not make the
> strtab section count as a section (as far as bfd_count_sections is
> concerned)?  If so, why?
I'm not very familiar with bfd, so I don't know if it's expected. It
seems that bfd->sections contains "interesting" sections (e.g.
progbits), and treats sections like strtab/symtab as a kind of
metadata section that get stored differently. The method I'm stepping
through is bfd_section_from_shdr here:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/elf.c;h=f16acaa08d8e24af9af069efcdcb244b2c19c734;hb=HEAD#l1994.
Running that method over a normal .dwp file, all the .debug_* methods
get added to sections, but .strtab/.symtab/.shstrtab don't.

So it *seems* like that's expected behavior, but I can't say for sure
that it is.

>
> Otherwise can we just elf_numsections instead of bfd_count_sections?
> Since we index the array by ELF section index, using the number of ELF
> sections seems appropriate, it should always match.  We wouldn't need
> the +1 then.

Don't know how I didn't see that. Now I don't need
get_largest_section_index() anymore. Thanks!
All your comments are basically not relevant any more :)
  
Simon Marchi Feb. 25, 2019, 8:20 p.m. UTC | #3
On 2019-02-25 15:17, Jordan Rupprecht via gdb-patches wrote:
> Don't know how I didn't see that. Now I don't need
> get_largest_section_index() anymore. Thanks!
> All your comments are basically not relevant any more :)

Well, it's useful to learn for next time :).  Are you going to send an 
updated version?

Thanks,

Simon
  

Patch

diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index 6a551dfa64..c789b34890 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,3 +1,9 @@ 
+2019-02-22  Jordan Rupprecht  <rupprecht@google.com>
+
+	* dwarf2read.c (get_largest_section_index): new function
+	(open_and_init_dwp_file): Call get_largest_section_index to
+	calculate dwp_file->num_sections.
+
 2019-02-22  Simon Marchi  <simon.marchi@polymtl.ca>
 
 	* MAINTAINERS: Update my email address.
diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
index 98f46e0416..07a9d4ea5d 100644
--- a/gdb/dwarf2read.c
+++ b/gdb/dwarf2read.c
@@ -13182,6 +13182,18 @@  open_dwp_file (struct dwarf2_per_objfile *dwarf2_per_objfile,
   return NULL;
 }
 
+/* Return the largest section index */
+static int
+get_largest_section_index(bfd *bfd)
+{
+  int max_sec_idx = 0;
+  asection *sectp;
+  for (sectp = bfd->sections; sectp != NULL; sectp = sectp->next)
+    max_sec_idx = std::max (max_sec_idx, elf_section_data (sectp)->this_idx);
+
+  return max_sec_idx;
+}
+
 /* Initialize the use of the DWP file for the current objfile.
    By convention the name of the DWP file is ${objfile}.dwp.
    The result is NULL if it can't be found.  */
@@ -13230,8 +13242,9 @@  open_and_init_dwp_file (struct dwarf2_per_objfile *dwarf2_per_objfile)
   std::unique_ptr<struct dwp_file> dwp_file
     (new struct dwp_file (name, std::move (dbfd)));
 
-  /* +1: section 0 is unused */
-  dwp_file->num_sections = bfd_count_sections (dwp_file->dbfd) + 1;
+  /* +1: so that dwp_file->elf_sections[max_idx] is valid */
+  dwp_file->num_sections =
+    get_largest_section_index (dwp_file->dbfd.get()) + 1;
   dwp_file->elf_sections =
     OBSTACK_CALLOC (&objfile->objfile_obstack,
 		    dwp_file->num_sections, asection *);