Throw error when creating an overly large gdb-index file

Message ID 20230909025521.3128935-2-kevinb@redhat.com
State New
Headers
Series Throw error when creating an overly large gdb-index file |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Testing passed

Commit Message

Kevin Buettner Sept. 9, 2023, 2:55 a.m. UTC
  The header in a .gdb_index section uses 32-bit unsigned offsets to
refer to other areas of the section.  Thus, there is a size limit of
2^32-1 which is currently unaccounted for by GDB's code for outputting
these sections.

At the moment, when GDB creates an overly large section, it will exit
abnormally due to an internal error, which is caused by a failed
assert in assert_file_size, which in turn is called from
write_gdbindex_1, both of which are in gdb/dwarf2/index-write.c.

This is what happens when that assert fails:

$ gdb -q -nx -iex 'set auto-load no' -iex 'set debuginfod enabled off' -ex file ./libgraph_tool_inference.so -ex "save gdb-index `pwd`/"
Reading symbols from ./libgraph_tool_inference.so...
No executable file now.
Discard symbol table from `libgraph_tool_inference.so'? (y or n) n
Not confirmed.
../../gdb/dwarf2/index-write.c:1069: internal-error: assert_file_size: Assertion `file_size == expected_size' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
0x55fddb4d78b0 gdb_internal_backtrace_1
	../../gdb/bt-utils.c:122
0x55fddb4d78b0 _Z22gdb_internal_backtracev
	../../gdb/bt-utils.c:168
0x55fddb98b5d4 internal_vproblem
	../../gdb/utils.c:396
0x55fddb98b8de _Z15internal_verrorPKciS0_P13__va_list_tag
	../../gdb/utils.c:476
0x55fddbb71654 _Z18internal_error_locPKciS0_z
	../../gdbsupport/errors.cc:58
0x55fddb5a0f23 assert_file_size
	../../gdb/dwarf2/index-write.c:1069
0x55fddb5a1ee0 assert_file_size
	/usr/include/c++/13/bits/stl_iterator.h:1158
0x55fddb5a1ee0 write_gdbindex_1
	../../gdb/dwarf2/index-write.c:1119
0x55fddb5a51be write_gdbindex
	../../gdb/dwarf2/index-write.c:1273
[...]
---------------------
../../gdb/dwarf2/index-write.c:1069: internal-error: assert_file_size: Assertion `file_size == expected_size' failed.

This problem was encountered while building the python-graph-tool
package on Fedora.  The Fedora bugzilla bug can be found here:

https://bugzilla.redhat.com/show_bug.cgi?id=1773651

This commit prevents the internal error from occurring by calling error()
when the file size exceeds 2^32-1.

Using a gdb built with this commit, I now see this behavior instead:

$ gdb -q -nx -iex 'set auto-load no' -iex 'set debuginfod enabled off' -ex file ./libgraph_tool_inference.so -ex "save gdb-index `pwd`/"
Reading symbols from ./libgraph_tool_inference.so...
No executable file now.
Discard symbol table from `/mesquite2/fedora-bugs/1773651/libgraph_tool_inference.so'? (y or n) n
Not confirmed.
Error while writing index for `/mesquite2/fedora-bugs/1773651/libgraph_tool_inference.so': gdb-index maximum file size of 4294967295 exceeded
(gdb)

I wish I could provide a test case, but due to the sizes of both the
input and output files, I think that testing resources would be
strained or exceeded in many environments.

My testing on Fedora 38 shows no regressions.
---
 gdb/dwarf2/index-write.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
  

Comments

Kevin Buettner Sept. 9, 2023, 5:19 a.m. UTC | #1
On Fri,  8 Sep 2023 19:55:22 -0700
Kevin Buettner <kevinb@redhat.com> wrote:

> +    error ("gdb-index maximum file size of %zu exceeded", max_size);

I see that I forgot to add the i18n markup for the string.  I've fixed
it in my local sources.  That line has been changed to look like this:

    error (_("gdb-index maximum file size of %zu exceeded"), max_size);

Kevin
  
Tom de Vries Sept. 9, 2023, 8:20 a.m. UTC | #2
On 9/9/23 04:55, Kevin Buettner via Gdb-patches wrote:
> I wish I could provide a test case, but due to the sizes of both the 
> input and output files, I think that testing resources would be strained 
> or exceeded in many environments.

How about this unit test approach?  This fails on master, and could be 
updated to catch the error thrown by the patch.

Thanks,
- Tom
  
Tom Tromey Sept. 12, 2023, 4:07 p.m. UTC | #3
>>>>> "Kevin" == Kevin Buettner via Gdb-patches <gdb-patches@sourceware.org> writes:

Kevin> The header in a .gdb_index section uses 32-bit unsigned offsets to
Kevin> refer to other areas of the section.  Thus, there is a size limit of
Kevin> 2^32-1 which is currently unaccounted for by GDB's code for outputting
Kevin> these sections.
...
Kevin> This commit prevents the internal error from occurring by calling error()
Kevin> when the file size exceeds 2^32-1.

I don't have a problem with this but... why is the section too long?
I wonder if this is covering up some other bug, like maybe not enough
symbol de-duplication.

Tom
  
Kevin Buettner Sept. 13, 2023, 2:43 a.m. UTC | #4
On Tue, 12 Sep 2023 10:07:49 -0600
Tom Tromey <tom@tromey.com> wrote:

> >>>>> "Kevin" == Kevin Buettner via Gdb-patches <gdb-patches@sourceware.org> writes:  
> 
> Kevin> The header in a .gdb_index section uses 32-bit unsigned offsets to
> Kevin> refer to other areas of the section.  Thus, there is a size limit of
> Kevin> 2^32-1 which is currently unaccounted for by GDB's code for outputting
> Kevin> these sections.  
> ...
> Kevin> This commit prevents the internal error from occurring by calling error()
> Kevin> when the file size exceeds 2^32-1.  
> 
> I don't have a problem with this but... why is the section too long?
> I wonder if this is covering up some other bug, like maybe not enough
> symbol de-duplication.

The problematic shared object is named libgraph_tool_inference.so and
was created while building the python-graph-tool package on Fedora.

The shared object is 3 GiB in size and the gdb-index section for it is
4.3 GiB in size.

objdump -h shows that the largest section is .debug_str with a size
of 0x81fe181e, which is a little over 2 GiB.  The next largest is
.debug_info with a size of 0x19691516 or about 406.6 MiB.  Here is
some of the objdump -h output:

libgraph_tool_inference.so:     file format elf64-x86-64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
[...]
 28 .debug_aranges 00004d70  0000000000000000  0000000000000000  083dd180  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS
 29 .debug_info   19691516  0000000000000000  0000000000000000  083e1ef0  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS
 30 .debug_abbrev 0012e28b  0000000000000000  0000000000000000  21a73406  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS
 31 .debug_line   03b842f5  0000000000000000  0000000000000000  21ba1691  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS
 32 .debug_str    81fe181e  0000000000000000  0000000000000000  25725986  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS
 33 .debug_line_str 0000656b  0000000000000000  0000000000000000  a77071a4  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS
 34 .debug_loclists 090b9227  0000000000000000  0000000000000000  a770d70f  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS
 35 .debug_rnglists 01e6cd27  0000000000000000  0000000000000000  b07c6936  2**0
                  CONTENTS, READONLY, DEBUGGING, OCTETS

I extracted .debug_str using obj_copy, replaced the \0 characters
with newlines, and then used sort and uniq on that output.  There
were no duplicate lines.  Altogether, there are 1,212,620 strings
in .debug_str.

I can't say that I fully understand the layout of the constant pool,
but it appears that all of the strings from .debug_str will end in the
latter part of it.  So, if I'm right, that alone accounts for nearly
half the size of the index file.

Kevin
  
Tom Tromey Sept. 13, 2023, 2:24 p.m. UTC | #5
>>>>> "Kevin" == Kevin Buettner via Gdb-patches <gdb-patches@sourceware.org> writes:

Kevin> The shared object is 3 GiB in size and the gdb-index section for it is
Kevin> 4.3 GiB in size.

Thanks.

Kevin> I extracted .debug_str using obj_copy, replaced the \0 characters
Kevin> with newlines, and then used sort and uniq on that output.  There
Kevin> were no duplicate lines.  Altogether, there are 1,212,620 strings
Kevin> in .debug_str.

Kevin> I can't say that I fully understand the layout of the constant pool,
Kevin> but it appears that all of the strings from .debug_str will end in the
Kevin> latter part of it.  So, if I'm right, that alone accounts for nearly
Kevin> half the size of the index file.

Yeah, putting copies into the constant pool was a mistake; though I
think part of my rationale was that often the constant pool will hold
different versions -- they go through name canonicalization.

Anyway your patch looks good to me.  Thank you.

Tom
  
Kevin Buettner Sept. 15, 2023, 12:10 a.m. UTC | #6
On Wed, 13 Sep 2023 08:24:02 -0600
Tom Tromey <tom@tromey.com> wrote:

> Anyway your patch looks good to me.  Thank you.

I've pushed it.  Thanks for the review!

Kevin
  
Kevin Buettner Sept. 15, 2023, 12:22 a.m. UTC | #7
On Sat, 9 Sep 2023 10:20:19 +0200
Tom de Vries <tdevries@suse.de> wrote:

> On 9/9/23 04:55, Kevin Buettner via Gdb-patches wrote:
> > I wish I could provide a test case, but due to the sizes of both the 
> > input and output files, I think that testing resources would be strained 
> > or exceeded in many environments.  
> 
> How about this unit test approach?  This fails on master, and could be 
> updated to catch the error thrown by the patch.

I like it.  I made several tweaks resulting in the version below.
Does it look okay to you?

From 1999dc14bf8c498c3f0db83eeb8f7e6d7376e8ff Mon Sep 17 00:00:00 2001
From: Tom de Vries <tdevries@suse.de>
Date: Sat, 9 Sep 2023 10:15:01 +0200
Subject: [PATCH] Add unit test for overly large gdb-index file

This commit adds a test which checks that write_gdb_index_1 will throw
an error when the size of the file would exceed the maximum value
capable of being represented by 'offset_type'.

Co-Authored-By: Kevin Buettner <kevinb@redhat.com>

diff --git a/gdb/dwarf2/index-write.c b/gdb/dwarf2/index-write.c
index 3827a810130..fd9605700c8 100644
--- a/gdb/dwarf2/index-write.c
+++ b/gdb/dwarf2/index-write.c
@@ -137,7 +137,7 @@ class data_buf
   }
 
   /* Return the size of the buffer.  */
-  size_t size () const
+  virtual size_t size () const
   {
     return m_vec.size ();
   }
@@ -1117,6 +1117,9 @@ write_gdbindex_1 (FILE *out_file,
   if (total_len > max_size)
     error (_("gdb-index maximum file size of %zu exceeded"), max_size);
 
+  if (out_file == nullptr)
+    return;
+
   contents.file_write (out_file);
   cu_list.file_write (out_file);
   types_cu_list.file_write (out_file);
@@ -1537,10 +1540,70 @@ save_gdb_index_command (const char *arg, int from_tty)
     }
 }
 
+#if GDB_SELF_TEST
+#include "gdbsupport/selftest.h"
+
+namespace selftests {
+
+class pretend_data_buf : public data_buf
+{
+public:
+  /* Set the pretend size.  */
+  void set_pretend_size (size_t s) {
+    m_pretend_size = s;
+  }
+
+  /* Override size method of data_buf, returning the pretend size instead.  */
+  size_t size () const override {
+    return m_pretend_size;
+  }
+
+private:
+  size_t m_pretend_size;
+};
+
+static void
+gdb_index ()
+{
+  pretend_data_buf cu_list;
+  pretend_data_buf types_cu_list;
+  pretend_data_buf addr_vec;
+  pretend_data_buf symtab_vec;
+  pretend_data_buf constant_pool;
+
+  /* Test that an overly large index will throw an error.  */
+  symtab_vec.set_pretend_size (~(offset_type)0);
+  constant_pool.set_pretend_size (1);
+
+  bool saw_exception = false;
+  try
+    {
+      write_gdbindex_1 (nullptr, cu_list, types_cu_list, addr_vec,
+                        symtab_vec, constant_pool);
+    }
+  catch (const gdb_exception_error &e)
+    {
+      SELF_CHECK (e.reason == RETURN_ERROR);
+      SELF_CHECK (e.error == GENERIC_ERROR);
+      SELF_CHECK (e.message->find (_("gdb-index maximum file size of")) 
+                    != std::string::npos);
+      SELF_CHECK (e.message->find (_("exceeded")) != std::string::npos);
+      saw_exception = true;
+    }
+  SELF_CHECK (saw_exception);
+}
+
+} /* selftests namespace.  */
+#endif
+
 void _initialize_dwarf_index_write ();
 void
 _initialize_dwarf_index_write ()
 {
+#if GDB_SELF_TEST
+  selftests::register_test ("gdb_index", selftests::gdb_index);
+#endif
+
   cmd_list_element *c = add_cmd ("gdb-index", class_files,
 				 save_gdb_index_command, _("\
 Save a gdb-index file.\n\
  
Tom de Vries Sept. 15, 2023, 9:09 a.m. UTC | #8
On 9/15/23 02:22, Kevin Buettner wrote:
> On Sat, 9 Sep 2023 10:20:19 +0200
> Tom de Vries <tdevries@suse.de> wrote:
> 
>> On 9/9/23 04:55, Kevin Buettner via Gdb-patches wrote:
>>> I wish I could provide a test case, but due to the sizes of both the
>>> input and output files, I think that testing resources would be strained
>>> or exceeded in many environments.
>>
>> How about this unit test approach?  This fails on master, and could be
>> updated to catch the error thrown by the patch.
> 
> I like it.  I made several tweaks resulting in the version below.
> Does it look okay to you?
> 

I see that the virtual / override bits were indeed necessary, thanks for 
catching that.

The m_pretend_size was missing zero-initialization, so I've added that.

Also I've added a test that checks that using the maximum size does not 
generate an error.

I've also build on a 32-bit system, and noticed that the unit test fails 
because the error doesn't trigger.

I've proposed a fix that includes the unit test here ( 
https://sourceware.org/pipermail/gdb-patches/2023-September/202493.html ).

Thanks,
- Tom

>  From 1999dc14bf8c498c3f0db83eeb8f7e6d7376e8ff Mon Sep 17 00:00:00 2001
> From: Tom de Vries <tdevries@suse.de>
> Date: Sat, 9 Sep 2023 10:15:01 +0200
> Subject: [PATCH] Add unit test for overly large gdb-index file
> 
> This commit adds a test which checks that write_gdb_index_1 will throw
> an error when the size of the file would exceed the maximum value
> capable of being represented by 'offset_type'.
> 
> Co-Authored-By: Kevin Buettner <kevinb@redhat.com>
> 
> diff --git a/gdb/dwarf2/index-write.c b/gdb/dwarf2/index-write.c
> index 3827a810130..fd9605700c8 100644
> --- a/gdb/dwarf2/index-write.c
> +++ b/gdb/dwarf2/index-write.c
> @@ -137,7 +137,7 @@ class data_buf
>     }
>   
>     /* Return the size of the buffer.  */
> -  size_t size () const
> +  virtual size_t size () const
>     {
>       return m_vec.size ();
>     }
> @@ -1117,6 +1117,9 @@ write_gdbindex_1 (FILE *out_file,
>     if (total_len > max_size)
>       error (_("gdb-index maximum file size of %zu exceeded"), max_size);
>   
> +  if (out_file == nullptr)
> +    return;
> +
>     contents.file_write (out_file);
>     cu_list.file_write (out_file);
>     types_cu_list.file_write (out_file);
> @@ -1537,10 +1540,70 @@ save_gdb_index_command (const char *arg, int from_tty)
>       }
>   }
>   
> +#if GDB_SELF_TEST
> +#include "gdbsupport/selftest.h"
> +
> +namespace selftests {
> +
> +class pretend_data_buf : public data_buf
> +{
> +public:
> +  /* Set the pretend size.  */
> +  void set_pretend_size (size_t s) {
> +    m_pretend_size = s;
> +  }
> +
> +  /* Override size method of data_buf, returning the pretend size instead.  */
> +  size_t size () const override {
> +    return m_pretend_size;
> +  }
> +
> +private:
> +  size_t m_pretend_size;
> +};
> +
> +static void
> +gdb_index ()
> +{
> +  pretend_data_buf cu_list;
> +  pretend_data_buf types_cu_list;
> +  pretend_data_buf addr_vec;
> +  pretend_data_buf symtab_vec;
> +  pretend_data_buf constant_pool;
> +
> +  /* Test that an overly large index will throw an error.  */
> +  symtab_vec.set_pretend_size (~(offset_type)0);
> +  constant_pool.set_pretend_size (1);
> +
> +  bool saw_exception = false;
> +  try
> +    {
> +      write_gdbindex_1 (nullptr, cu_list, types_cu_list, addr_vec,
> +                        symtab_vec, constant_pool);
> +    }
> +  catch (const gdb_exception_error &e)
> +    {
> +      SELF_CHECK (e.reason == RETURN_ERROR);
> +      SELF_CHECK (e.error == GENERIC_ERROR);
> +      SELF_CHECK (e.message->find (_("gdb-index maximum file size of"))
> +                    != std::string::npos);
> +      SELF_CHECK (e.message->find (_("exceeded")) != std::string::npos);
> +      saw_exception = true;
> +    }
> +  SELF_CHECK (saw_exception);
> +}
> +
> +} /* selftests namespace.  */
> +#endif
> +
>   void _initialize_dwarf_index_write ();
>   void
>   _initialize_dwarf_index_write ()
>   {
> +#if GDB_SELF_TEST
> +  selftests::register_test ("gdb_index", selftests::gdb_index);
> +#endif
> +
>     cmd_list_element *c = add_cmd ("gdb-index", class_files,
>   				 save_gdb_index_command, _("\
>   Save a gdb-index file.\n\
>
  

Patch

diff --git a/gdb/dwarf2/index-write.c b/gdb/dwarf2/index-write.c
index 11f254e263a..6e9bf148d49 100644
--- a/gdb/dwarf2/index-write.c
+++ b/gdb/dwarf2/index-write.c
@@ -1083,7 +1083,7 @@  write_gdbindex_1 (FILE *out_file,
 {
   data_buf contents;
   const offset_type size_of_header = 6 * sizeof (offset_type);
-  offset_type total_len = size_of_header;
+  size_t total_len = size_of_header;
 
   /* The version number.  */
   contents.append_offset (8);
@@ -1110,6 +1110,13 @@  write_gdbindex_1 (FILE *out_file,
 
   gdb_assert (contents.size () == size_of_header);
 
+  /* The maximum size of an index file is limited by the maximum value
+     capable of being represented by 'offset_type'.  Throw an error if
+     that length has been exceeded.  */
+  size_t max_size = ~(offset_type) 0;
+  if (total_len > max_size)
+    error ("gdb-index maximum file size of %zu exceeded", max_size);
+
   contents.file_write (out_file);
   cu_list.file_write (out_file);
   types_cu_list.file_write (out_file);