Fix PR 21337 v2: segfault when re-reading symbols with remote debugging.

Message ID d266f2a4-5482-6178-f4fd-9280e27757b4@imgtec.com
State New, archived
Headers

Commit Message

Doug Gilmore April 28, 2017, 11:44 p.m. UTC
  > ...
> Hi Doug,
> 
> I've ran this in my head many times, but I still don't understand which field exactly is stale and causes the segfault.
> 
> According to the backtrace, the line of the crash is:
> 
>   if (pc < obj_section_addr (section))
> 
> That macro expands to
> 
> #define obj_section_addr(s)                              \
>   (bfd_get_section_vma ((s)->objfile->obfd, s->the_bfd_section)        \
>    + obj_section_offset (s))
> 
> which further expands to
> 
> #define obj_section_offset(s)                        \
>   (((s)->objfile->section_offsets)->offsets[gdb_bfd_section_index ((s)->objfile->obfd, (s)->the_bfd_section)])
> 
> 
> Could you point out which dereferencing operator/memory access causes the crash?  At first I thought it would be ->section_offsets, but that field is set properly before calling read_symbols:
> 
>       /* We use the same section offsets as from last time.  I'm not
>          sure whether that is always correct for shared libraries.  */
>       objfile->section_offsets = (struct section_offsets *)
>         obstack_alloc (&objfile->objfile_obstack,
>                SIZEOF_N_SECTION_OFFSETS (num_offsets));
>       memcpy (objfile->section_offsets, offsets,
>           SIZEOF_N_SECTION_OFFSETS (num_offsets));
>       objfile->num_sections = num_offsets;
> 
> so it should not be the culprit...  The s variable itself should point to an instance of obj_section, contained in the objfile_pspace_info::sections array.  This one is allocated with XNEWVEC, so it shouldn't be affected by the fact that we clear the obstack.
> 
> The objfile object itself doesn't change address, so the pointers to it should still be valid...
> 
> I find the obj_section_addr and obj_section_offset very bad for readability and debuggability.  Could you change them for inline functions that are not one liners?  Then it will be obvious in the backtrace which operation causes the crash.
> 
> Thanks,
> 
> Simon
Hi Simon,

After thinking about it my comment and code placement wasn't
particularly good.  Something along the line's of Luis's change
is better.

Does Luis's comment address the question you have?

If so, Luis: Should is it OK we incorporate your changes in the patch?

I attached a diff for the change.

Thanks,

Doug
  

Comments

Luis Machado April 29, 2017, 1:13 a.m. UTC | #1
On 04/28/2017 06:44 PM, Doug Gilmore wrote:
>> ...
>>       /* We use the same section offsets as from last time.  I'm not
>>          sure whether that is always correct for shared libraries.  */
>>       objfile->section_offsets = (struct section_offsets *)
>>         obstack_alloc (&objfile->objfile_obstack,
>>                SIZEOF_N_SECTION_OFFSETS (num_offsets));
>>       memcpy (objfile->section_offsets, offsets,
>>           SIZEOF_N_SECTION_OFFSETS (num_offsets));
>>       objfile->num_sections = num_offsets;
>>
>> so it should not be the culprit...  The s variable itself should point to an instance of obj_section, contained in the objfile_pspace_info::sections array.  This one is allocated with XNEWVEC, so it shouldn't be affected by the fact that we clear the obstack.
>>
>> The objfile object itself doesn't change address, so the pointers to it should still be valid...
>>
>> I find the obj_section_addr and obj_section_offset very bad for readability and debuggability.  Could you change them for inline functions that are not one liners?  Then it will be obvious in the backtrace which operation causes the crash.
>>
>> Thanks,
>>
>> Simon
> Hi Simon,
>
> After thinking about it my comment and code placement wasn't
> particularly good.  Something along the line's of Luis's change
> is better.
>
> Does Luis's comment address the question you have?
>
> If so, Luis: Should is it OK we incorporate your changes in the patch?

Sure. I have no problems with that.
  
Simon Marchi April 29, 2017, 1:41 a.m. UTC | #2
On 2017-04-28 19:44, Doug Gilmore wrote:
> Hi Simon,
> 
> After thinking about it my comment and code placement wasn't
> particularly good.  Something along the line's of Luis's change
> is better.
> 
> Does Luis's comment address the question you have?
> 
> If so, Luis: Should is it OK we incorporate your changes in the patch?
> 
> I attached a diff for the change.
> 
> Thanks,
> 
> Doug

Hi Doug,

The comment certainly helps, but in the commit log I'd like to see a 
more detailed list of events that leads to the crash.

Now that I look into it again, I think I understand.  The 
objfile_pspace_info::sections array/vector is a list of obj_section 
pointers (in C++ we'd probably use an std::vector<obj_section*>).  That 
list contains pointers to all the sections from all the objfiles sorted 
in order of increasing address.  They point directly to the sections 
allocated by the objfile in their obstacks (and accessible through 
objfile::sections).  So when the obstack is freed in reread_symbols, the 
sorted list contains stale pointers.  Is that it?

If that's what's happening, then I'm more convinced the fix is right.  
Is this behaviour caught by a test?  If not, could you write one?

> 
> diff --git a/gdb/symfile.c b/gdb/symfile.c
> index 846aabe..aff4341 100644
> --- a/gdb/symfile.c
> +++ b/gdb/symfile.c
> @@ -2628,6 +2628,20 @@ reread_symbols (void)
>  	  clear_complaints (&symfile_complaints, 1, 1);
> 
>  	  objfile->flags &= ~OBJF_PSYMTABS_READ;
> +
> +	  /* We are about to read new symbols and potentially also DWARF
> +	     information.  Some targets may want to pass addresses read from
> +	     DWARF DIE's through an adjustment function before saving them, 
> like
> +	     MIPS, which may call into "find_pc_section".  When called, that
> +	     function will make use of per-objfile program space data.

If you are talking about objfile_pspace_info, it's not per-objfile.  
There's one instance of it per program space, so it's actually 
"objfiles-related per-program-space data".  It contains the sorted list 
of all sections of all objfiles loaded in the pspace.

> +
> +	     Since we discarded our section information above, we have 
> dangling
> +	     pointers in the per-objfile program space data structure.  Force

again

> +	     GDB to update the section mapping information by letting it know
> +	     the objfile has changed, making the dangling pointers point to
> +	     correct data again.  */
> +	  objfiles_changed ();
> +
>  	  read_symbols (objfile, 0);
> 
>  	  if (!objfile_has_symbols (objfile))
> @@ -2660,9 +2674,6 @@ reread_symbols (void)
> 
>    if (!new_objfiles.empty ())
>      {
> -      /* Notify objfiles that we've modified objfile sections.  */
> -      objfiles_changed ();
> -
>        clear_symtab_users (0);
> 
>        /* clear_objfile_data for each objfile was called before freeing 
> it and

Thanks,

Simon
  
Doug Gilmore April 29, 2017, 5:12 p.m. UTC | #3
On 04/28/17 18:41, Simon Marchi wrote:
> On 2017-04-28 19:44, Doug Gilmore wrote:
>> Hi Simon,
>>
>> After thinking about it my comment and code placement wasn't
>> particularly good.  Something along the line's of Luis's change
>> is better.
>>
>> Does Luis's comment address the question you have?
>>
>> If so, Luis: Should is it OK we incorporate your changes in the patch?
>>
>> I attached a diff for the change.
>>
>> Thanks,
>>
>> Doug
> 
> Hi Doug,
> 
> The comment certainly helps, but in the commit log I'd like to see a
> more detailed list of events that leads to the crash.
> 
> Now that I look into it again, I think I understand.  The
> objfile_pspace_info::sections array/vector is a list of obj_section
> pointers (in C++ we'd probably use an std::vector<obj_section*>).
> That list contains pointers to all the sections from all the
> objfiles sorted in order of increasing address.  They point directly
> to the sections allocated by the objfile in their obstacks (and
> accessible through objfile::sections).  So when the obstack is freed
> in reread_symbols, the sorted list contains stale pointers.  Is that
> it?
Right.
> 
> If that's what's happening, then I'm more convinced the fix is
> right.  Is this behaviour caught by a test?  If not, could you write
> one?
> ...
I'll need to take a look.  Last time I tried I it was more difficult
to expose the problem on the native build of GDB.

Thanks,

Doug
  
Simon Marchi April 29, 2017, 11:26 p.m. UTC | #4
On 2017-04-29 13:12, Doug Gilmore wrote:
> On 04/28/17 18:41, Simon Marchi wrote:
>> On 2017-04-28 19:44, Doug Gilmore wrote:
>>> Hi Simon,
>>> 
>>> After thinking about it my comment and code placement wasn't
>>> particularly good.  Something along the line's of Luis's change
>>> is better.
>>> 
>>> Does Luis's comment address the question you have?
>>> 
>>> If so, Luis: Should is it OK we incorporate your changes in the 
>>> patch?
>>> 
>>> I attached a diff for the change.
>>> 
>>> Thanks,
>>> 
>>> Doug
>> 
>> Hi Doug,
>> 
>> The comment certainly helps, but in the commit log I'd like to see a
>> more detailed list of events that leads to the crash.
>> 
>> Now that I look into it again, I think I understand.  The
>> objfile_pspace_info::sections array/vector is a list of obj_section
>> pointers (in C++ we'd probably use an std::vector<obj_section*>).
>> That list contains pointers to all the sections from all the
>> objfiles sorted in order of increasing address.  They point directly
>> to the sections allocated by the objfile in their obstacks (and
>> accessible through objfile::sections).  So when the obstack is freed
>> in reread_symbols, the sorted list contains stale pointers.  Is that
>> it?
> Right.
>> 
>> If that's what's happening, then I'm more convinced the fix is
>> right.  Is this behaviour caught by a test?  If not, could you write
>> one?
>> ...
> I'll need to take a look.  Last time I tried I it was more difficult
> to expose the problem on the native build of GDB.

reread_symbols is called when using the run (run_command_1), attach 
(attach_post_wait which then calls setup_inferior) and load 
(load_command) commands.  So maybe something like this would reproduce 
it?

- compile test program
- launch gdb with test program
- touch test program
- run

Simon
  
Doug Gilmore April 30, 2017, 5:14 a.m. UTC | #5
>>> ...
>> I'll need to take a look.  Last time I tried I it was more difficult
>> to expose the problem on the native build of GDB.
> 
> reread_symbols is called when using the run (run_command_1), attach (attach_post_wait which then calls setup_inferior) and load (load_command) commands.  So maybe something like this would reproduce it?
> 
> - compile test program
> - launch gdb with test program
> - touch test program
> - run
> 
> Simon
> 
Hi Simon,

What I meant was that when I previously did tests with the native
build, for some reason the freed data was not being overwritten, or
possible written with the same data, at the time read_symbols was
called in reread_symbols.  Thus problem wasn't exposed before the
objfiles_changed is eventually called in reread_symbols.

I just did a test with a native build of gdb (7.9.1) and the problem
was exposed, so chances are it will be also be exposed with a ToT
build.

Doug
  

Patch

diff --git a/gdb/symfile.c b/gdb/symfile.c
index 846aabe..aff4341 100644
--- a/gdb/symfile.c
+++ b/gdb/symfile.c
@@ -2628,6 +2628,20 @@  reread_symbols (void)
 	  clear_complaints (&symfile_complaints, 1, 1);
 
 	  objfile->flags &= ~OBJF_PSYMTABS_READ;
+
+	  /* We are about to read new symbols and potentially also DWARF
+	     information.  Some targets may want to pass addresses read from
+	     DWARF DIE's through an adjustment function before saving them, like
+	     MIPS, which may call into "find_pc_section".  When called, that
+	     function will make use of per-objfile program space data.
+
+	     Since we discarded our section information above, we have dangling
+	     pointers in the per-objfile program space data structure.  Force
+	     GDB to update the section mapping information by letting it know
+	     the objfile has changed, making the dangling pointers point to
+	     correct data again.  */
+	  objfiles_changed ();
+
 	  read_symbols (objfile, 0);
 
 	  if (!objfile_has_symbols (objfile))
@@ -2660,9 +2674,6 @@  reread_symbols (void)
 
   if (!new_objfiles.empty ())
     {
-      /* Notify objfiles that we've modified objfile sections.  */
-      objfiles_changed ();
-
       clear_symtab_users (0);
 
       /* clear_objfile_data for each objfile was called before freeing it and