ping: [patchv2] Fix 100x slowdown regression on DWZ files

  On Thu, 02 Oct 2014 17:56:53 +0200, Jan Kratochvil wrote:
------------------------------------------------------------------------------

On Thu, 02 Oct 2014 01:51:38 +0200, Doug Evans wrote:
> I tested this patch with --target_board=dwarf4-gdb-index
> and got a failure in m-static.exp:

That is particularly with -fdebug-types-section.

> Type units read the line table in a separate path,

OK, therefore I dropped that separate struct dwarf2_lineinfo
and reused struct line_header instead.

> OTOH, I do want to avoid any confusion that this patch may inadvertently
> introduce. For example, IIUC with your patch as is,
> if we read a partial_unit first, before a compile_unit
> that has the same stmt_list value, we'll do more processing in
> dwarf_decode_lines than we really need to since we only need a file
> number to symtab mapping. And if we later read in a compile_unit
> with the same stmt_value we'll call dwarf_decode_lines again,
> and this time we need the pc/line mapping it computes.
> Whereas if we process these in the opposite order we'll only call
> dwarf_decode_lines once. I'm sure this will be confusing at first
> to some later developer going through this code.
> [I could be missing something of course, and I'm happy for any corrections.]

Implemented (omitting some story why I did not include it before).

> The code that processes stmt_list for type_units is in setup_type_unit_groups.
> Note that this code goes to the trouble of re-initializing the buildsym
> machinery (see the calls to restart_symtab in dwarf2read.c) when we process
> the second and subsequent type units that share a stmt_list value.
> This is something that used to be done before your patch and will no
> longer be done with your patch (since if we get a cache hit we exit).
> It may be that the type_unit support is doing this unnecessarily,
> which would be great because we can then simplify it.

I hope this patch should no longer break -fdebug-types-section.
If it additionally enables some future optimization for -fdebug-types-section
the better.

>  > +  /* Offset of line number information in .debug_line section.  */
>  > +  sect_offset offset;
>  > +  unsigned offset_in_dwz : 1;
> 
> IWBN to document why offset_in_dwz is here.
> It's not obvious why it's needed.
+
On Thu, 02 Oct 2014 01:57:03 +0200, Doug Evans wrote:
> Ah, I guess the offset_in_dwz flag will ensure dwarf_decode_lines gets called
> twice regardless of order.  But is that the only reason for the flag?

I have added there now:
+  /* OFFSET is for struct dwz_file associated with dwarf2_per_objfile.  */

If one removes it regressions really happen.  What happens is that this
line_header_hash (former lineinfo_hash) is in struct dwarf2_per_objfile which
is common for both objfile and its objfile.dwz (that one is normally in
/usr/lib/debug/.dwz/ - common for multiple objfiles).  And there are two
different DIEs at offset 0xb - one in objfile and one in objfile.dwz - which
would match single line_header if offset_in_dwz was not there.

Also existing dwarf2read.c code usually transfers "dwz flag" together with DIE
offset, such as:
	dwarf2_find_containing_comp_unit (sect_offset offset,
					  unsigned int offset_in_dwz,
					  struct objfile *objfile)
This reminds me - why doesn't similar ambiguity happen also for dwp_file?
I am unfortunately not much aware of the dwp implementation details.

>  > -      struct line_header *line_header
>  > -	= dwarf_decode_line_header (line_offset, cu);
>  > +      dwarf2_per_objfile->lineinfo_hash =
> 
> As much as I prefer "=" going here, convention says to put it on the
> next line.

I have changed it but this was just blind copy from existing line 21818.

>  > +	htab_create_alloc_ex (127, dwarf2_lineinfo_hash, dwarf2_lineinfo_eq,
> 
> I don't have any data, but 127 seems high.

I have not changed it but this was just blind copy from existing line 21818.

> I wouldn't change it, I just wanted to ask if you have any data
> guiding this choice.

Tuning some constants really makes no sense when GDB has missing + insanely
complicated data structures and in consequence GDB is using inappropriate data
structures with bad algorithmic complexity.  One needs to switch GDB to C++
and its STL before one can start talking about data structures performance.

No regressions on {x86_64,x86_64-m32,i686}-fedora20-linux-gnu in DWZ mode and
in -fdebug-types-section mode.

Thanks,
Jan
gdb/
2014-10-02  Jan Kratochvil  <jan.kratochvil@redhat.com>

	Fix 100x slowdown regression on DWZ files.
	* dwarf2read.c (struct dwarf2_per_objfile): Add line_header_hash.
	(struct line_header): Add offset and offset_in_dwz.
	(dwarf_decode_lines): Add parameter decode_mapping to the declaration.
	(free_line_header_voidp): New declaration.
	(line_header_hash, line_header_eq): New functions.
	(dwarf2_build_include_psymtabs): Update dwarf_decode_lines caller.
	(handle_DW_AT_stmt_list): Use dwarf2_per_objfile->line_header_hash.
	(free_line_header_voidp): New function.
	(dwarf_decode_line_header): Initialize offset and offset_in_dwz.
	(dwarf_decode_lines): New parameter decode_mapping, use it.

ping: [patchv2] Fix 100x slowdown regression on DWZ files

Commit Message

Patch