From patchwork Thu Oct 2 15:56:53 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Kratochvil X-Patchwork-Id: 3068 Received: (qmail 32338 invoked by alias); 2 Oct 2014 15:57:05 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 32318 invoked by uid 89); 2 Oct 2014 15:57:04 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-3.1 required=5.0 tests=AWL, BAYES_50, RP_MATCHES_RCVD, SPF_HELO_PASS, SPF_PASS autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Thu, 02 Oct 2014 15:57:02 +0000 Received: from int-mx13.intmail.prod.int.phx2.redhat.com (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s92FuwlK017291 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Thu, 2 Oct 2014 11:56:58 -0400 Received: from host2.jankratochvil.net (ovpn-116-49.ams2.redhat.com [10.36.116.49]) by int-mx13.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id s92Fushj011253 (version=TLSv1/SSLv3 cipher=AES128-GCM-SHA256 bits=128 verify=NO); Thu, 2 Oct 2014 11:56:56 -0400 Date: Thu, 2 Oct 2014 17:56:53 +0200 From: Jan Kratochvil To: Doug Evans Cc: gdb-patches@sourceware.org Subject: [patchv2] Fix 100x slowdown regression on DWZ files Message-ID: <20141002155653.GA9001@host2.jankratochvil.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <21548.37770.274873.760290@ruffy2.mtv.corp.google.com> User-Agent: Mutt/1.5.23 (2014-03-12) X-IsSubscribed: yes On Thu, 02 Oct 2014 01:51:38 +0200, Doug Evans wrote: > I tested this patch with --target_board=dwarf4-gdb-index > and got a failure in m-static.exp: That is particularly with -fdebug-types-section. > Type units read the line table in a separate path, OK, therefore I dropped that separate struct dwarf2_lineinfo and reused struct line_header instead. > OTOH, I do want to avoid any confusion that this patch may inadvertently > introduce. For example, IIUC with your patch as is, > if we read a partial_unit first, before a compile_unit > that has the same stmt_list value, we'll do more processing in > dwarf_decode_lines than we really need to since we only need a file > number to symtab mapping. And if we later read in a compile_unit > with the same stmt_value we'll call dwarf_decode_lines again, > and this time we need the pc/line mapping it computes. > Whereas if we process these in the opposite order we'll only call > dwarf_decode_lines once. I'm sure this will be confusing at first > to some later developer going through this code. > [I could be missing something of course, and I'm happy for any corrections.] Implemented (omitting some story why I did not include it before). > The code that processes stmt_list for type_units is in setup_type_unit_groups. > Note that this code goes to the trouble of re-initializing the buildsym > machinery (see the calls to restart_symtab in dwarf2read.c) when we process > the second and subsequent type units that share a stmt_list value. > This is something that used to be done before your patch and will no > longer be done with your patch (since if we get a cache hit we exit). > It may be that the type_unit support is doing this unnecessarily, > which would be great because we can then simplify it. I hope this patch should no longer break -fdebug-types-section. If it additionally enables some future optimization for -fdebug-types-section the better. > > + /* Offset of line number information in .debug_line section. */ > > + sect_offset offset; > > + unsigned offset_in_dwz : 1; > > IWBN to document why offset_in_dwz is here. > It's not obvious why it's needed. + On Thu, 02 Oct 2014 01:57:03 +0200, Doug Evans wrote: > Ah, I guess the offset_in_dwz flag will ensure dwarf_decode_lines gets called > twice regardless of order. But is that the only reason for the flag? I have added there now: + /* OFFSET is for struct dwz_file associated with dwarf2_per_objfile. */ If one removes it regressions really happen. What happens is that this line_header_hash (former lineinfo_hash) is in struct dwarf2_per_objfile which is common for both objfile and its objfile.dwz (that one is normally in /usr/lib/debug/.dwz/ - common for multiple objfiles). And there are two different DIEs at offset 0xb - one in objfile and one in objfile.dwz - which would match single line_header if offset_in_dwz was not there. Also existing dwarf2read.c code usually transfers "dwz flag" together with DIE offset, such as: dwarf2_find_containing_comp_unit (sect_offset offset, unsigned int offset_in_dwz, struct objfile *objfile) This reminds me - why doesn't similar ambiguity happen also for dwp_file? I am unfortunately not much aware of the dwp implementation details. > > - struct line_header *line_header > > - = dwarf_decode_line_header (line_offset, cu); > > + dwarf2_per_objfile->lineinfo_hash = > > As much as I prefer "=" going here, convention says to put it on the > next line. I have changed it but this was just blind copy from existing line 21818. > > + htab_create_alloc_ex (127, dwarf2_lineinfo_hash, dwarf2_lineinfo_eq, > > I don't have any data, but 127 seems high. I have not changed it but this was just blind copy from existing line 21818. > I wouldn't change it, I just wanted to ask if you have any data > guiding this choice. Tuning some constants really makes no sense when GDB has missing + insanely complicated data structures and in consequence GDB is using inappropriate data structures with bad algorithmic complexity. One needs to switch GDB to C++ and its STL before one can start talking about data structures performance. No regressions on {x86_64,x86_64-m32,i686}-fedora20-linux-gnu in DWZ mode and in -fdebug-types-section mode. Thanks, Jan gdb/ 2014-10-02 Jan Kratochvil Fix 100x slowdown regression on DWZ files. * dwarf2read.c (struct dwarf2_per_objfile): Add line_header_hash. (struct line_header): Add offset and offset_in_dwz. (dwarf_decode_lines): Add parameter decode_mapping to the declaration. (free_line_header_voidp): New declaration. (line_header_hash, line_header_eq): New functions. (dwarf2_build_include_psymtabs): Update dwarf_decode_lines caller. (handle_DW_AT_stmt_list): Use dwarf2_per_objfile->line_header_hash. (free_line_header_voidp): New function. (dwarf_decode_line_header): Initialize offset and offset_in_dwz. (dwarf_decode_lines): New parameter decode_mapping, use it. diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c index 9d0ee13..206dc10 100644 --- a/gdb/dwarf2read.c +++ b/gdb/dwarf2read.c @@ -309,6 +309,9 @@ struct dwarf2_per_objfile /* The CUs we recently read. */ VEC (dwarf2_per_cu_ptr) *just_read_cus; + + /* Table containing line_header indexed by offset and offset_in_dwz. */ + htab_t line_header_hash; }; static struct dwarf2_per_objfile *dwarf2_per_objfile; @@ -1025,6 +1028,12 @@ typedef void (die_reader_func_ftype) (const struct die_reader_specs *reader, which contains the following information. */ struct line_header { + /* Offset of line number information in .debug_line section. */ + sect_offset offset; + + /* OFFSET is for struct dwz_file associated with dwarf2_per_objfile. */ + unsigned offset_in_dwz : 1; + unsigned int total_length; unsigned short version; unsigned int header_length; @@ -1513,7 +1522,7 @@ static struct line_header *dwarf_decode_line_header (unsigned int offset, static void dwarf_decode_lines (struct line_header *, const char *, struct dwarf2_cu *, struct partial_symtab *, - CORE_ADDR); + CORE_ADDR, int decode_mapping); static void dwarf2_start_subfile (const char *, const char *, const char *); @@ -1845,6 +1854,8 @@ static void free_dwo_file_cleanup (void *); static void process_cu_includes (void); static void check_producer (struct dwarf2_cu *cu); + +static void free_line_header_voidp (void *arg); /* Various complaints about symbol reading that don't abort the process. */ @@ -1911,6 +1922,29 @@ dwarf2_invalid_attrib_class_complaint (const char *arg1, const char *arg2) _("invalid attribute class or form for '%s' in '%s'"), arg1, arg2); } + +/* Hash function for line_header_hash. */ + +static hashval_t +line_header_hash (const void *item) +{ + const struct line_header *ofs = item; + + return ofs->offset.sect_off ^ ofs->offset_in_dwz; +} + +/* Equality function for line_header_hash. */ + +static int +line_header_eq (const void *item_lhs, const void *item_rhs) +{ + const struct line_header *ofs_lhs = item_lhs; + const struct line_header *ofs_rhs = item_rhs; + + return (ofs_lhs->offset.sect_off == ofs_rhs->offset.sect_off + && ofs_lhs->offset_in_dwz == ofs_rhs->offset_in_dwz); +} + #if WORDS_BIGENDIAN @@ -4449,7 +4483,7 @@ dwarf2_build_include_psymtabs (struct dwarf2_cu *cu, return; /* No linetable, so no includes. */ /* NOTE: pst->dirname is DW_AT_comp_dir (if present). */ - dwarf_decode_lines (lh, pst->dirname, cu, pst, pst->textlow); + dwarf_decode_lines (lh, pst->dirname, cu, pst, pst->textlow, 1); free_line_header (lh); } @@ -8975,24 +9009,64 @@ static void handle_DW_AT_stmt_list (struct die_info *die, struct dwarf2_cu *cu, const char *comp_dir, CORE_ADDR lowpc) /* ARI: editCase function */ { + struct objfile *objfile = dwarf2_per_objfile->objfile; struct attribute *attr; + unsigned int line_offset; + struct line_header *line_header, line_header_local; + unsigned u; + void **slot; + int decode_mapping; gdb_assert (! cu->per_cu->is_debug_types); attr = dwarf2_attr (die, DW_AT_stmt_list, cu); - if (attr) + if (attr == NULL) + return; + + line_offset = DW_UNSND (attr); + + if (dwarf2_per_objfile->line_header_hash == NULL) { - unsigned int line_offset = DW_UNSND (attr); - struct line_header *line_header - = dwarf_decode_line_header (line_offset, cu); + dwarf2_per_objfile->line_header_hash + = htab_create_alloc_ex (127, line_header_hash, line_header_eq, + free_line_header_voidp, + &objfile->objfile_obstack, + hashtab_obstack_allocate, + dummy_obstack_deallocate); + } - if (line_header) - { - cu->line_header = line_header; - make_cleanup (free_cu_line_header, cu); - dwarf_decode_lines (line_header, comp_dir, cu, NULL, lowpc); - } + line_header_local.offset.sect_off = line_offset; + line_header_local.offset_in_dwz = cu->per_cu->is_dwz; + slot = htab_find_slot (dwarf2_per_objfile->line_header_hash, + &line_header_local, NO_INSERT); + + /* For DW_TAG_compile_unit we need info like symtab::linetable which + is not present in *SLOT. */ + if (die->tag == DW_TAG_partial_unit && slot != NULL) + { + gdb_assert (*slot != NULL); + cu->line_header = *slot; + return; + } + + line_header = dwarf_decode_line_header (line_offset, cu); + if (line_header == NULL) + return; + cu->line_header = line_header; + + slot = htab_find_slot (dwarf2_per_objfile->line_header_hash, + &line_header_local, INSERT); + gdb_assert (slot != NULL); + if (*slot == NULL) + *slot = line_header; + else + { + gdb_assert (die->tag != DW_TAG_partial_unit); + make_cleanup (free_cu_line_header, cu); } + decode_mapping = (die->tag != DW_TAG_partial_unit); + dwarf_decode_lines (line_header, comp_dir, cu, NULL, lowpc, + decode_mapping); } /* Process DW_TAG_compile_unit or DW_TAG_partial_unit. */ @@ -16863,6 +16937,16 @@ free_line_header (struct line_header *lh) xfree (lh); } +/* Stub for free_line_header to match void * callback types. */ + +static void +free_line_header_voidp (void *arg) +{ + struct line_header *lh = arg; + + free_line_header (lh); +} + /* Add an entry to LH's include directory table. */ static void @@ -16993,6 +17077,9 @@ dwarf_decode_line_header (unsigned int offset, struct dwarf2_cu *cu) back_to = make_cleanup ((make_cleanup_ftype *) free_line_header, (void *) lh); + lh->offset.sect_off = offset; + lh->offset_in_dwz = cu->per_cu->is_dwz; + line_ptr = section->buffer + offset; /* Read in the header. */ @@ -17605,18 +17692,23 @@ dwarf_decode_lines_1 (struct line_header *lh, const char *comp_dir, E.g. expand_line_sal requires this when finding psymtabs to expand. A good testcase for this is mb-inline.exp. - LOWPC is the lowest address in CU (or 0 if not known). */ + LOWPC is the lowest address in CU (or 0 if not known). + + Boolean DECODE_MAPPING specifies we need to fully decode .debug_line + for its PC<->lines mapping information. Otherwise only filenames + tables is read in. */ static void dwarf_decode_lines (struct line_header *lh, const char *comp_dir, struct dwarf2_cu *cu, struct partial_symtab *pst, - CORE_ADDR lowpc) + CORE_ADDR lowpc, int decode_mapping) { struct objfile *objfile = cu->objfile; const int decode_for_pst_p = (pst != NULL); struct subfile *first_subfile = current_subfile; - dwarf_decode_lines_1 (lh, comp_dir, cu, decode_for_pst_p, lowpc); + if (decode_mapping) + dwarf_decode_lines_1 (lh, comp_dir, cu, decode_for_pst_p, lowpc); if (decode_for_pst_p) {