From patchwork Mon Mar  2 16:13:24 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tom de Vries <tdevries@suse.de>
X-Patchwork-Id: 38372
Received: (qmail 112560 invoked by alias); 2 Mar 2020 16:13:30 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>,
	<http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Delivered-To: mailing list gdb-patches@sourceware.org
Received: (qmail 111855 invoked by uid 89); 2 Mar 2020 16:13:29 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-25.1 required=5.0 tests=AWL, BAYES_00,
	GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3,
	SPF_PASS autolearn=ham version=3.3.1 spammy=Age, 8.1, 81,
	consumers
X-HELO: mx2.suse.de
Received: from mx2.suse.de (HELO mx2.suse.de) (195.135.220.15) by
	sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;
	Mon, 02 Mar 2020 16:13:28 +0000
Received: from relay2.suse.de (unknown [195.135.220.254])	by mx2.suse.de
	(Postfix) with ESMTP id D9BF3AF0D	for <gdb-patches@sourceware.org>;
	Mon,  2 Mar 2020 16:13:25 +0000 (UTC)
Date: Mon, 2 Mar 2020 17:13:24 +0100
From: Tom de Vries <tdevries@suse.de>
To: gdb-patches@sourceware.org
Subject: [PATCH][gdb] Skip imports of c++ CUs
Message-ID: <20200302161322.GA13204@delia>
MIME-Version: 1.0
Content-Disposition: inline
User-Agent: Mutt/1.10.1 (2018-07-13)
X-IsSubscribed: yes

Hi,

The DWARF standard appendix E.1 describes techniques that can be used for
compression and deduplication: DIEs can be factored out into a new compilation
unit, and referenced using DW_FORM_ref_addr.

Such a new compilation unit can either use a DW_TAG_compile_unit or
DW_TAG_partial_unit.  If a DW_TAG_compile_unit is used, its contents is
evaluated by consumers as though it were an ordinary compilation unit.  If a
DW_TAG_partial_unit is used, it's only considered by consumers in the context
of a DW_TAG_imported_unit.

An example of when DW_TAG_partial_unit is required is when the factored out
DIEs are not top-level, f.i. because they were children of a namespace.  In
such a case the corresponding DW_TAG_imported_unit will occur as child of the
namespace.

In the case of factoring out DIEs from c++ compilation units, we can factor
out into a new DW_TAG_compile_unit, and no DW_TAG_imported_unit is required.

This begs the question how to interpret a top-level DW_TAG_imported_unit of a
c++ DW_TAG_compile_unit compilation unit.  The semantics of
DW_TAG_imported_unit describe that the imported unit logically appears at the
point of the DW_TAG_imported_unit entry.  But it's not clear what the effect
should be in this case, since all the imported DIEs are already globally
visible anyway, due to the use of DW_TAG_compile_unit.

So, skip top-level imports of c++ DW_TAG_compile_unit compilation units in
process_imported_unit_die.

Using the cc1 binary from PR23710 comment 1 and setting a breakpoint on do_rpo_vn:
...
$ gdb \
    -batch \
    -iex "maint set dwarf max-cache-age 316" \
    -iex "set language c++" \
    -ex "b do_rpo_vn" \
    cc1
...
we get a 8.1% reduction in execution time, due to reducing the number of
partial symtabs expanded into full symtabs from 212 to 175.

Build and reg-tested on x86_64-linux.

OK for trunk?

Thanks,
- Tom

[gdb] Skip imports of c++ CUs

gdb/ChangeLog:

2020-03-02  Tom de Vries  <tdevries@suse.de>

	PR gdb/23710
	* dwarf2/read.h (struct dwarf2_per_cu_data): Add unit_type and lang
	fields.
	* dwarf2/read.c (process_psymtab_comp_unit): Initialize unit_type and lang
	fields.
	(process_imported_unit_die): Skip import of c++ CUs.
---
 gdb/dwarf2/read.c | 22 ++++++++++++++++++++++
 gdb/dwarf2/read.h |  6 ++++++
 2 files changed, 28 insertions(+)

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 07cee58c1f..fd16ebb3f7 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -7425,6 +7425,18 @@ process_psymtab_comp_unit (struct dwarf2_per_cu_data *this_cu,
 
   cutu_reader reader (this_cu, NULL, 0, false);
 
+  switch (reader.comp_unit_die->tag)
+    {
+    case DW_TAG_compile_unit:
+      this_cu->unit_type = DW_UT_compile;
+      break;
+    case DW_TAG_partial_unit:
+      this_cu->unit_type = DW_UT_partial;
+      break;
+    default:
+      abort ();
+    }
+
   if (reader.dummy_p)
     {
       /* Nothing.  */
@@ -7438,6 +7450,8 @@ process_psymtab_comp_unit (struct dwarf2_per_cu_data *this_cu,
 				      reader.comp_unit_die,
 				      pretend_language);
 
+  this_cu->lang = this_cu->cu->language;
+
   /* Age out any secondary CUs.  */
   age_cached_comp_units (this_cu->dwarf2_per_objfile);
 }
@@ -9760,6 +9774,14 @@ process_imported_unit_die (struct die_info *die, struct dwarf2_cu *cu)
 	= dwarf2_find_containing_comp_unit (sect_off, is_dwz,
 					    cu->per_cu->dwarf2_per_objfile);
 
+      /* We're importing a C++ compilation unit with tag DW_TAG_compile_unit
+	 into another compilation unit, at root level.  Regard this as a hint,
+	 and ignore it.  */
+      if (die->parent && die->parent->parent == NULL
+	  && per_cu->unit_type == DW_UT_compile
+	  && per_cu->lang == language_cplus)
+	return;
+
       /* If necessary, add it to the queue and load its DIEs.  */
       if (maybe_queue_comp_unit (cu, per_cu, cu->language))
 	load_full_comp_unit (per_cu, false, cu->language);
diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
index 00652c2b45..f0e3d6bb72 100644
--- a/gdb/dwarf2/read.h
+++ b/gdb/dwarf2/read.h
@@ -323,6 +323,12 @@ struct dwarf2_per_cu_data
      dummy CUs (a CU header, but nothing else).  */
   struct dwarf2_cu *cu;
 
+  /* The unit type of this CU.  */
+  enum dwarf_unit_type unit_type;
+
+  /* The language of this CU.  */
+  enum language lang;
+
   /* The corresponding dwarf2_per_objfile.  */
   struct dwarf2_per_objfile *dwarf2_per_objfile;