From patchwork Fri Apr 7 22:27:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tom Tromey X-Patchwork-Id: 67537 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D390E3858438 for ; Fri, 7 Apr 2023 22:28:13 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from outbound-ss-820.bluehost.com (outbound-ss-820.bluehost.com [69.89.24.241]) by sourceware.org (Postfix) with ESMTPS id 4CDD03858D32 for ; Fri, 7 Apr 2023 22:27:56 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 4CDD03858D32 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=tromey.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=tromey.com Received: from cmgw10.mail.unifiedlayer.com (unknown [10.0.90.125]) by progateway2.mail.pro1.eigbox.com (Postfix) with ESMTP id ACA0B10047F8A for ; Fri, 7 Apr 2023 22:27:55 +0000 (UTC) Received: from box5379.bluehost.com ([162.241.216.53]) by cmsmtp with ESMTP id kuYlps4QSFhsVkuYlpNnXi; Fri, 07 Apr 2023 22:27:55 +0000 X-Authority-Reason: nr=8 X-Authority-Analysis: v=2.4 cv=ELjDb3VC c=1 sm=1 tr=0 ts=643098eb a=ApxJNpeYhEAb1aAlGBBbmA==:117 a=ApxJNpeYhEAb1aAlGBBbmA==:17 a=dLZJa+xiwSxG16/P+YVxDGlgEgI=:19 a=dKHAf1wccvYA:10:nop_rcvd_month_year a=Qbun_eYptAEA:10:endurance_base64_authed_username_1 a=mDV3o1hIAAAA:8 a=82FKP-OwGTc7kfbUVSMA:9 a=_FVE-zBwftR9WsbkzFJk:22 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=tromey.com; s=default; h=Content-Transfer-Encoding:MIME-Version:Message-Id:Date:Subject: Cc:To:From:Sender:Reply-To:Content-Type:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: In-Reply-To:References:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=jLJLaexDEBsZF3VB6e4BaJ68AN1cf+wR2EYh2PdTYqI=; b=SY9SfSA04+o5EjQl7J3eyyWQ9o 9rqdbrnaEvUQcNRAdrMuL0CGX0WjuerlyWLwDyd0eZ2xfxJKWQbCnw+I6MHRY+AO14JAdh1yjpapb 2VGfscDYeIDSBGrPba5gKww9V; Received: from 75-166-159-36.hlrn.qwest.net ([75.166.159.36]:57168 helo=localhost.localdomain) by box5379.bluehost.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pkuYl-002sLO-EK; Fri, 07 Apr 2023 16:27:55 -0600 From: Tom Tromey To: gdb-patches@sourceware.org Cc: Tom Tromey Subject: [PATCH] Simplify decode_locdesc Date: Fri, 7 Apr 2023 16:27:42 -0600 Message-Id: <20230407222742.3880841-1-tom@tromey.com> X-Mailer: git-send-email 2.39.2 MIME-Version: 1.0 X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - box5379.bluehost.com X-AntiAbuse: Original Domain - sourceware.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - tromey.com X-BWhitelist: no X-Source-IP: 75.166.159.36 X-Source-L: No X-Exim-ID: 1pkuYl-002sLO-EK X-Source: X-Source-Args: X-Source-Dir: X-Source-Sender: 75-166-159-36.hlrn.qwest.net (localhost.localdomain) [75.166.159.36]:57168 X-Source-Auth: tom+tromey.com X-Email-Count: 1 X-Source-Cap: ZWx5bnJvYmk7ZWx5bnJvYmk7Ym94NTM3OS5ibHVlaG9zdC5jb20= X-Local-Domain: yes X-Spam-Status: No, score=-3026.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org Sender: "Gdb-patches" While looking into another bug, I noticed that the DWARF cooked indexer picks up an address for this symbol: <1><82>: Abbrev Number: 2 (DW_TAG_variable) <83> DW_AT_specification: <0x9f> <87> DW_AT_location : 10 byte block: e 0 0 0 0 0 0 0 0 e0 (DW_OP_const8u: 0 0; DW_OP_GNU_push_tls_address or DW_OP_HP_unknown) <92> DW_AT_linkage_name: (indirect string, offset: 0x156): _ZN9container8tlsvar_0E This happens because decode_locdesc allows the use of DW_OP_GNU_push_tls_address. This didn't make sense to me. I looked into it a bit more, and I think decode_locdesc is used in three ways: 1. Find a constant address of a symbol that happens to be encoded as a location expression. 2. Find the offset of a function in a virtual table. (This one should probably be replaced by code to just evaluate the expression in gnu-v3-abi.c -- but there's no point yet because no compiler actually seems to emit correct DWARF here, see the bug linked in the patch.) 3. Find the offset of a field, if the offset is a constant. None of these require TLS. This patch simplifies decode_locdesc by removing any opcodes that don't fit into the above. It also changes the API a little, to make it less difficult to use. Regression tested on x86-64 Fedora 36. --- gdb/dwarf2/read.c | 203 +++++++++++++++------------------------------- 1 file changed, 66 insertions(+), 137 deletions(-) diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c index 8f35b973f3e..682d0551bdf 100644 --- a/gdb/dwarf2/read.c +++ b/gdb/dwarf2/read.c @@ -942,8 +942,8 @@ static const char *namespace_name (struct die_info *die, static void process_enumeration_scope (struct die_info *, struct dwarf2_cu *); -static CORE_ADDR decode_locdesc (struct dwarf_block *, struct dwarf2_cu *, - bool * = nullptr); +static bool decode_locdesc (struct dwarf_block *, struct dwarf2_cu *, + CORE_ADDR *addr); static enum dwarf_array_dim_ordering read_array_order (struct die_info *, struct dwarf2_cu *); @@ -11457,6 +11457,7 @@ handle_member_location (struct die_info *die, struct dwarf2_cu *cu, if (attr != NULL) { *offset = 0; + CORE_ADDR temp; /* Note that we do not check for a section offset first here. This is because DW_AT_data_member_location is new in DWARF 4, @@ -11466,8 +11467,11 @@ handle_member_location (struct die_info *die, struct dwarf2_cu *cu, *offset = attr->constant_value (0); else if (attr->form_is_section_offset ()) dwarf2_complex_location_expr_complaint (); - else if (attr->form_is_block ()) - *offset = decode_locdesc (attr->as_block (), cu); + else if (attr->form_is_block () + && decode_locdesc (attr->as_block (), cu, &temp)) + { + *offset = temp; + } else dwarf2_complex_location_expr_complaint (); @@ -11520,9 +11524,8 @@ handle_member_location (struct die_info *die, struct dwarf2_cu *cu, dwarf2_complex_location_expr_complaint (); else if (attr->form_is_block ()) { - bool handled; - CORE_ADDR offset = decode_locdesc (attr->as_block (), cu, &handled); - if (handled) + CORE_ADDR offset; + if (decode_locdesc (attr->as_block (), cu, &offset)) field->set_loc_bitpos (offset * bits_per_byte); else { @@ -12242,18 +12245,25 @@ dwarf2_add_member_fn (struct field_info *fip, struct die_info *die, if (attr->form_is_block () && attr->as_block ()->size > 0) { struct dwarf_block *block = attr->as_block (); + CORE_ADDR offset; - if (block->data[0] == DW_OP_constu) + if (block->data[0] == DW_OP_constu + && decode_locdesc (block, cu, &offset)) { - /* Old-style GCC. */ - fnp->voffset = decode_locdesc (block, cu) + 2; + /* "Old"-style GCC. See + https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44126 + for discussion. This was known and a patch available + in 2010, but as of 2023, both GCC and clang still + emit this. */ + fnp->voffset = offset + 2; } - else if (block->data[0] == DW_OP_deref - || (block->size > 1 - && block->data[0] == DW_OP_deref_size - && block->data[1] == cu->header.addr_size)) + else if ((block->data[0] == DW_OP_deref + || (block->size > 1 + && block->data[0] == DW_OP_deref_size + && block->data[1] == cu->header.addr_size)) + && decode_locdesc (block, cu, &offset)) { - fnp->voffset = decode_locdesc (block, cu); + fnp->voffset = offset; if ((fnp->voffset % cu->header.addr_size) != 0) dwarf2_complex_location_expr_complaint (); else @@ -16211,9 +16221,10 @@ cooked_indexer::scan_attributes (dwarf2_per_cu_data *scanning_per_cu, if (!scanning_per_cu->addresses_seen && attr.form_is_block ()) { struct dwarf_block *locdesc = attr.as_block (); - CORE_ADDR addr = decode_locdesc (locdesc, reader->cu); - if (addr != 0 - || reader->cu->per_objfile->per_bfd->has_section_at_zero) + CORE_ADDR addr; + if (decode_locdesc (locdesc, reader->cu, &addr) + && (addr != 0 + || reader->cu->per_objfile->per_bfd->has_section_at_zero)) { low_pc = addr; /* For variables, we don't want to try decoding the @@ -20911,14 +20922,34 @@ read_signatured_type (signatured_type *sig_type, } /* Decode simple location descriptions. - Given a pointer to a dwarf block that defines a location, compute - the location and return the value. If COMPUTED is non-null, it is - set to true to indicate that decoding was successful, and false - otherwise. If COMPUTED is null, then this function may emit a - complaint. */ -static CORE_ADDR -decode_locdesc (struct dwarf_block *blk, struct dwarf2_cu *cu, bool *computed) + Given a pointer to a DWARF block that defines a location, compute + the location. Returns true if the expression was computable by + this function, false otherwise. On a true return, *RESULT is set. + + Note that this function does not implement a full DWARF expression + evaluator. Instead, it is used for a few limited purposes: + + - Getting the address of a symbol that has a constant address. For + example, if a symbol has a location like "DW_OP_addr", the address + can be extracted. + + - Getting the offset of a virtual function in its vtable. There + are two forms of this, one of which involves DW_OP_deref -- so this + function handles derefs in a peculiar way to make this 'work'. + (Probably this area should be rewritten.) + + - Getting the offset of a field, when it is constant. + + Opcodes that cannot be part of a constant expression, for example + those involving registers, simply result in a return of + 'false'. + + This function may emit a complaint. */ + +static bool +decode_locdesc (struct dwarf_block *blk, struct dwarf2_cu *cu, + CORE_ADDR *result) { struct objfile *objfile = cu->per_objfile->objfile; size_t i; @@ -20926,12 +20957,10 @@ decode_locdesc (struct dwarf_block *blk, struct dwarf2_cu *cu, bool *computed) const gdb_byte *data = blk->data; CORE_ADDR stack[64]; int stacki; - unsigned int bytes_read, unsnd; + unsigned int bytes_read; gdb_byte op; - if (computed != nullptr) - *computed = false; - + *result = 0; i = 0; stacki = 0; stack[stacki] = 0; @@ -20977,61 +21006,6 @@ decode_locdesc (struct dwarf_block *blk, struct dwarf2_cu *cu, bool *computed) stack[++stacki] = op - DW_OP_lit0; break; - case DW_OP_reg0: - case DW_OP_reg1: - case DW_OP_reg2: - case DW_OP_reg3: - case DW_OP_reg4: - case DW_OP_reg5: - case DW_OP_reg6: - case DW_OP_reg7: - case DW_OP_reg8: - case DW_OP_reg9: - case DW_OP_reg10: - case DW_OP_reg11: - case DW_OP_reg12: - case DW_OP_reg13: - case DW_OP_reg14: - case DW_OP_reg15: - case DW_OP_reg16: - case DW_OP_reg17: - case DW_OP_reg18: - case DW_OP_reg19: - case DW_OP_reg20: - case DW_OP_reg21: - case DW_OP_reg22: - case DW_OP_reg23: - case DW_OP_reg24: - case DW_OP_reg25: - case DW_OP_reg26: - case DW_OP_reg27: - case DW_OP_reg28: - case DW_OP_reg29: - case DW_OP_reg30: - case DW_OP_reg31: - stack[++stacki] = op - DW_OP_reg0; - if (i < size) - { - if (computed == nullptr) - dwarf2_complex_location_expr_complaint (); - else - return 0; - } - break; - - case DW_OP_regx: - unsnd = read_unsigned_leb128 (NULL, (data + i), &bytes_read); - i += bytes_read; - stack[++stacki] = unsnd; - if (i < size) - { - if (computed == nullptr) - dwarf2_complex_location_expr_complaint (); - else - return 0; - } - break; - case DW_OP_addr: stack[++stacki] = cu->header.read_address (objfile->obfd.get (), &data[i], @@ -21112,37 +21086,7 @@ decode_locdesc (struct dwarf_block *blk, struct dwarf2_cu *cu, bool *computed) global symbols, although the variable's address will be bogus in the psymtab. */ if (i < size) - { - if (computed == nullptr) - dwarf2_complex_location_expr_complaint (); - else - return 0; - } - break; - - case DW_OP_GNU_push_tls_address: - case DW_OP_form_tls_address: - /* The top of the stack has the offset from the beginning - of the thread control block at which the variable is located. */ - /* Nothing should follow this operator, so the top of stack would - be returned. */ - /* This is valid for partial global symbols, but the variable's - address will be bogus in the psymtab. Make it always at least - non-zero to not look as a variable garbage collected by linker - which have DW_OP_addr 0. */ - if (i < size) - { - if (computed == nullptr) - dwarf2_complex_location_expr_complaint (); - else - return 0; - } - stack[stacki]++; - break; - - case DW_OP_GNU_uninit: - if (computed != nullptr) - return 0; + return false; break; case DW_OP_addrx: @@ -21154,41 +21098,26 @@ decode_locdesc (struct dwarf_block *blk, struct dwarf2_cu *cu, bool *computed) break; default: - if (computed == nullptr) - { - const char *name = get_DW_OP_name (op); - - if (name) - complaint (_("unsupported stack op: '%s'"), - name); - else - complaint (_("unsupported stack op: '%02x'"), - op); - } - - return (stack[stacki]); + return false; } /* Enforce maximum stack depth of SIZE-1 to avoid writing outside of the allocated space. Also enforce minimum>0. */ if (stacki >= ARRAY_SIZE (stack) - 1) { - if (computed == nullptr) - complaint (_("location description stack overflow")); - return 0; + complaint (_("location description stack overflow")); + return false; } if (stacki <= 0) { - if (computed == nullptr) - complaint (_("location description stack underflow")); - return 0; + complaint (_("location description stack underflow")); + return false; } } - if (computed != nullptr) - *computed = true; - return (stack[stacki]); + *result = stack[stacki]; + return true; } /* memory allocation interface */