From patchwork Sun Mar 9 17:31:46 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tom Tromey
X-Patchwork-Id: 107560
From: Tom Tromey
To: gdb-patches@sourceware.org
Cc: Tom Tromey
Subject: [PATCH] Add string cache and use it in cooked index
Date: Sun, 9 Mar 2025 11:31:46 -0600
Message-ID: <20250309173146.1675304-1-tom@tromey.com>
X-Mailer: git-send-email 2.46.1
List-Id: Gdb-patches mailing list

The cooked index needs to allocate names in some cases -- when
canonicalizing or when synthesizing Ada package names. This process
currently uses a vector of unique_ptrs to manage the memory.

Another series I'm writing adds another spot where this allocation
must be done, and examining the result showed that certain names were
allocated multiple times.

To clean this up, this patch introduces a string cache object and
changes the cooked indexer to use it.

I considered using bcache here, but bcache doesn't work as nicely with
string_view -- because bcache is fundamentally memory-based, a
temporary copy of the contents must be made to ensure that bcache can
see the trailing \0. Furthermore, writing a custom class lets us
avoid another copy when canonicalizing C++ names.
---
 gdb/dwarf2/cooked-index.c |  16 ++---
 gdb/dwarf2/cooked-index.h |   3 +-
 gdbsupport/string-set.h   | 138 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 144 insertions(+), 13 deletions(-)
 create mode 100644 gdbsupport/string-set.h

diff --git a/gdb/dwarf2/cooked-index.c b/gdb/dwarf2/cooked-index.c
index 6612585649f..9d3a4b03489 100644
--- a/gdb/dwarf2/cooked-index.c
+++ b/gdb/dwarf2/cooked-index.c
@@ -368,13 +368,11 @@ cooked_index_shard::handle_gnat_encoded_entry
       cooked_index_entry *last = (cooked_index_entry *) *slot;
       if (last == nullptr || last->per_cu != entry->per_cu)
         {
-          gdb::unique_xmalloc_ptr<char> new_name
-            = make_unique_xstrndup (name.data (), name.length ());
+          const char *new_name = m_names.insert (name);
           last = create (entry->die_offset, DW_TAG_module,
-                         IS_SYNTHESIZED, language_ada, new_name.get (), parent,
+                         IS_SYNTHESIZED, language_ada, new_name, parent,
                          entry->per_cu);
           last->canonical = last->name;
-          m_names.push_back (std::move (new_name));
           new_entries.push_back (last);
           *slot = last;
         }
@@ -383,9 +381,7 @@ cooked_index_shard::handle_gnat_encoded_entry
     }
 
   entry->set_parent (parent);
-  auto new_canon = make_unique_xstrndup (tail.data (), tail.length ());
-  entry->canonical = new_canon.get ();
-  m_names.push_back (std::move (new_canon));
+  entry->canonical = m_names.insert (tail);
 }
 
 /* See cooked-index.h.  */
@@ -503,10 +499,7 @@ cooked_index_shard::finalize (const parent_map_map *parent_maps)
               if (canon_name == nullptr)
                 entry->canonical = entry->name;
               else
-                {
-                  entry->canonical = canon_name.get ();
-                  m_names.push_back (std::move (canon_name));
-                }
+                entry->canonical = m_names.insert (std::move (canon_name));
               *slot = entry;
             }
           else
@@ -526,7 +519,6 @@ cooked_index_shard::finalize (const parent_map_map *parent_maps)
   m_entries.insert (m_entries.end (), new_gnat_entries.begin (),
                     new_gnat_entries.end ());
 
-  m_names.shrink_to_fit ();
   m_entries.shrink_to_fit ();
   std::sort (m_entries.begin (), m_entries.end (),
              [] (const cooked_index_entry *a, const cooked_index_entry *b)
diff --git a/gdb/dwarf2/cooked-index.h b/gdb/dwarf2/cooked-index.h
index f6586359770..c6911c23781 100644
--- a/gdb/dwarf2/cooked-index.h
+++ b/gdb/dwarf2/cooked-index.h
@@ -32,6 +32,7 @@
 #include "dwarf2/read.h"
 #include "dwarf2/parent-map.h"
 #include "gdbsupport/range-chain.h"
+#include "gdbsupport/string-set.h"
 #include "complaints.h"
 
 #if CXX_STD_THREAD
@@ -367,7 +368,7 @@ class cooked_index_shard
   /* The addrmap.  This maps address ranges to dwarf2_per_cu objects.  */
   addrmap_fixed *m_addrmap = nullptr;
   /* Storage for canonical names.  */
-  std::vector<gdb::unique_xmalloc_ptr<char>> m_names;
+  gdb::string_set m_names;
 };
 
 using cooked_index_shard_up = std::unique_ptr<cooked_index_shard>;
diff --git a/gdbsupport/string-set.h b/gdbsupport/string-set.h
new file mode 100644
index 00000000000..836dd6a7afc
--- /dev/null
+++ b/gdbsupport/string-set.h
@@ -0,0 +1,138 @@
+/* String-interning set
+
+   Copyright (C) 2025 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef GDBSUPPORT_STRING_SET_H
+#define GDBSUPPORT_STRING_SET_H
+
+#include "gdbsupport/common-utils.h"
+#include "gdbsupport/unordered_set.h"
+#include <string_view>
+
+namespace gdb
+{
+
+/* This is a string-interning set.  It manages storage for strings,
+   ensuring that just a single copy of a given string is kept.  The
+   underlying C string will remain valid for the lifetime of this
+   object.  */
+
+class string_set
+{
+public:
+
+  string_set () = default;
+
+  /* Insert STR into this set.  Returns a pointer to the interned
+     string.  */
+  const char *insert (const char *str)
+  {
+    /* We need to take the length to hash the string anyway, so it's
+       convenient to just wrap it here.  */
+    return insert (std::string_view (str));
+  }
+
+  /* An overload accepting a string.  */
+  const char *insert (const std::string &str)
+  {
+    return m_set.insert (str).first->get ();
+  }
+
+  /* An overload accepting a string view.  */
+  const char *insert (std::string_view str)
+  {
+    return m_set.insert (str).first->get ();
+  }
+
+  /* An overload that takes ownership of the string.  */
+  const char *insert (gdb::unique_xmalloc_ptr<char> str)
+  {
+    local_string ls (std::move (str));
+    return m_set.insert (std::move (ls)).first->get ();
+  }
+
+private:
+
+  /* The type of string we store.  Note that we do not store
+     std::string here to avoid the small-string optimization
+     invalidating a pointer on rehash.  */
+  struct local_string
+  {
+    explicit local_string (std::string_view str)
+      : contents (xstrndup (str.data (), str.size ())),
+        len (str.size ())
+    { }
+
+    explicit local_string (gdb::unique_xmalloc_ptr<char> str)
+      : contents (std::move (str)),
+        len (strlen (contents.get ()))
+    { }
+
+    const char *get () const
+    { return contents.get (); }
+
+    std::string_view as_view () const
+    { return std::string_view (contents.get (), len); }
+
+    /* \0-terminated string contents.  */
+    gdb::unique_xmalloc_ptr<char> contents;
+    /* Length of the string.  */
+    size_t len;
+  };
+
+  /* Equality object for the set.  */
+  struct str_eq
+  {
+    using is_transparent = void;
+
+    bool operator() (std::string_view lhs, const local_string &rhs)
+      const noexcept
+    {
+      return lhs == rhs.as_view ();
+    }
+
+    bool operator() (const local_string &lhs, const local_string &rhs)
+      const noexcept
+    {
+      return strcmp (lhs.get (), rhs.get ()) == 0;
+    }
+  };
+
+  /* Hash object for the set.  */
+  struct str_hash
+  {
+    using is_transparent = void;
+
+    size_t operator() (const local_string &rhs) const noexcept
+    {
+      return fast_hash (rhs.get (), rhs.len);
+    }
+
+    size_t operator() (std::string_view rhs) const noexcept
+    {
+      return fast_hash (rhs.data (), rhs.size ());
+    }
+  };
+
+  /* The strings.  */
+  gdb::unordered_set<local_string, str_hash, str_eq> m_set;
+};
+
+}
+
+#endif /* GDBSUPPORT_STRING_SET_H */
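
For readers who want to try the interning idea outside of GDB, here is a minimal standalone sketch. It is not the code from this patch: interning_set and interning_demo are hypothetical names, and the sketch assumes plain C++17 with std::unordered_map keyed by std::string_view in place of gdb::unordered_set with transparent hashing. What it tries to keep from the patch is the key property: every string lives in its own heap buffer, so the returned pointer should stay valid when the table rehashes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical stand-in for gdb::string_set, for illustration only.
class interning_set
{
public:
  // Return a pointer to the single stored copy of STR, adding it to
  // the set if it is not already present.
  const char *insert (std::string_view str)
  {
    auto it = m_map.find (str);
    if (it != m_map.end ())
      return it->second.get ();

    // Copy the contents into a heap buffer and add the trailing \0.
    auto buf = std::make_unique<char[]> (str.size () + 1);
    std::copy (str.begin (), str.end (), buf.get ());
    buf[str.size ()] = '\0';

    // The key views the heap buffer rather than the caller's storage.
    // The buffer never moves, so the key and the returned pointer
    // remain valid across rehashing.
    const char *result = buf.get ();
    m_map.emplace (std::string_view (result, str.size ()), std::move (buf));
    return result;
  }

  std::size_t size () const
  { return m_map.size (); }

private:
  std::unordered_map<std::string_view, std::unique_ptr<char[]>> m_map;
};

// Exercise the sketch: equal contents share one pointer, even when the
// input view is not \0-terminated, and the pointer survives growth.
void interning_demo ()
{
  interning_set names;
  const char *a = names.insert ("gdb");
  const char raw[] = { 'g', 'd', 'b', 'x' };
  const char *b = names.insert (std::string_view (raw, 3));
  assert (a == b);
  assert (names.size () == 1);

  // Add enough entries to force the table to rehash.
  for (int i = 0; i < 1000; ++i)
    names.insert (std::to_string (i));

  assert (std::strcmp (a, "gdb") == 0);
  assert (names.insert ("gdb") == a);
}
```

Unlike the string_set in the patch, this sketch always copies its input; it has no counterpart to the overload that takes ownership of a unique_xmalloc_ptr, which is what lets the patch avoid a copy when canonicalizing C++ names.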