From patchwork Mon Jan 30 14:56:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tom Tromey X-Patchwork-Id: 63915 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1721A3858D28 for ; Mon, 30 Jan 2023 14:56:52 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 1721A3858D28 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1675090612; bh=DPz8+v0n1JIugzX1Xlu4lqv1OQjDzhVQETuE2uW4sCE=; h=To:Cc:Subject:Date:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:List-Subscribe:From:Reply-To:From; b=p8LTXX0LZzGbm6P8+AngcIcXK5HsjHcC5Ok0NY0gV4andXNsVlAtCLmvcOropYoCX i6q8Z0NIDijyBuVtks0yvhHXb1nZ+NbmLZdqVOvzwY2lAZEDghE4hECHKfGYJRdhPr kVNK2ECFhEpGg/dQR/sz0kbL1p3ECZzykKT3FLsQ= X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from mail-il1-x12c.google.com (mail-il1-x12c.google.com [IPv6:2607:f8b0:4864:20::12c]) by sourceware.org (Postfix) with ESMTPS id DF2C03858D1E for ; Mon, 30 Jan 2023 14:56:26 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DF2C03858D1E Received: by mail-il1-x12c.google.com with SMTP id m15so564717ilh.9 for ; Mon, 30 Jan 2023 06:56:26 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=DPz8+v0n1JIugzX1Xlu4lqv1OQjDzhVQETuE2uW4sCE=; b=vSx9a60wByKLEFZ7VQKxyRhrDVSEpf+JhSDIkpByDnkZJBcwtm4M5Hh8VEA8F+b2pn QyTPXD7obHW65d6ZcrtX3zG8AOgNDPCde0T+/PrXYBco0VC9x02SrEwWem42S0yep0WP C8yOE0lNgKbh8bhlG/nZzZAdqJsAYE8SkVFkR53HcvdLcOiZhNEBS+ocuz7aYfZjpJnR tRR4qJxFAbwsqRun2RzLEKz4Qo2pjTRfLxtwOJdvFca24PKh2okkQJMe9yxUikTc2Rfm y5prljZdvHoQUs4lUvNSx33nNNr3A1iwqUZhc/cJs/mMKV1+zzFKPQjappv62hFhbLpz M7hQ== X-Gm-Message-State: AO0yUKVYIJ5+guH0YDPkFg8ZkrRxH1UAyWsuGL2EvKqU/5RD7LtsWQBm PTW80mNNT0V1NqB8EuMnbIKMDuxWnZmq36W7 X-Google-Smtp-Source: AK7set9nQxUPQJ5/XfkrdMglSUr2oHJ6h4GJ0HD4M7qMuRKZvmQFmENSzKZocdabTg91/LKwN/SpNQ== X-Received: by 2002:a05:6e02:12eb:b0:310:a087:9dc4 with SMTP id l11-20020a056e0212eb00b00310a0879dc4mr15937133iln.0.1675090586052; Mon, 30 Jan 2023 06:56:26 -0800 (PST) Received: from localhost.localdomain (75-166-146-144.hlrn.qwest.net. [75.166.146.144]) by smtp.gmail.com with ESMTPSA id d15-20020a02604f000000b0039e89ea68f9sm4693734jaf.135.2023.01.30.06.56.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Jan 2023 06:56:25 -0800 (PST) To: gdb-patches@sourceware.org Cc: Tom Tromey Subject: [PATCH v2] Fix comparator bug in cooked index Date: Mon, 30 Jan 2023 07:56:17 -0700 Message-Id: <20230130145617.1619439-1-tromey@adacore.com> X-Mailer: git-send-email 2.38.1 MIME-Version: 1.0 X-Spam-Status: No, score=-11.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Tom Tromey via Gdb-patches From: Tom Tromey Reply-To: Tom Tromey Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org Sender: "Gdb-patches" Simon pointed out that the cooked index template-matching patch introduced a failure in libstdc++ debug mode. In particular, the new code violates the assumption of std::lower_bound and std::upper_bound that the range is sorted with respect to the comparison. When I first debugged this, I thought the problem was unfixable as-is and that a second layer of filtering would have to be done. However, on irc, Simon pointed out that it could perhaps be solved if the comparison function were assured that one operand always came from the index, with the other always being the search string. This patch implements this idea. First, a new mode is introduced: a sorting mode for cooked_index_entry::compare. In this mode, strings are compared case-insensitively, but we're careful to always sort '<' before any other printable character. This way, two names like "func" and "func" will be sorted next to each other -- i.e., "func1" will not be seen between them. This is important when searching. Second, the compare function is changed to work in a strcmp-like way. This makes it easier to test and (IMO) understand. Third, the compare function is modified so that in non-sorting modes, the index entry is always the first argument. This allows consistency in compares. I regression tested this in libstdc++ debug mode on x86-64 Fedora 36. It fixes the crash that Simon saw. This is v2. I believe it addresses the review comments, except for the 'enum class' change, as I mentioned in email on the list. --- gdb/dwarf2/cooked-index.c | 167 ++++++++++++++++++++------------------ gdb/dwarf2/cooked-index.h | 50 ++++++++++-- 2 files changed, 132 insertions(+), 85 deletions(-) diff --git a/gdb/dwarf2/cooked-index.c b/gdb/dwarf2/cooked-index.c index 1a568e6aae7..9829f6f2b1f 100644 --- a/gdb/dwarf2/cooked-index.c +++ b/gdb/dwarf2/cooked-index.c @@ -30,56 +30,43 @@ /* See cooked-index.h. */ -bool +int cooked_index_entry::compare (const char *stra, const char *strb, - bool completing) + comparison_mode mode) { - /* If we've ever matched "<" in both strings, then we disable the - special template parameter handling. */ - bool seen_lt = false; + auto munge = [] (char c) -> unsigned char + { + /* We want to sort '<' before any other printable character. + So, rewrite '<' to something just before ' '. */ + if (c == '<') + return '\x1f'; + return TOLOWER ((unsigned char) c); + }; while (*stra != '\0' && *strb != '\0' - && (TOLOWER ((unsigned char) *stra) - == TOLOWER ((unsigned char ) *strb))) + && (munge (*stra) == munge (*strb))) { - if (*stra == '<') - seen_lt = true; ++stra; ++strb; } - unsigned c1 = TOLOWER ((unsigned char) *stra); - unsigned c2 = TOLOWER ((unsigned char) *strb); + unsigned char c1 = munge (*stra); + unsigned char c2 = munge (*strb); - if (completing) - { - /* When completing, if one string ends earlier than the other, - consider them as equal. Also, completion mode ignores the - special '<' handling. */ - if (c1 == '\0' || c2 == '\0') - return false; - /* Fall through to the generic case. */ - } - else if (seen_lt) - { - /* Fall through to the generic case. */ - } - else if (c1 == '\0' || c1 == '<') - { - /* Maybe they both end at the same spot. */ - if (c2 == '\0' || c2 == '<') - return false; - /* First string ended earlier. */ - return true; - } - else if (c2 == '\0' || c2 == '<') + if (c1 == c2) + return 0; + + /* When completing, if STRB ends earlier than STRA, consider them as + equal. When comparing, if STRB ends earlier and STRA ends with + '<', consider them as equal. */ + if (mode == COMPLETE || (mode == MATCH && c1 == munge ('<'))) { - /* Second string ended earlier. */ - return false; + if (c2 == '\0') + return 0; } - return c1 < c2; + return c1 < c2 ? -1 : 1; } #if GDB_SELF_TEST @@ -89,45 +76,65 @@ namespace { void test_compare () { - SELF_CHECK (!cooked_index_entry::compare ("abcd", "abcd", false)); - SELF_CHECK (!cooked_index_entry::compare ("abcd", "abcd", false)); - SELF_CHECK (!cooked_index_entry::compare ("abcd", "abcd", true)); - SELF_CHECK (!cooked_index_entry::compare ("abcd", "abcd", true)); - - SELF_CHECK (cooked_index_entry::compare ("abcd", "ABCDE", false)); - SELF_CHECK (!cooked_index_entry::compare ("ABCDE", "abcd", false)); - SELF_CHECK (!cooked_index_entry::compare ("abcd", "ABCDE", true)); - SELF_CHECK (!cooked_index_entry::compare ("ABCDE", "abcd", true)); - - SELF_CHECK (!cooked_index_entry::compare ("name", "name<>", false)); - SELF_CHECK (!cooked_index_entry::compare ("name<>", "name", false)); - SELF_CHECK (!cooked_index_entry::compare ("name", "name<>", true)); - SELF_CHECK (!cooked_index_entry::compare ("name<>", "name", true)); - - SELF_CHECK (!cooked_index_entry::compare ("name", "name", false)); - SELF_CHECK (!cooked_index_entry::compare ("name", "name", false)); - SELF_CHECK (!cooked_index_entry::compare ("name", "name", true)); - SELF_CHECK (!cooked_index_entry::compare ("name", "name", true)); - - SELF_CHECK (!cooked_index_entry::compare ("name>", - "name>", false)); - - SELF_CHECK (!cooked_index_entry::compare ("name", "name>", false)); - SELF_CHECK (!cooked_index_entry::compare ("name>", "name", false)); - SELF_CHECK (cooked_index_entry::compare ("name>", - false)); - SELF_CHECK (!cooked_index_entry::compare ("name>", - true)); - SELF_CHECK (!cooked_index_entry::compare ("name>", "name>", "name 0); + SELF_CHECK (cooked_index_entry::compare ("abcd", "ABCDE", + mode_complete) < 0); + SELF_CHECK (cooked_index_entry::compare ("ABCDE", "abcd", + mode_complete) == 0); + + SELF_CHECK (cooked_index_entry::compare ("name", "name<>", + mode_compare) < 0); + SELF_CHECK (cooked_index_entry::compare ("name<>", "name", + mode_compare) == 0); + SELF_CHECK (cooked_index_entry::compare ("name", "name<>", + mode_complete) < 0); + SELF_CHECK (cooked_index_entry::compare ("name<>", "name", + mode_complete) == 0); + + SELF_CHECK (cooked_index_entry::compare ("name", "name", + mode_compare) == 0); + SELF_CHECK (cooked_index_entry::compare ("name", "name", + mode_compare) > 0); + SELF_CHECK (cooked_index_entry::compare ("name", "name", + mode_complete) == 0); + SELF_CHECK (cooked_index_entry::compare ("name", "name", + mode_complete) > 0); + + SELF_CHECK (cooked_index_entry::compare ("name>", + "name>", + mode_compare) == 0); + + SELF_CHECK (cooked_index_entry::compare ("name", "name>", + mode_compare) < 0); + SELF_CHECK (cooked_index_entry::compare ("name>", "name", + mode_compare) == 0); + SELF_CHECK (cooked_index_entry::compare ("name>", "name 0); + SELF_CHECK (cooked_index_entry::compare ("name>", "name 0); + SELF_CHECK (cooked_index_entry::compare ("abcd", "", mode_complete) == 0); + + SELF_CHECK (cooked_index_entry::compare ("func", "func", + mode_sort) < 0); + SELF_CHECK (cooked_index_entry::compare ("func", "func1", + mode_sort) < 0); } } /* anonymous namespace */ @@ -359,20 +366,22 @@ cooked_index::find (const std::string &name, bool completing) const { wait (); + cooked_index_entry::comparison_mode mode = (completing + ? cooked_index_entry::COMPLETE + : cooked_index_entry::MATCH); + auto lower = std::lower_bound (m_entries.cbegin (), m_entries.cend (), name, [=] (const cooked_index_entry *entry, const std::string &n) { - return cooked_index_entry::compare (entry->canonical, n.c_str (), - completing); + return cooked_index_entry::compare (entry->canonical, n.c_str (), mode) < 0; }); auto upper = std::upper_bound (m_entries.cbegin (), m_entries.cend (), name, [=] (const std::string &n, const cooked_index_entry *entry) { - return cooked_index_entry::compare (n.c_str (), entry->canonical, - completing); + return cooked_index_entry::compare (entry->canonical, n.c_str (), mode) > 0; }); return range (lower, upper); diff --git a/gdb/dwarf2/cooked-index.h b/gdb/dwarf2/cooked-index.h index e12376cdf98..76e571795fa 100644 --- a/gdb/dwarf2/cooked-index.h +++ b/gdb/dwarf2/cooked-index.h @@ -143,16 +143,54 @@ struct cooked_index_entry : public allocate_on_obstack STORAGE. */ const char *full_name (struct obstack *storage) const; - /* Compare two strings, case-insensitively. Return true if STRA is - less than STRB. If one string has template parameters, but the - other does not, then they are considered to be equal; so for - example "t" == "t", "t" < "t", but "t" == "t". */ - static bool compare (const char *stra, const char *strb, bool completing); + /* Comparison modes for the 'compare' function. See the function + for a description. */ + enum comparison_mode + { + MATCH, + SORT, + COMPLETE, + }; + + /* Compare two strings, case-insensitively. Return -1 if STRA is + less than STRB, 0 if they are equal, and 1 if STRA is greater. + + When comparing, '<' is considered to be less than all other + printable characters. This ensures that "t" sorts before + "t1", which is necessary when looking up "t". This '<' handling + is to ensure that certain C++ lookups work correctly. It is + inexact, and applied regardless of the search language, but this + is ok because callers of this code do more precise filtering + according to their needs. This is also why using a + case-insensitive comparison works even for languages that are + case sensitive. + + MODE controls how the comparison proceeds. + + MODE==SORT is used when sorting and the only special '<' handling + that it does is to ensure that '<' sorts before all other + printable characters. This ensures that the resulting ordering + will be binary-searchable. + + MODE==MATCH is used when searching for a symbol. In this case, + STRB must always be the search name, and STRA must be the name in + the index that is under consideration. In compare mode, early + termination of STRB may match STRA -- for example, "t" and + "t" will be considered to be equal. (However, if A=="t" and + B=="t", then this will not consider them as equal.) + + MODE==COMPLETE is used when searching for a symbol for + completion. In this case, STRB must always be the search name, + and STRA must be the name in the index that is under + consideration. In completion mode, early termination of STRB + always results in a match. */ + static int compare (const char *stra, const char *strb, + comparison_mode mode); /* Compare two entries by canonical name. */ bool operator< (const cooked_index_entry &other) const { - return compare (canonical, other.canonical, false); + return compare (canonical, other.canonical, SORT) < 0; } /* The name as it appears in DWARF. This always points into one of