From patchwork Tue Dec 20 13:55:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Burgess X-Patchwork-Id: 62191 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 093563858421 for ; Tue, 20 Dec 2022 13:56:04 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 093563858421 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1671544564; bh=HT0ECnVvjk2VTIWwB28BdgKojGXBSE6H2UWis5EeUUA=; h=To:Cc:Subject:Date:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:List-Subscribe:From:Reply-To:From; b=sN7N/rzIFBiWQL/K266sT3ATMo+NbDWZu/mersQYcA67gQmWJTXTbFk8mgNmF8NIL 7zbWD2RK9ZY/Wf0QDTYsw0hMd2TR82AoE07oJ8KKqvsZ9EvWNmn0yUkeNEsu1zc33g RLhQOVN23rLdOX4TA8ui2WmrL0USzfchCXw6Eifg= X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 19F853858421 for ; Tue, 20 Dec 2022 13:55:38 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 19F853858421 Received: from mail-wm1-f71.google.com (mail-wm1-f71.google.com [209.85.128.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-153-uuj7WxAGMPSbYdp1dAkQWw-1; Tue, 20 Dec 2022 08:55:36 -0500 X-MC-Unique: uuj7WxAGMPSbYdp1dAkQWw-1 Received: by mail-wm1-f71.google.com with SMTP id f1-20020a1cc901000000b003cf703a4f08so2668276wmb.2 for ; Tue, 20 Dec 2022 05:55:36 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=HT0ECnVvjk2VTIWwB28BdgKojGXBSE6H2UWis5EeUUA=; b=fvmECmoDTE++aenfCzTQB1Sod6ul1q2hFZPFm5lcWw6s8YSBgNnyJS2pstIxmB8DK7 YJQ000WR641Ec71rhHOA9p3Ns2bE3PXQvOhD/7BcEKEcmRBzpnnfRiwtpo1IcPsDkTI9 ja41P/1362P6/qitOKJmPvVRqYp/miK7V9Urb170SX+7TPHb5UstBVwexd7VsP5A+rX4 07AYkTsriEJTPG9jZ+y/JaSspF9+zIBgB8SbgT6adw+cmb23cd6zwFtryyInvSPa6Anb 9TbOuxm7Kw6LHOPI1ov2Wxf8XeepoXK+HsTjLQZOuX0Un6LW5iPy1S09bqm35gsOavz9 CHxw== X-Gm-Message-State: ANoB5pmU9hqE3AIApcAag/Y41PEbP8sI8Wml55xkDnivv8nbmo+EdaDf vXdejhNqyAxZRAUSjkr1z5FcfaW0FF1kco9CgSJRt42xyc1rvLse9IUYmFlZhwstn1gB+W/ohHF QA9733g7mI+JJrTQ50nuwlx7S713eyf2JjzqqfgPMdv5AOzd1D+K6mXBHL6bzrCapYOMrYDHBzw == X-Received: by 2002:a05:600c:20a:b0:3d2:7e0:3d51 with SMTP id 10-20020a05600c020a00b003d207e03d51mr33692880wmi.17.1671544534832; Tue, 20 Dec 2022 05:55:34 -0800 (PST) X-Google-Smtp-Source: AA0mqf5cbLKbQRh7FKvl8M/e2Gto1BsWRzTtOmZpvM9Wc4UF9fFAXujEmzziPhgTfQI15TAqhl/edA== X-Received: by 2002:a05:600c:20a:b0:3d2:7e0:3d51 with SMTP id 10-20020a05600c020a00b003d207e03d51mr33692860wmi.17.1671544534251; Tue, 20 Dec 2022 05:55:34 -0800 (PST) Received: from localhost ([31.111.84.238]) by smtp.gmail.com with ESMTPSA id bh16-20020a05600c3d1000b003cfd4e6400csm16109238wmb.19.2022.12.20.05.55.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Dec 2022 05:55:33 -0800 (PST) To: gdb-patches@sourceware.org Cc: Andrew Burgess Subject: [PATCH] gdb/c++: fix handling of breakpoints on @plt symbols Date: Tue, 20 Dec 2022 13:55:32 +0000 Message-Id: <756de5175770d79b3f0642fa3035ef348388bda4.1671544509.git.aburgess@redhat.com> X-Mailer: git-send-email 2.25.4 MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP, WEIRD_PORT autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Andrew Burgess via Gdb-patches From: Andrew Burgess Reply-To: Andrew Burgess Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org Sender: "Gdb-patches" This commit should fix PR gdb/20091, PR gdb/17201, and PR gdb/17071. Additionally, PR gdb/17199 relates to this area of code, but is more of a request to refactor some parts of GDB, this commit does not address that request, but it is probably worth reading that PR when looking at this commit. When the current language is C++, and the user places a breakpoint on a function in a shared library, GDB will currently find two locations for the breakpoint, one location will be within the function itself as we would expect, but the other location will be within the PLT table for the call to the named function. Consider this session: $ gdb -q /tmp/breakpoint-shlib-func Reading symbols from /tmp/breakpoint-shlib-func... (gdb) start Temporary breakpoint 1 at 0x40112e: file /tmp/breakpoint-shlib-func.cc, line 20. Starting program: /tmp/breakpoint-shlib-func Temporary breakpoint 1, main () at /tmp/breakpoint-shlib-func.cc:20 20 int answer = foo (); (gdb) break foo Breakpoint 2 at 0x401030 (2 locations) (gdb) info breakpoints Num Type Disp Enb Address What 2 breakpoint keep y 2.1 y 0x0000000000401030 2.2 y 0x00007ffff7fc50fd in foo() at /tmp/breakpoint-shlib-func-lib.cc:20 This is not the expected behaviour. If we compile the same test using a C compiler then we see this: (gdb) break foo Breakpoint 2 at 0x7ffff7fc50fd: file /tmp/breakpoint-shlib-func-c-lib.c, line 20. (gdb) info breakpoints Num Type Disp Enb Address What 2 breakpoint keep y 0x00007ffff7fc50fd in foo at /tmp/breakpoint-shlib-func-c-lib.c:20 Here's what's happening. When GDB parses the symbols in the main executable and the shared library we see a number of different symbols for foo, and use these to create entries in GDB's msymbol table: - In the main executable we see a symbol 'foo@plt' that points at the plt entry for foo, from this we add two entries into GDB's msymbol table, one called 'foo@plt' which points at the plt entry and has type mst_text, then we create a second symbol, this time called 'foo' with type mst_solib_trampoline which also points at the plt entry, - Then, when the shared library is loaded we see another symbol called 'foo', this one points at the actual implementation in the shared library. This time GDB creates a msymbol called 'foo' with type mst_text that points at the implementation. This means that GDB creates 3 msymbols to represent the 2 symbols found in the executable and shared library. When the user creates a breakpoint on 'foo' GDB eventually ends up in search_minsyms_for_name (linespec.c), this function then calls iterate_over_minimal_symbols passing in the name we are looking for wrapped in a lookup_name_info object. In iterate_over_minimal_symbols we iterate over two hash tables (using the name we're looking for as the hash key), first we walk the hash table of symbol linkage names, then we walk the hash table of demangled symbol names. When the language is C++ the symbols for 'foo' will all have been mangled, as a result, in this case, the iteration of the linkage name hash table will find no matching results. However, when we walk the demangled hash table we do find some results. In order to match symbol names, GDB obtains a symbol name matching function by calling the get_symbol_name_matcher method on the language_defn class. For C++, in this case, the matching function we use is cp_fq_symbol_name_matches, which delegates the work to strncmp_iw_with_mode with mode strncmp_iw_mode::MATCH_PARAMS and language set to language_cplus. The strncmp_iw_mode::MATCH_PARAMS mode means that strncmp_iw_mode will skip any parameters in the demangled symbol name when checking for a match, e.g. 'foo' will match the demangled name 'foo()'. The way this is done is that the strings are matched character by character, but, once the string we are looking for ('foo' here) is exhausted, if we are looking at '(' then we consider the match a success. Lets consider the 3 symbols GDB created. If the function declaration is 'void foo ()' then from the main executable we added symbols '_Z3foov@plt' and '_Z3foov', while from the shared library we added another symbol call '_Z3foov'. When these are demangled they become 'foo()@plt', 'foo', and 'foo' respectively. Now, the '_Z3foov' symbol from the main executable has the type mst_solib_trampoline, and in search_minsyms_for_name, we search for any symbols of type mst_solib_trampoline and filter these out of the results. However, the '_Z3foov@plt' symbol (from the main executable), and the '_Z3foov' symbol (from the shared library) both have type mst_text. During the demangled name matching, due to the use of MATCH_PARAMS mode, we stop the comparison as soon as we hit a '(' in the demangled name. And so, '_Z3foov@plt', which demangles to 'foo()@plt' matches 'foo', and '_Z3foov', which demangles to 'foo()' also matches 'foo'. By contrast, for C, there are no demangled hash table entries to be iterated over (in iterate_over_minimal_symbols), we only consider the linkage name symbols which are 'foo@plt' and 'foo'. The plain 'foo' symbol obviously matches when we are looking for 'foo', but in this case the 'foo@plt' will not match due to the '@plt' suffix. And so, when the user asks for a breakpoint in 'foo', and the language is C, search_minsyms_for_name, returns a single msymbol, the mst_text symbol for foo in the shared library, while, when the language is C++, we get two results, '_Z3foov' for the shared library function, and '_Z3foov@plt' for the plt entry in the main executable. I propose to fix this in strncmp_iw_with_mode. When the mode is MATCH_PARAMS, instead of stopping at a '(' and assuming the match is a success, GDB will instead search forward for the matching, closing, ')', effectively skipping the parameter list, and then resume matching. Thus, when comparing 'foo' to 'foo()@plt' GDB will effectively compare against 'foo@plt' (skipping the parameter list), and the match will fail, just as it does when the language is C. There is one slight complication, which is revealed by the test gdb.linespec/cpcompletion.exp, when searching for the symbol of a const member function, the demangled symbol will have 'const' at the end of its name, e.g.: struct_with_const_overload::const_overload_fn() const Previously, the matching would stop at the '(' character, but after my change the whole '()' is skipped, and the match resumes. As a result, the 'const' modifier results in a failure to match, when previously GDB would have found a match. To work around this issue, in strncmp_iw_with_mode, for language C++ and mode MATCH_PARAMS, I explicitly allow a trailing 'const' to be skipped. With these changes in place I now see GDB correctly setting a breakpoint only at the implementation of 'foo' in the shared library. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=20091 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=17201 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=17071 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=17199 --- .../gdb.cp/breakpoint-shlib-func-lib.cc | 21 +++++ gdb/testsuite/gdb.cp/breakpoint-shlib-func.cc | 22 ++++++ .../gdb.cp/breakpoint-shlib-func.exp | 78 +++++++++++++++++++ gdb/utils.c | 26 ++++++- 4 files changed, 146 insertions(+), 1 deletion(-) create mode 100644 gdb/testsuite/gdb.cp/breakpoint-shlib-func-lib.cc create mode 100644 gdb/testsuite/gdb.cp/breakpoint-shlib-func.cc create mode 100644 gdb/testsuite/gdb.cp/breakpoint-shlib-func.exp base-commit: 832a980e1725a4a7fa527f2dcd92a64d7e68ab0f diff --git a/gdb/testsuite/gdb.cp/breakpoint-shlib-func-lib.cc b/gdb/testsuite/gdb.cp/breakpoint-shlib-func-lib.cc new file mode 100644 index 00000000000..c88620a8bc5 --- /dev/null +++ b/gdb/testsuite/gdb.cp/breakpoint-shlib-func-lib.cc @@ -0,0 +1,21 @@ +/* Copyright 2022 Free Software Foundation, Inc. + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +extern int foo (); + +int +foo () +{ + return 0; +} diff --git a/gdb/testsuite/gdb.cp/breakpoint-shlib-func.cc b/gdb/testsuite/gdb.cp/breakpoint-shlib-func.cc new file mode 100644 index 00000000000..3970e44ec8d --- /dev/null +++ b/gdb/testsuite/gdb.cp/breakpoint-shlib-func.cc @@ -0,0 +1,22 @@ +/* Copyright 2022 Free Software Foundation, Inc. + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +extern int foo (); + +int +main () +{ + int answer = foo (); + return answer; +} diff --git a/gdb/testsuite/gdb.cp/breakpoint-shlib-func.exp b/gdb/testsuite/gdb.cp/breakpoint-shlib-func.exp new file mode 100644 index 00000000000..3d15b68c182 --- /dev/null +++ b/gdb/testsuite/gdb.cp/breakpoint-shlib-func.exp @@ -0,0 +1,78 @@ +# Copyright 2022 Free Software Foundation, Inc. +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +# Places a breakpoint on a function in a shared library before the +# inferior has started. GDB will place the breakpoint on the @plt +# symbol in the main executable. +# +# When the inferior is started GDB will re-evaluate the breakpoint +# location and move the breakpoint to the function implementation in +# the shared library. +# +# Then, with the inferior started, delete all breakpoints, and +# re-create the breakpoint on the shared library function, GDB should +# place a single breakpoint on the function implementation in the +# shared library. + +if {[skip_shlib_tests]} { + return 0 +} + +standard_testfile .cc -lib.cc + +set libobj [standard_output_file libfoo.so] +if { [build_executable "build shared library" $libobj $srcfile2 {debug c++ shlib}] != 0 } { + return -1 +} + +if { [prepare_for_testing "failed to prepare" $testfile $srcfile [list debug c++ shlib=$libobj]] } { + return -1 +} + +# Place the breakpoint before the shared library has been loaded, the +# breakpoint should be placed on the @plt symbol. +gdb_test "break foo" "Breakpoint $decimal at $hex" +gdb_test "info breakpoints" "" + +# This is used as an override for delete_breakpoints when we don't +# want functions in gdb.exp to delete breakpoints behind the scenes +# for us. +proc do_not_delete_breakpoints {} { + # Just do nothing. +} + +# Runto main, but don't delete all the breakpoints. +with_override delete_breakpoints do_not_delete_breakpoints { + if ![runto_main] { + return -1 + } +} + +# The breakpoint should now be showing in `foo` for real. +gdb_test "info breakpoints" \ + "\r\n$decimal\\s+\[^\r\n\]+ in foo\\(\\) at \[^\r\n\]+\r\n.*" \ + "check breakpoints after starting the inferior" + +# Now we can delete the breakpoints. +delete_breakpoints + +# And recreate the foo breakpoint, we should only get one location, +# the actual location. +gdb_test "break foo" "Breakpoint $decimal at \[^\r\n\]+" \ + "recreate foo breakpoint" + +# Check the breakpoint was recreated correctly. +gdb_test "info breakpoints" \ + "\r\n$decimal\\s+\[^\r\n\]+ in foo\\(\\) at \[^\r\n\]+" \ + "check breakpoints after recreation" diff --git a/gdb/utils.c b/gdb/utils.c index 4a715b945f6..b50e57b39cd 100644 --- a/gdb/utils.c +++ b/gdb/utils.c @@ -2397,7 +2397,31 @@ strncmp_iw_with_mode (const char *string1, const char *string2, return 0; } else - return (*string1 != '\0' && *string1 != '('); + { + if (*string1 == '(') + { + int p_count = 0; + + do + { + if (*string1 == '(') + ++p_count; + else if (*string1 == ')') + --p_count; + ++string1; + } + while (*string1 != '\0' && p_count > 0); + + /* There maybe things like 'const' after the parameters, + which we do want to ignore. However, if there's an '@' + then this likely indicates something like '@plt' which we + should not ignore. */ + return *string1 == '@'; + } + + return *string1 == '\0' ? 0 : 1; + } + } else return 1;