From patchwork Tue Oct 29 14:08:25 2024
X-Patchwork-Submitter: Andrew Burgess
X-Patchwork-Id: 99771
From: Andrew Burgess
To: gdb-patches@sourceware.org
Cc: Andrew Burgess
Subject: [PATCHv3 1/5] gdb: add gdbarch method to get execution context from core file
Date: Tue, 29 Oct 2024 14:08:25 +0000
Message-Id: <0ee7841a11e8d00260aa24387a42e04443429a7e.1730205615.git.aburgess@redhat.com>

Add a new gdbarch method which can read the execution context from a
core file.  An execution context, for this commit, means the filename
of the executable used to generate the core file and the arguments
passed to the executable.

In later commits this will be extended further to include the
environment in which the executable was run, but this commit is
already pretty big, so I've split that part out into a later commit.

Initially this new gdbarch method is only implemented for Linux
targets, but a later commit will add FreeBSD support too.

Currently when GDB opens a core file, GDB reports the command and
arguments used to generate the core file.  For example:

  (gdb) core-file ./core.521524
  [New LWP 521524]
  Core was generated by `./gen-core abc def'.

However, this information comes from the psinfo structure in the core
file, and this struct only allows 80 characters for the command and
arguments combined.  If the command and arguments exceed this then
they are truncated.

Additionally, neither the executable nor the arguments are quoted in
the psinfo structure, so if, for example, the executable was named
'aaa bbb' (i.e. contains white space) and was run with the arguments
'ccc' and 'ddd', then when this core file was opened by GDB we'd see:

  (gdb) core-file ./core.521524
  [New LWP 521524]
  Core was generated by `./aaa bbb ccc ddd'.

It is impossible to know if 'bbb' is part of the executable filename,
or another argument.

However, the kernel places the executable command onto the user stack,
where it is pointed to by the AT_EXECFN entry in the auxv vector.
Additionally, the inferior arguments are all available on the user
stack.  The new gdbarch method added in this commit extracts this
information from the user stack and allows GDB to access it.

The information on the stack is writable by the user, so a user
application can start up, edit the arguments, override the AT_EXECFN
string, and then dump core.  In this case GDB will report incorrect
information.  However, it is worth noting that the psinfo structure is
also filled (by the kernel) by just copying information from the user
stack, so if the user edits the on-stack arguments, the values
reported in psinfo will change too.  The new approach is therefore no
worse than what we currently have.

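To make the psinfo limitation concrete, here is a minimal standalone
sketch.  It is not GDB or kernel code: psinfo_style_command is a
hypothetical helper that joins the command and arguments with single
spaces and cuts the result at the 80 character limit described above.
The exact kernel behaviour is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the psinfo command field: the executable and
// its arguments joined by single spaces, unquoted, then cut to LIMIT
// characters.  The default of 80 is taken from the description above.
static std::string
psinfo_style_command (const std::string &execfn,
                      const std::vector<std::string> &args,
                      std::size_t limit = 80)
{
  std::string cmd = execfn;
  for (const std::string &a : args)
    {
      cmd += ' ';
      cmd += a;
    }

  // Anything past the limit is silently lost.
  if (cmd.size () > limit)
    cmd.resize (limit);
  return cmd;
}
```

Calling psinfo_style_command ("./aaa bbb", {"ccc", "ddd"}) and
psinfo_style_command ("./aaa", {"bbb", "ccc", "ddd"}) gives the same
string, which is the ambiguity described above.
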
The benefit of this approach is that GDB gets to report the full
executable name and all the arguments without the 80 character limit,
and GDB is aware which parts are the executable name, and which parts
are arguments, so we can, for example, style the executable name.

Another benefit is that, now we know all the arguments, we can poke
these into the inferior object.  This means that after loading a core
file a user can 'show args' to see the arguments used.  A user could
even transition from core file debugging to live inferior debugging
using, e.g. 'run', and GDB would restart the inferior with the correct
arguments.

Now the downside: finding the AT_EXECFN string is easy, the auxv entry
points directly to it.  However, finding the arguments is a little
trickier.  There's currently no easy way to get a direct pointer to
the arguments.  Instead, I've got a heuristic which I believe should
find the arguments in most cases.  The algorithm is laid out in
linux-tdep.c, I'll not repeat it here, but it's basically a search of
the user stack, starting from AT_EXECFN.

If the new heuristic fails then GDB just falls back to the old
approach, asking bfd to read the psinfo structure for us, which gives
the old 80 character limited answer.

For testing, I've run this series on (all GNU/Linux) x86-64, s390,
ppc64le, and the new test passes in each case.  I've done some very
basic testing on ARM which does things a little differently than the
other architectures mentioned, see ARM specific notes in
linux_corefile_parse_exec_context_1 for details.

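As a rough illustration of the stack search, the sketch below runs the
same sequence of steps over a vector of pointer-sized words instead of
target memory.  It is a simplified model under stated assumptions, not
the code in linux-tdep.c: find_argv, argv_location, sample_stack and
sample_auxv are hypothetical names, index 0 stands in for the low end
of the segment, and the ARM "stack-limit" case is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Value of AT_EXECFN in <elf.h>.
constexpr std::uint64_t at_execfn = 31;

struct argv_location
{
  std::size_t first_index;  // Index of the argv[0] pointer in WORDS.
  std::uint64_t argc;       // Number of argv entries, including argv[0].
};

// WORDS models the stack segment, lowest address first.  AUXV models
// the contents of the core file's auxv note, null pair included.
static std::optional<argv_location>
find_argv (const std::vector<std::uint64_t> &words,
           std::uint64_t execfn_addr,
           const std::vector<std::uint64_t> &auxv)
{
  // Search down for an (AT_EXECFN, EXECFN_ADDR) key/value pair.
  std::size_t i = words.size ();
  bool found = false;
  while (i-- > 1)
    if (words[i] == execfn_addr && words[i - 1] == at_execfn)
      {
        found = true;
        break;
      }
  if (!found)
    return std::nullopt;

  // Step to the next key, then walk forward pair by pair to the null entry.
  std::size_t k = i + 1;
  while (k + 1 < words.size () && (words[k] != 0 || words[k + 1] != 0))
    k += 2;
  if (k + 1 >= words.size ())
    return std::nullopt;

  // The table size is known, so step back from its end to its start and
  // check that the words found there match the note contents.
  if (auxv.empty () || k + 2 < auxv.size ())
    return std::nullopt;
  std::size_t start = k + 2 - auxv.size ();
  if (start < 2
      || !std::equal (auxv.begin (), auxv.end (), words.begin () + start))
    return std::nullopt;

  // Below the table: the null ending envp, envp itself, then the null
  // ending argv.
  if (words[start - 1] != 0)
    return std::nullopt;
  std::size_t p = start - 2;
  while (p > 0 && words[p] != 0)
    --p;
  if (p == 0)
    return std::nullopt;

  // Scan back over argv, counting entries until a word equals the count.
  --p;
  std::uint64_t argc = 0;
  while (p > 0)
    {
      if (words[p] == 0)
        return std::nullopt;
      if (words[p] == argc)
        break;
      ++argc;
      --p;
    }
  if (p == 0)
    return std::nullopt;

  return argv_location { p + 1, argc };
}

// A made-up auxv note: AT_PAGESZ, AT_EXECFN, and the null entry.
static std::vector<std::uint64_t>
sample_auxv ()
{
  return { 6, 4096, at_execfn, 0x7ffd1000, 0, 0 };
}

// A made-up stack image containing the same auxv table.
static std::vector<std::uint64_t>
sample_stack ()
{
  return {
    0xdead,                                // Unrelated word below argc.
    3,                                     // argc
    0x7ffd2000, 0x7ffd2010, 0x7ffd2020,    // argv[0..2]
    0,                                     // End of argv.
    0x7ffd3000, 0x7ffd3010,                // envp[0..1]
    0,                                     // End of envp.
    6, 4096, at_execfn, 0x7ffd1000, 0, 0,  // auxv table.
    0x1111, 0x2222                         // Stand-in for string data.
  };
}
```

With the sample data, find_argv (sample_stack (), 0x7ffd1000,
sample_auxv ()) locates argc at index 1 and so reports argv[0] at
index 2.  Comparing against the note contents is what gives the search
some confidence that it really has found the auxv table rather than a
stray matching word.
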
---
 gdb/arch-utils.h                              |  57 ++++
 gdb/corefile.c                                |  10 +
 gdb/corelow.c                                 |  38 ++-
 gdb/gdbarch-gen.c                             |  22 ++
 gdb/gdbarch-gen.h                             |  15 +
 gdb/gdbarch.h                                 |   1 +
 gdb/gdbarch_components.py                     |  20 ++
 gdb/linux-tdep.c                              | 293 ++++++++++++++++++
 .../gdb.base/corefile-exec-context.c          |  25 ++
 .../gdb.base/corefile-exec-context.exp        | 102 ++++++
 10 files changed, 579 insertions(+), 4 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/corefile-exec-context.c
 create mode 100644 gdb/testsuite/gdb.base/corefile-exec-context.exp

diff --git a/gdb/arch-utils.h b/gdb/arch-utils.h
index 40c62f30a65..8d9f1625bdd 100644
--- a/gdb/arch-utils.h
+++ b/gdb/arch-utils.h
@@ -74,6 +74,58 @@ struct bp_manipulation_endian
   bp_manipulation_endian
 
+/* Structure returned from gdbarch core_parse_exec_context method.  Wraps
+   the execfn string and a vector containing the inferior argument.  If a
+   gdbarch is unable to parse this information then an empty structure is
+   returned, check the execfn as an indication, if this is nullptr then no
+   other fields should be considered valid.  */
+
+struct core_file_exec_context
+{
+  /* Constructor, just move everything into place.  The EXEC_NAME should
+     never be nullptr.  Only call this constructor if all the arguments
+     have been collected successfully, i.e. if the EXEC_NAME could be
+     found but not ARGV then use the no-argument constructor to create an
+     empty context object.  */
+  core_file_exec_context (gdb::unique_xmalloc_ptr<char> exec_name,
+                          std::vector<gdb::unique_xmalloc_ptr<char>> argv)
+    : m_exec_name (std::move (exec_name)),
+      m_arguments (std::move (argv))
+  {
+    gdb_assert (m_exec_name != nullptr);
+  }
+
+  /* Create a default context object.  In its default state a context
+     object holds no useful information, and will return false from its
+     valid() method.  */
+  core_file_exec_context () = default;
+
+  /* Return true if this object contains valid context information.  */
+  bool valid () const
+  { return m_exec_name != nullptr; }
+
+  /* Return the execfn string (executable name) as extracted from the core
+     file.  Will always return non-nullptr if valid() returns true.  */
+  const char *execfn () const
+  { return m_exec_name.get (); }
+
+  /* Return the vector of inferior arguments as extracted from the core
+     file.  This does not include argv[0] (the executable name) for that
+     see the execfn() function.  */
+  const std::vector<gdb::unique_xmalloc_ptr<char>> &args () const
+  { return m_arguments; }
+
+private:
+
+  /* The executable filename as reported in the core file.  Can be nullptr
+     if no executable name is found.  */
+  gdb::unique_xmalloc_ptr<char> m_exec_name;
+
+  /* List of arguments.  Doesn't include argv[0] which is the executable
+     name, for this look at m_exec_name field.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> m_arguments;
+};
+
 /* Default implementation of gdbarch_displaced_hw_singlestep.  */
 extern bool default_displaced_step_hw_singlestep (struct gdbarch *);
 
@@ -305,6 +357,11 @@ extern void default_read_core_file_mappings
    read_core_file_mappings_pre_loop_ftype pre_loop_cb,
    read_core_file_mappings_loop_ftype loop_cb);
 
+/* Default implementation of gdbarch_core_parse_exec_context.  Returns
+   an empty core_file_exec_context.  */
+extern core_file_exec_context default_core_parse_exec_context
+  (struct gdbarch *gdbarch, bfd *cbfd);
+
 /* Default implementation of gdbarch
    use_target_description_from_corefile_notes.  */
 extern bool default_use_target_description_from_corefile_notes
diff --git a/gdb/corefile.c b/gdb/corefile.c
index f6ec3cd5ca1..c3089e4516e 100644
--- a/gdb/corefile.c
+++ b/gdb/corefile.c
@@ -35,6 +35,7 @@
 #include "cli/cli-utils.h"
 #include "gdbarch.h"
 #include "interps.h"
+#include "arch-utils.h"
 
 void
 reopen_exec_file (void)
@@ -76,6 +77,15 @@ validate_files (void)
     }
 }
 
+/* See arch-utils.h.  */
+
+core_file_exec_context
+default_core_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  return {};
+}
+
+
 std::string
 memory_error_message (enum target_xfer_status err,
                       struct gdbarch *gdbarch, CORE_ADDR memaddr)
diff --git a/gdb/corelow.c b/gdb/corelow.c
index 5820ffed332..5cc11d71b7b 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -854,7 +854,6 @@ locate_exec_from_corefile_build_id (bfd *abfd, int from_tty)
 void
 core_target_open (const char *arg, int from_tty)
 {
-  const char *p;
   int siggy;
   int scratch_chan;
   int flags;
@@ -990,9 +989,40 @@ core_target_open (const char *arg, int from_tty)
       exception_print (gdb_stderr, except);
     }
 
-  p = bfd_core_file_failing_command (current_program_space->core_bfd ());
-  if (p)
-    gdb_printf (_("Core was generated by `%s'.\n"), p);
+  /* See if the gdbarch can find the executable name and argument list from
+     the core file.  */
+  core_file_exec_context ctx
+    = gdbarch_core_parse_exec_context (target->core_gdbarch (),
+                                       current_program_space->core_bfd ());
+  if (ctx.valid ())
+    {
+      std::string args;
+      for (const auto &a : ctx.args ())
+        {
+          args += ' ';
+          args += a.get ();
+        }
+
+      gdb_printf (_("Core was generated by `%ps%s'.\n"),
+                  styled_string (file_name_style.style (),
+                                 ctx.execfn ()),
+                  args.c_str ());
+
+      /* Copy the arguments into the inferior.  */
+      std::vector<char *> argv;
+      for (const auto &a : ctx.args ())
+        argv.push_back (a.get ());
+      gdb::array_view<char * const> view (argv.data (), argv.size ());
+      current_inferior ()->set_args (view);
+    }
+  else
+    {
+      gdb::unique_xmalloc_ptr<char> failing_command = make_unique_xstrdup
+        (bfd_core_file_failing_command (current_program_space->core_bfd ()));
+      if (failing_command != nullptr)
+        gdb_printf (_("Core was generated by `%s'.\n"),
+                    failing_command.get ());
+    }
 
   /* Clearing any previous state of convenience variables.  */
   clear_exit_convenience_vars ();
diff --git a/gdb/gdbarch-gen.c b/gdb/gdbarch-gen.c
index 0d00cd7c993..6f41ce9d233 100644
--- a/gdb/gdbarch-gen.c
+++ b/gdb/gdbarch-gen.c
@@ -258,6 +258,7 @@ struct gdbarch
   gdbarch_get_pc_address_flags_ftype *get_pc_address_flags = default_get_pc_address_flags;
   gdbarch_read_core_file_mappings_ftype *read_core_file_mappings = default_read_core_file_mappings;
   gdbarch_use_target_description_from_corefile_notes_ftype *use_target_description_from_corefile_notes = default_use_target_description_from_corefile_notes;
+  gdbarch_core_parse_exec_context_ftype *core_parse_exec_context = default_core_parse_exec_context;
 };
 
 /* Create a new ``struct gdbarch'' based on information provided by
@@ -527,6 +528,7 @@ verify_gdbarch (struct gdbarch *gdbarch)
   /* Skip verify of get_pc_address_flags, invalid_p == 0.  */
   /* Skip verify of read_core_file_mappings, invalid_p == 0.  */
   /* Skip verify of use_target_description_from_corefile_notes, invalid_p == 0.  */
+  /* Skip verify of core_parse_exec_context, invalid_p == 0.  */
   if (!log.empty ())
     internal_error (_("verify_gdbarch: the following are invalid ...%s"),
                     log.c_str ());
@@ -1386,6 +1388,9 @@ gdbarch_dump (struct gdbarch *gdbarch, struct ui_file *file)
   gdb_printf (file,
               "gdbarch_dump: use_target_description_from_corefile_notes = <%s>\n",
               host_address_to_string (gdbarch->use_target_description_from_corefile_notes));
+  gdb_printf (file,
+              "gdbarch_dump: core_parse_exec_context = <%s>\n",
+              host_address_to_string (gdbarch->core_parse_exec_context));
   if (gdbarch->dump_tdep != NULL)
     gdbarch->dump_tdep (gdbarch, file);
 }
@@ -5463,3 +5468,20 @@ set_gdbarch_use_target_description_from_corefile_notes (struct gdbarch *gdbarch,
 {
   gdbarch->use_target_description_from_corefile_notes = use_target_description_from_corefile_notes;
 }
+
+core_file_exec_context
+gdbarch_core_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  gdb_assert (gdbarch != NULL);
+  gdb_assert (gdbarch->core_parse_exec_context != NULL);
+  if (gdbarch_debug >= 2)
+    gdb_printf (gdb_stdlog, "gdbarch_core_parse_exec_context called\n");
+  return gdbarch->core_parse_exec_context (gdbarch, cbfd);
+}
+
+void
+set_gdbarch_core_parse_exec_context (struct gdbarch *gdbarch,
+                                     gdbarch_core_parse_exec_context_ftype core_parse_exec_context)
+{
+  gdbarch->core_parse_exec_context = core_parse_exec_context;
+}
diff --git a/gdb/gdbarch-gen.h b/gdb/gdbarch-gen.h
index b982fd7cd09..29c5ad705f9 100644
--- a/gdb/gdbarch-gen.h
+++ b/gdb/gdbarch-gen.h
@@ -1751,3 +1751,18 @@ extern void set_gdbarch_read_core_file_mappings (struct gdbarch *gdbarch, gdbarc
 typedef bool (gdbarch_use_target_description_from_corefile_notes_ftype) (struct gdbarch *gdbarch, struct bfd *corefile_bfd);
 extern bool gdbarch_use_target_description_from_corefile_notes (struct gdbarch *gdbarch, struct bfd *corefile_bfd);
 extern void set_gdbarch_use_target_description_from_corefile_notes (struct gdbarch *gdbarch, gdbarch_use_target_description_from_corefile_notes_ftype *use_target_description_from_corefile_notes);
+
+/* Examine the core file bfd object CBFD and try to extract the name of
+   the current executable and the argument list, which are returned in a
+   core_file_exec_context object.
+
+   If for any reason the details can't be extracted from CBFD then an
+   empty context is returned.
+
+   It is required that the current inferior be the one associated with
+   CBFD, strings are read from the current inferior using target methods
+   which all assume current_inferior() is the one to read from.  */
+
+typedef core_file_exec_context (gdbarch_core_parse_exec_context_ftype) (struct gdbarch *gdbarch, bfd *cbfd);
+extern core_file_exec_context gdbarch_core_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd);
+extern void set_gdbarch_core_parse_exec_context (struct gdbarch *gdbarch, gdbarch_core_parse_exec_context_ftype *core_parse_exec_context);
diff --git a/gdb/gdbarch.h b/gdb/gdbarch.h
index 60a0f60df39..8359ae762de 100644
--- a/gdb/gdbarch.h
+++ b/gdb/gdbarch.h
@@ -59,6 +59,7 @@ struct ui_out;
 struct inferior;
 struct x86_xsave_layout;
 struct solib_ops;
+struct core_file_exec_context;
 
 #include "regcache.h"
 
diff --git a/gdb/gdbarch_components.py b/gdb/gdbarch_components.py
index 4006380076d..7a218605d89 100644
--- a/gdb/gdbarch_components.py
+++ b/gdb/gdbarch_components.py
@@ -2778,3 +2778,23 @@ The corefile's bfd is passed through COREFILE_BFD.
     predefault="default_use_target_description_from_corefile_notes",
     invalid=False,
 )
+
+Method(
+    comment="""
+Examine the core file bfd object CBFD and try to extract the name of
+the current executable and the argument list, which are returned in a
+core_file_exec_context object.
+
+If for any reason the details can't be extracted from CBFD then an
+empty context is returned.
+
+It is required that the current inferior be the one associated with
+CBFD, strings are read from the current inferior using target methods
+which all assume current_inferior() is the one to read from.
+""",
+    type="core_file_exec_context",
+    name="core_parse_exec_context",
+    params=[("bfd *", "cbfd")],
+    predefault="default_core_parse_exec_context",
+    invalid=False,
+)
diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index 65ec221ef48..0c81bd72de8 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -1835,6 +1835,297 @@ linux_corefile_thread (struct thread_info *info,
     }
 }
 
+/* Try to extract the inferior arguments, environment, and executable name
+   from core file CBFD.  */
+
+static core_file_exec_context
+linux_corefile_parse_exec_context_1 (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  gdb_assert (gdbarch != nullptr);
+
+  /* If there's no core file loaded then we're done.  */
+  if (cbfd == nullptr)
+    return {};
+
+  /* This function (currently) assumes the stack grows down.  If this is
+     not the case then this function isn't going to help.  */
+  if (!gdbarch_stack_grows_down (gdbarch))
+    return {};
+
+  int ptr_bytes = gdbarch_ptr_bit (gdbarch) / TARGET_CHAR_BIT;
+
+  /* Find the .auxv section in the core file.  The BFD library creates this
+     for us from the AUXV note when the BFD is opened.  If the section
+     can't be found then there's nothing more we can do.  */
+  struct bfd_section * section = bfd_get_section_by_name (cbfd, ".auxv");
+  if (section == nullptr)
+    return {};
+
+  /* Grab the contents of the .auxv section.  If we can't get the contents
+     then there's nothing more we can do.  */
+  bfd_size_type size = bfd_section_size (section);
+  if (bfd_section_size_insane (cbfd, section))
+    return {};
+  gdb::byte_vector contents (size);
+  if (!bfd_get_section_contents (cbfd, section, contents.data (), 0, size))
+    return {};
+
+  /* Parse the .auxv section looking for the AT_EXECFN attribute.  The
+     value of this attribute is a pointer to a string, the string is the
+     executable command.  Additionally, this string is placed at the top of
+     the program stack, and so will be in the same PT_LOAD segment as the
+     argv and envp arrays.  We can use this to try and locate these arrays.
+     If we can't find the AT_EXECFN attribute then we're not going to be
+     able to do anything else here.  */
+  CORE_ADDR execfn_string_addr;
+  if (target_auxv_search (contents, current_inferior ()->top_target (),
+                          gdbarch, AT_EXECFN, &execfn_string_addr) != 1)
+    return {};
+
+  /* Read in the program headers from CBFD.  If we can't do this for any
+     reason then just give up.  */
+  long phdrs_size = bfd_get_elf_phdr_upper_bound (cbfd);
+  if (phdrs_size == -1)
+    return {};
+  gdb::unique_xmalloc_ptr<Elf_Internal_Phdr>
+    phdrs ((Elf_Internal_Phdr *) xmalloc (phdrs_size));
+  int num_phdrs = bfd_get_elf_phdrs (cbfd, phdrs.get ());
+  if (num_phdrs == -1)
+    return {};
+
+  /* Now scan through the headers looking for the one which contains the
+     address held in EXECFN_STRING_ADDR, this is the address of the
+     executable command pointed to by the AT_EXECFN auxv entry.  */
+  Elf_Internal_Phdr *hdr = nullptr;
+  for (int i = 0; i < num_phdrs; i++)
+    {
+      /* The program header that contains the address EXECFN_STRING_ADDR
+         should be one where all content is contained within CBFD, hence
+         the check that the file size matches the memory size.  */
+      if (phdrs.get ()[i].p_type == PT_LOAD
+          && phdrs.get ()[i].p_vaddr <= execfn_string_addr
+          && (phdrs.get ()[i].p_vaddr
+              + phdrs.get ()[i].p_memsz) > execfn_string_addr
+          && phdrs.get ()[i].p_memsz == phdrs.get ()[i].p_filesz)
+        {
+          hdr = &phdrs.get ()[i];
+          break;
+        }
+    }
+
+  /* If we failed to find a suitable program header then give up.  */
+  if (hdr == nullptr)
+    return {};
+
+  /* As we assume the stack grows down (see early check in this function)
+     we know that the information we are looking for sits somewhere between
+     EXECFN_STRING_ADDR and the segment's virtual address.  These define
+     the HIGH and LOW addresses between which we are going to search.  */
+  CORE_ADDR low = hdr->p_vaddr;
+  CORE_ADDR high = execfn_string_addr;
+
+  /* This PTR is going to be the address we are currently accessing.  */
+  CORE_ADDR ptr = align_down (high, ptr_bytes);
+
+  /* Setup DEREF a helper function which loads a value from an address.
+     The returned value is always placed into a uint64_t, even if we only
+     load 4-bytes, this allows the code below to be pretty generic.  All
+     the values we're dealing with are unsigned, so this should be OK.  */
+  enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
+  const auto deref = [=] (CORE_ADDR p) -> uint64_t
+    {
+      ULONGEST value = read_memory_unsigned_integer (p, ptr_bytes, byte_order);
+      return (uint64_t) value;
+    };
+
+  /* Now search down through memory looking for a PTR_BYTES sized object
+     which contains the value EXECFN_STRING_ADDR.  The hope is that this
+     will be the AT_EXECFN entry in the auxv table.  There is no guarantee
+     that we'll find the auxv table this way, but we will do our best to
+     validate that what we find is the auxv table, see below.  */
+  while (ptr > low)
+    {
+      if (deref (ptr) == execfn_string_addr
+          && (ptr - ptr_bytes) > low
+          && deref (ptr - ptr_bytes) == AT_EXECFN)
+        break;
+
+      ptr -= ptr_bytes;
+    }
+
+  /* If we reached the lower bound then we failed -- bail out.  */
+  if (ptr <= low)
+    return {};
+
+  /* Assuming that we are looking at a value field in the auxv table, move
+     forward PTR_BYTES bytes so we are now looking at the next key field in
+     the auxv table, then scan forward until we find the null entry which
+     will be the last entry in the auxv table.  */
+  ptr += ptr_bytes;
+  while ((ptr + (2 * ptr_bytes)) < high
+         && (deref (ptr) != 0 || deref (ptr + ptr_bytes) != 0))
+    ptr += (2 * ptr_bytes);
+
+  /* PTR now points to the null entry in the auxv table, or we think it
+     does.  Now we want to find the start of the auxv table.  There's no
+     in-memory pattern we can search for at the start of the table, but
+     we can find the start based on the size of the .auxv section within
+     the core file CBFD object.  In the actual core file the auxv is held
+     in a note, but the bfd library makes this into a section for us.
+
+     The addition of (2 * PTR_BYTES) here is because PTR is pointing at the
+     null entry, but the null entry is also included in CONTENTS.  */
+  ptr = ptr + (2 * ptr_bytes) - contents.size ();
+
+  /* If we reached the lower bound then we failed -- bail out.  */
+  if (ptr <= low)
+    return {};
+
+  /* PTR should now be pointing to the start of the auxv table mapped into
+     the inferior memory.  As we got here using a heuristic then let's
+     compare an auxv table sized block of inferior memory, if this matches
+     then it's not a guarantee that we are in the right place, but it does
+     make it more likely.  */
+  gdb::byte_vector target_contents (size);
+  if (target_read_memory (ptr, target_contents.data (), size) != 0)
+    memory_error (TARGET_XFER_E_IO, ptr);
+  if (memcmp (contents.data (), target_contents.data (), size) != 0)
+    return {};
+
+  /* We have reasonable confidence that PTR points to the start of the auxv
+     table.  Below this should be the null terminated list of pointers to
+     environment strings, and below that the null terminated list of
+     pointers to argument strings.  After that we should find the
+     argument count.  First, check for the null at the end of the
+     environment list.  */
+  if (deref (ptr - ptr_bytes) != 0)
+    return {};
+
+  ptr -= (2 * ptr_bytes);
+  while (ptr > low && deref (ptr) != 0)
+    ptr -= ptr_bytes;
+
+  /* If we reached the lower bound then we failed -- bail out.  */
+  if (ptr <= low)
+    return {};
+
+  /* PTR is now pointing to the null entry at the end of the argument
+     string pointer list.  We now want to scan backward to find the entire
+     argument list.  There's no handy null marker that we can look for
+     here, instead, as we scan backward we look for the argument count
+     (argc) value which appears immediately before the argument list.
+
+     Technically, we could have zero arguments, so the argument count would
+     be zero, however, we don't support this case.  If we find a null entry
+     in the argument list before we find the argument count then we just
+     bail out.
+
+     Start by moving to the last argument string pointer, we expect this
+     to be non-null.  */
+  ptr -= ptr_bytes;
+  uint64_t argc = 0;
+  while (ptr > low)
+    {
+      uint64_t val = deref (ptr);
+      if (val == 0)
+        return {};
+
+      if (val == argc)
+        break;
+
+      /* For GNU/Linux on ARM, glibc removes argc from the stack and
+         replaces it with the "stack-limit".  This actually means a pointer
+         to the first argument string.  This is unfortunate, but we can
+         still detect this case.  */
+      if (val == (ptr + ptr_bytes))
+        break;
+
+      argc++;
+      ptr -= ptr_bytes;
+    }
+
+  /* If we reached the lower bound then we failed -- bail out.  */
+  if (ptr <= low)
+    return {};
+
+  /* PTR is now pointing at the argument count value (or where the argument
+     count should be, see notes on ARM above).  Move it forward so we're
+     pointing at the first actual argument string pointer.  */
+  ptr += ptr_bytes;
+
+  /* We can now parse all of the argument strings.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> arguments;
+
+  /* Skip the first argument.  This is the executable command, but we'll
+     load that separately later.  */
+  ptr += ptr_bytes;
+
+  uint64_t v;
+  while ((v = deref (ptr)) != 0)
+    {
+      gdb::unique_xmalloc_ptr<char> str = target_read_string (v, INT_MAX);
+      if (str == nullptr)
+        return {};
+      arguments.emplace_back (std::move (str));
+      ptr += ptr_bytes;
+    }
+
+  /* Skip the null-pointer at the end of the argument list.  We will now
+     be pointing at the first environment string.  */
+  ptr += ptr_bytes;
+
+  /* Parse the environment strings.  Nothing is done with this yet, but
+     will be in a later commit.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> environment;
+  while ((v = deref (ptr)) != 0)
+    {
+      gdb::unique_xmalloc_ptr<char> str = target_read_string (v, INT_MAX);
+      if (str == nullptr)
+        return {};
+      environment.emplace_back (std::move (str));
+      ptr += ptr_bytes;
+    }
+
+  gdb::unique_xmalloc_ptr<char> execfn
+    = target_read_string (execfn_string_addr, INT_MAX);
+  if (execfn == nullptr)
+    return {};
+
+  return core_file_exec_context (std::move (execfn),
+                                 std::move (arguments));
+}
+
+/* Parse and return execution context details from core file CBFD.  */
+
+static core_file_exec_context
+linux_corefile_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  /* Catch and discard memory errors.
+
+     If the core file format is not as we expect then we can easily trigger
+     a memory error while parsing the core file.  We don't want this to
+     prevent the user from opening the core file; the information provided
+     by this function is helpful, but not critical, debugging can continue
+     without it.  Instead just give a warning and return an empty context
+     object.  */
+  try
+    {
+      return linux_corefile_parse_exec_context_1 (gdbarch, cbfd);
+    }
+  catch (const gdb_exception_error &ex)
+    {
+      if (ex.error == MEMORY_ERROR)
+        {
+          warning
+            (_("failed to parse execution context from corefile: %s"),
+             ex.message->c_str ());
+          return {};
+        }
+      else
+        throw;
+    }
+}
+
 /* Fill the PRPSINFO structure with information about the process being
    debugged.  Returns 1 in case of success, 0 for failures.  Please note that
    even if the structure cannot be entirely filled (e.g., GDB was unable to
@@ -2785,6 +3076,8 @@ linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch,
   set_gdbarch_infcall_mmap (gdbarch, linux_infcall_mmap);
   set_gdbarch_infcall_munmap (gdbarch, linux_infcall_munmap);
   set_gdbarch_get_siginfo_type (gdbarch, linux_get_siginfo_type);
+  set_gdbarch_core_parse_exec_context (gdbarch,
+                                       linux_corefile_parse_exec_context);
 }
 
 void _initialize_linux_tdep ();
diff --git a/gdb/testsuite/gdb.base/corefile-exec-context.c b/gdb/testsuite/gdb.base/corefile-exec-context.c
new file mode 100644
index 00000000000..ed4df606a2d
--- /dev/null
+++ b/gdb/testsuite/gdb.base/corefile-exec-context.c
@@ -0,0 +1,25 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+
+int
+main (int argc, char **argv)
+{
+  abort ();
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.base/corefile-exec-context.exp b/gdb/testsuite/gdb.base/corefile-exec-context.exp
new file mode 100644
index 00000000000..b18a8104779
--- /dev/null
+++ b/gdb/testsuite/gdb.base/corefile-exec-context.exp
@@ -0,0 +1,102 @@
+# Copyright 2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Check GDB can handle reading the full executable name and argument
+# list from a core file.
+#
+# Currently, only Linux supports reading full executable and arguments
+# from a core file.
+require {istarget *-linux*}
+
+standard_testfile
+
+if {[build_executable $testfile.exp $testfile $srcfile] == -1} {
+    untested "failed to compile"
+    return -1
+}
+
+# Linux core files can encode up to 80 characters for the command and
+# arguments in the psinfo.  If BINFILE is less than 80 characters in
+# length then let's try to make it longer.
+set binfile_len [string length $binfile]
+if { $binfile_len <= 80 } {
+    set extra_len [expr 80 - $binfile_len + 1]
+    set extra_str [string repeat "x" $extra_len]
+    set new_binfile $binfile$extra_str
+    remote_exec build "mv $binfile $new_binfile"
+    set binfile $new_binfile
+}
+
+# Generate a core file, this time the inferior has no additional
+# arguments.
+set corefile [core_find $binfile {}]
+if {$corefile == ""} {
+    untested "unable to create corefile"
+    return 0
+}
+set corefile_1 "$binfile.1.core"
+remote_exec build "mv $corefile $corefile_1"
+
+# Load the core file and confirm that the full executable name is
+# seen.
+clean_restart $binfile
+set saw_generated_line false
+gdb_test_multiple "core-file $corefile_1" "load core file no args" {
+    -re "^Core was generated by `[string_to_regexp $binfile]'\\.\r\n" {
+        set saw_generated_line true
+        exp_continue
+    }
+
+    -re "^$gdb_prompt $" {
+        gdb_assert { $saw_generated_line } $gdb_test_name
+    }
+
+    -re "^\[^\r\n\]*\r\n" {
+        exp_continue
+    }
+}
+
+# Generate a core file, this time pass some arguments to the inferior.
+set args "aaaaa bbbbb ccccc ddddd eeeee"
+set corefile [core_find $binfile {} $args]
+if {$corefile == ""} {
+    untested "unable to create corefile"
+    return 0
+}
+set corefile_2 "$binfile.2.core"
+remote_exec build "mv $corefile $corefile_2"
+
+# Load the core file and confirm that the full executable name and
+# argument list are seen.
+clean_restart $binfile
+set saw_generated_line false
+gdb_test_multiple "core-file $corefile_2" "load core file with args" {
+    -re "^Core was generated by `[string_to_regexp $binfile] $args'\\.\r\n" {
+        set saw_generated_line true
+        exp_continue
+    }
+
+    -re "^$gdb_prompt $" {
+        gdb_assert { $saw_generated_line } $gdb_test_name
+    }
+
+    -re "^\[^\r\n\]*\r\n" {
+        exp_continue
+    }
+}
+
+# Also, the argument list should be available through 'show args'.
+gdb_test "show args" \
+    "Argument list to give program being debugged when it is started is \"$args\"\\."
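
For readers who want to try the open-time behaviour outside of GDB,
the sketch below models the choice made in core_target_open: use the
parsed context when it is valid, otherwise fall back to the psinfo
string.  It is only an approximation under stated assumptions:
exec_context_model and generated_by_line are hypothetical names,
std::string stands in for gdb::unique_xmalloc_ptr<char>, and styling
and the set_args call are left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for core_file_exec_context.  An empty EXECFN
// plays the role of the nullptr "invalid" state.
struct exec_context_model
{
  std::string execfn;
  std::vector<std::string> args;

  bool valid () const
  { return !execfn.empty (); }
};

// Build the "Core was generated by" line.  PSINFO_COMMAND models the
// string read from the psinfo note, and may be nullptr.
static std::string
generated_by_line (const exec_context_model &ctx, const char *psinfo_command)
{
  if (ctx.valid ())
    {
      // Preferred path: full executable name followed by every argument.
      std::string line = "Core was generated by `" + ctx.execfn;
      for (const std::string &a : ctx.args)
        {
          line += ' ';
          line += a;
        }
      return line + "'.";
    }

  // Fallback path: whatever the psinfo note recorded, if anything.
  if (psinfo_command != nullptr)
    return std::string ("Core was generated by `") + psinfo_command + "'.";

  return {};
}
```

Passing an invalid context together with a psinfo string exercises the
fallback path, which is what a gdbarch without a
core_parse_exec_context implementation would see.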