From patchwork Tue Jan 9 14:26:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Burgess X-Patchwork-Id: 83646 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 180F43857BBC for ; Tue, 9 Jan 2024 14:30:51 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id B5B7D3858291 for ; Tue, 9 Jan 2024 14:26:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B5B7D3858291 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org B5B7D3858291 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1704810418; cv=none; b=LpbtroWc3MRLJ/Bx61fkKy7tvlHaG8YZWJmhNWOG3h5/cmITUtgNszcvEnP4+22NqECEJGetKXzPHg0sbJbMW26WRTbHYA0L2KGapS5uH82wM2CjzVHcwg7p5w95BJeNCmWBqK1/X2Vv7l5wrRl6nol+SAnYsV6TJBQ6nryhZxI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1704810418; c=relaxed/simple; bh=AU5jWBPNeMcAOSKBQwc9QvNlx2xXjxW9Z1RfZUJZK60=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=BqpMXMXQjtveTwi0Xm/t69cohv00V9ZU8DJFjlITszm31XG6LzL7cJ809VKTW8OJLktgO0SR3Gh+YsIO+r4YUtA+tztpbrineLiIl63fun1IY+TdSW74Gu449DppxSpinkJu4t7TXE4X+lSn1/x6e5O6aIk06ua9mybH8p8Ml2Y= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1704810415; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=do3L8q1b9TPWiVBLQcvd4iOH85pLjTaUHQfEj9rw4is=; b=h8OKOR6dTkttg5iQVMbENAkpCMQ76hYrH8+SDmmJ+DEkdMYMijw6JzoxAJyCjzoIbypfeg oPvVnYqKs7+sDfgFSNogm7EgwybRExKnJaVNElwdOZcTOLT6AdeMN9tOK+nrQoIGGHq2lS f93LAeu+ES1ML4iNTlkeSHBjwoiJYYw= Received: from mail-wm1-f69.google.com (mail-wm1-f69.google.com [209.85.128.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-433-Po2sHzFFOzW_Bu4DyWmCZQ-1; Tue, 09 Jan 2024 09:26:53 -0500 X-MC-Unique: Po2sHzFFOzW_Bu4DyWmCZQ-1 Received: by mail-wm1-f69.google.com with SMTP id 5b1f17b1804b1-40d5d0de143so24976125e9.1 for ; Tue, 09 Jan 2024 06:26:53 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1704810411; x=1705415211; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=do3L8q1b9TPWiVBLQcvd4iOH85pLjTaUHQfEj9rw4is=; b=QwJbKcQZq3FtwV/Ov6H39fHH6EsERdmOgsd4f11RO1eXvbBHC2MEvV7D9afTjuMkzg bxB8l8C6OSFMBe+8QCuH+HrB0zUpqfPhnVdDD8cGM+9RrHdUk3Tr0YeM9KeFhTrstAOH bcvkI+5I/7wnbj7WzXVO9JctB12aKb25s9NHp/KgHEegn6YOGFqFB0jrDWcN8sZ18jQP vWYPQC2VriAIUDHw2Sg4D2BTBhUqb8Edp07ZaXQ3M89yfW8wA7d5nNkr8nrnTikk12JZ N586eNa58iHQdBFXZD9b0mggkP2tZ9E41HIOjh+K/8ywBHgNZhhuxhsv9ZV1UDRS82iG KNSA== X-Gm-Message-State: AOJu0Yw5VFU4Mtv8VClkxhEDAmFF3iF6y3jPTL8sZHUznaIKdaaF9DYy 3ohiwyKVMUEji18Fbc+a8IIaazyl5Q53mIB59RqhA6C00G3IfACDFmxT0qeBG1Yghlw/DbVF3F1 jDAsLayZ9L+ccsHZxA2l/TBGxzAHxAQUE8qduqzuJteizEjKABRkhlEm7JbvRNXJWWKpbme1q1q Yy0YFQAY9uQhu2Mg== X-Received: by 2002:a1c:4b15:0:b0:40d:5f62:773e with SMTP id y21-20020a1c4b15000000b0040d5f62773emr1929384wma.184.1704810411252; Tue, 09 Jan 2024 06:26:51 -0800 (PST) X-Google-Smtp-Source: AGHT+IHUUNb9LkX3sU5FYkfOo1PgLdHkOK+cr3dL6T6y3O0xBeytVJudr34OCwqXcXdMzdYOrZ4Oig== X-Received: by 2002:a1c:4b15:0:b0:40d:5f62:773e with SMTP id y21-20020a1c4b15000000b0040d5f62773emr1929377wma.184.1704810410802; Tue, 09 Jan 2024 06:26:50 -0800 (PST) Received: from localhost (185.223.159.143.dyn.plus.net. [143.159.223.185]) by smtp.gmail.com with ESMTPSA id d4-20020adfe884000000b003367a5b6b69sm2543934wrm.106.2024.01.09.06.26.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 09 Jan 2024 06:26:50 -0800 (PST) From: Andrew Burgess To: gdb-patches@sourceware.org Cc: Andrew Burgess , Michael Weghorn Subject: [PATCH 06/16] gdbsupport: have construct_inferior_arguments take an escape function Date: Tue, 9 Jan 2024 14:26:29 +0000 Message-Id: <1ed825102ecd1f7582659d48d1ac0d5ecdf571e5.1704809585.git.aburgess@redhat.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: References: MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-13.4 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org This commit is a refactor in order to make later commits in this series easier, and continues the work started in the previous commit. The construct_inferior_arguments function takes a vector of strings and does two things: 1. Applies escaping to each string, and 2. Combines all the individual strings into a single string, placing a space between each. The construct_inferior_arguments function is used in a couple of places within GDB and gdbserver, and in each place, the same escaping algorithm is used; it has to be, the algorithm is hard-coded into the function construct_inferior_arguments. However, in later commits I will propose that in different situations we should actually apply different escaping algorithms, depending on whether the incoming arguments are already escaped or not. As a simple concrete example, if we invoke current GDB like this: $ gdb -q --args /tmp/application '$SHELL' ... (gdb) show args Argument list to give program being debugged when it is started is "\$SHELL". Here you can see an example of the default escaping used by construct_inferior_arguments, special shell characters are escaped (the '$' becomes '\$'). The downside of this is that the application will see a literal '$SHELL' string, not the shell expanded $SHELL variable value (e.g. /bin/bash). There's currently no way to get the second behaviour from the GDB command line. This is discussed in PR gdb/28392. If we wanted to add a new GDB command line flag, maybe like this: $ gdb -q --no-escape-args /tmp/application '$SHELL' ... (gdb) show args Argument list to give program being debugged when it is started is "$SHELL". Now, the argument string contains '$SHELL', with no escaping. When the inferior is run (with startup-with-shell on), the application will see the shell expanded value of $SHELL. To achieve this we need construct_inferior_arguments to apply a different escaping algorithm. Another example of where the current construct_inferior_arguments behaviour is not correct is in gdbserver. Like GDB, arguments from the command line are escaped, however, we currently also escape arguments arriving in the vRun packet. Thus, if in GDB we do this: (gdb) set args $SHELL And then run an extended-remote target, the $SHELL will be escaped when it is passed to gdbserver, gdbserver will then hold \$SHELL, which means the inferior will see a literal '$SHELL' string, rather than the shell expanded value of the variable. This is a real bug that exists today in GDB/gdbserver. Given the upcoming patches, I think there is more than a boolean choice between two escaping algorithms, so, at least for now, I propose that construct_inferior_arguments should take a function pointer for a function that will perform the argument escaping. If it turns out that this is excessive then it is easy enough to fold the algorithm selecting back into construct_inferior_arguments and instead pass a bool or enum to choose the correct algorithm, but I think this approach is simple enough. After this commit construct_inferior_arguments merges all the arguments into a single string, using the worker function to escape each argument in turn. However, this commit doesn't actually fix any of the above issues. This is a restructuring commit. All this commit does is change construct_inferior_arguments to take the escaping function, I've split the existing escaping function out from construct_inferior_arguments, and I've updated every call to construct_inferior_arguments to pass in the one and only escaping function. As a result there should be no change in behaviour after this commit. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28392 --- gdb/inferior.c | 5 ++- gdb/inferior.h | 3 +- gdb/main.c | 4 +- gdb/python/py-inferior.c | 2 +- gdbserver/server.cc | 6 ++- gdbsupport/common-inferior.cc | 81 ++++++++++++++++++++--------------- gdbsupport/common-inferior.h | 17 ++++++-- 7 files changed, 73 insertions(+), 45 deletions(-) diff --git a/gdb/inferior.c b/gdb/inferior.c index 076801db51b..ed138888024 100644 --- a/gdb/inferior.c +++ b/gdb/inferior.c @@ -168,9 +168,10 @@ inferior::tty () /* See inferior.h. */ void -inferior::set_args (gdb::array_view args) +inferior::set_args (gdb::array_view args, + escape_args_func escape_func) { - set_args (construct_inferior_arguments (args)); + set_args (construct_inferior_arguments (args, escape_func)); } void diff --git a/gdb/inferior.h b/gdb/inferior.h index e5173e0adac..d3824236ef6 100644 --- a/gdb/inferior.h +++ b/gdb/inferior.h @@ -526,7 +526,8 @@ class inferior : public refcounted_object, }; /* Set the argument string from some strings. */ - void set_args (gdb::array_view args); + void set_args (gdb::array_view args, + escape_args_func escape_func); /* Get the argument string to use when running this inferior. diff --git a/gdb/main.c b/gdb/main.c index 688db7655a9..015ed396f58 100644 --- a/gdb/main.c +++ b/gdb/main.c @@ -1084,8 +1084,8 @@ captured_main_1 (struct captured_main_args *context) symarg = argv[optind]; execarg = argv[optind]; ++optind; - current_inferior ()->set_args - (gdb::array_view (&argv[optind], argc - optind)); + gdb::array_view arg_view (&argv[optind], argc - optind); + current_inferior ()->set_args (arg_view, escape_shell_characters); } else { diff --git a/gdb/python/py-inferior.c b/gdb/python/py-inferior.c index ed153d668ac..8641d8a068b 100644 --- a/gdb/python/py-inferior.c +++ b/gdb/python/py-inferior.c @@ -919,7 +919,7 @@ infpy_set_args (PyObject *self, PyObject *value, void *closure) for (const auto &arg : args) argvec.push_back (arg.get ()); gdb::array_view view (argvec.data (), argvec.size ()); - inf->inferior->set_args (view); + inf->inferior->set_args (view, escape_shell_characters); } else { diff --git a/gdbserver/server.cc b/gdbserver/server.cc index 508e42ee097..52ce9240ca3 100644 --- a/gdbserver/server.cc +++ b/gdbserver/server.cc @@ -3437,7 +3437,8 @@ handle_v_run (char *own_buf) else program_path.set (new_program_name.get ()); - program_args = construct_inferior_arguments (new_argv); + program_args = construct_inferior_arguments (new_argv, + escape_shell_characters); free_vector_argv (new_argv); target_create_inferior (program_path.get (), program_args); @@ -4319,7 +4320,8 @@ captured_main (int argc, char *argv[]) std::vector temp_arg_vector; for (i = 1; i < n; i++) temp_arg_vector.push_back (next_arg[i]); - program_args = construct_inferior_arguments (temp_arg_vector); + program_args = construct_inferior_arguments (temp_arg_vector, + escape_shell_characters); /* Wait till we are at first instruction in program. */ target_create_inferior (program_path.get (), program_args); diff --git a/gdbsupport/common-inferior.cc b/gdbsupport/common-inferior.cc index 076ddc73d51..f5620ec89aa 100644 --- a/gdbsupport/common-inferior.cc +++ b/gdbsupport/common-inferior.cc @@ -28,7 +28,26 @@ bool startup_with_shell = true; /* See common-inferior.h. */ std::string -construct_inferior_arguments (gdb::array_view argv) +construct_inferior_arguments (gdb::array_view argv, + escape_args_func escape_func) +{ + std::string result; + + for (const char *a : argv) + { + if (!result.empty ()) + result += " "; + + result += escape_func (a); + } + + return result; +} + +/* See common-inferior.h. */ + +std::string +escape_shell_characters (const char *arg) { std::string result; @@ -44,55 +63,49 @@ construct_inferior_arguments (gdb::array_view argv) static const char special[] = "\"!#$&*()\\|[]{}<>?'`~^; \t\n"; static const char quote = '\''; #endif - for (int i = 0; i < argv.size (); ++i) + + /* Need to handle empty arguments specially. */ + if (arg[0] == '\0') + { + result += quote; + result += quote; + } + else { - if (i > 0) - result += ' '; +#ifdef __MINGW32__ + bool quoted = false; - /* Need to handle empty arguments specially. */ - if (argv[i][0] == '\0') + if (strpbrk (arg, special)) { - result += quote; + quoted = true; result += quote; } - else +#endif + for (const char *cp = arg; *cp; ++cp) { -#ifdef __MINGW32__ - bool quoted = false; - - if (strpbrk (argv[i], special)) + if (*cp == '\n') { - quoted = true; + /* A newline cannot be quoted with a backslash (it just + disappears), only by putting it inside quotes. */ + result += quote; + result += '\n'; result += quote; } -#endif - for (char *cp = argv[i]; *cp; ++cp) + else { - if (*cp == '\n') - { - /* A newline cannot be quoted with a backslash (it - just disappears), only by putting it inside - quotes. */ - result += quote; - result += '\n'; - result += quote; - } - else - { #ifdef __MINGW32__ - if (*cp == quote) + if (*cp == quote) #else - if (strchr (special, *cp) != NULL) + if (strchr (special, *cp) != NULL) #endif - result += '\\'; - result += *cp; - } + result += '\\'; + result += *cp; } + } #ifdef __MINGW32__ - if (quoted) - result += quote; + if (quoted) + result += quote; #endif - } } return result; diff --git a/gdbsupport/common-inferior.h b/gdbsupport/common-inferior.h index c468c7bf6a8..92d1954c3fe 100644 --- a/gdbsupport/common-inferior.h +++ b/gdbsupport/common-inferior.h @@ -57,9 +57,20 @@ extern const std::string &get_inferior_cwd (); the target is started up with a shell. */ extern bool startup_with_shell; -/* Compute command-line string given argument vector. This does the - same shell processing as fork_inferior. */ +/* The type for a function used by construct_inferior_arguments to add any + quoting needed to an individual argument before combining all the + arguments into a single string. */ +using escape_args_func = std::string (*) (const char *arg); + +/* Return a version of ARG that has special shell characters escaped. */ +extern std::string escape_shell_characters (const char *arg); + +/* Pass each element of ARGS through ESCAPE_FUNC and combine the results + into a single string, separating each element with a single space + character. */ + extern std::string -construct_inferior_arguments (gdb::array_view); +construct_inferior_arguments (gdb::array_view args, + escape_args_func escape_func); #endif /* COMMON_COMMON_INFERIOR_H */