From patchwork Thu Feb 8 16:28:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Burgess X-Patchwork-Id: 85472 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 821CB38582A5 for ; Thu, 8 Feb 2024 16:31:11 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 36ADE3858403 for ; Thu, 8 Feb 2024 16:28:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 36ADE3858403 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 36ADE3858403 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1707409741; cv=none; b=PtlbVk6j8ePR1uk0Qui34lBl+Lzi7hKoFH9C/els16fE/UtTKeprFTRpQJDl+3uI99KUJtvud4duLlTEfecglwV59e/wqMVlqJlMaxJwPPfG16h8jOg/Msl9B26av0gmJXphIRLKJGwXUHDl8Ay3cxr421m6jHD1jttoZ/cFUlA= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1707409741; c=relaxed/simple; bh=INMBhBzHZort4AIjqMgA5bPjfhLuB3pCHBOVRVQxj9s=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=F7iwVY0iel2oqR7Pi3KfITHFn/CRgrPsa7meO3YFWki7MagojBT5+fAAItRN/ynJt67fXimvsjKzBJjnxLsdauJpvCnOAOFj6fL0GVP+LlUonuSEWzu4ABOxQ7rnUzGftg2CA/qFn3tjSmuBG7aMmi5lqzlwo3lvhxIXv+WEcLs= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1707409736; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Rb2FADYk9rpUiHhqBp/o0q+cQlRawv2GQzUW8ssJl/k=; b=fySWAlWH+NnSq78O4sLac4EmEAwANc7c1nPxVDe/4cZg7y7YD6J7xR6euApbGzLB+gBedo mZKwP7UGWhbWNjiw4GNwNlcUtNRJGR7dv385OaA+XsKsrEh4RvYhiRvQeibt12BePsNwf0 qz/9M1dM4kJ8ZZmHeO4ZzKQRbXebzZU= Received: from mail-wm1-f71.google.com (mail-wm1-f71.google.com [209.85.128.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-440-rjhgXl9_P2i8xvcnfCHOKg-1; Thu, 08 Feb 2024 11:28:54 -0500 X-MC-Unique: rjhgXl9_P2i8xvcnfCHOKg-1 Received: by mail-wm1-f71.google.com with SMTP id 5b1f17b1804b1-410040e1fcdso369145e9.0 for ; Thu, 08 Feb 2024 08:28:54 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707409732; x=1708014532; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Rb2FADYk9rpUiHhqBp/o0q+cQlRawv2GQzUW8ssJl/k=; b=Ij5hD+5BNmrMWwMhGRQI8KgLKBPuB3WQeS88YEyojWKC3Q5tpP5W/sFciR0w1aRO9j Jtfu466pCIKDK+RA/jRHWPYwoThLUTVozkEyn8ySqSlC7wzNCSaAjJiz90P7lX/nPs2z sVSB4UHXVeL5D0e6SqNDIsbNmtPWgfYpEyvarOt93eKgnNvmsvUq+BliIy6cMgUgYJAK tQbl6okCtAVFigM9Db/qeCoh5MXSGb7WMDjWBLKD4H/v5jX/SfvAnefMjHBlBbJTplLC YA+7DThaHqqfHCAvpTkZT+jr7YHxeQl6nV8VoOJX6KoUUp+rIUNOxQyRS2zJjd0ZPyV7 au4w== X-Gm-Message-State: AOJu0YwgjJBUFf75wjE+3XZFu/qmVm/wtzvUDfABdV+Kp7Yd9Ka0nS2X PLAZIY2aTPoeIfQqsHG+EMuJIEkF1b6TiK7IAcIpJoXxOFJpHEi6ulU6upamBfc+mkYVLiWchiE uFmv3Zppy3DoDWiyRaUq67VLSoy85KsWAnE5JUVBzyqq9kyWDcn5tbu9+OKzKEMygdCpXVocQkE hHZ4fZUXf4qjmofWg/HzAB8+WDt7gH+Nq84pla/znmAzs= X-Received: by 2002:a05:600c:1550:b0:40f:bd81:e760 with SMTP id f16-20020a05600c155000b0040fbd81e760mr5659771wmg.40.1707409732419; Thu, 08 Feb 2024 08:28:52 -0800 (PST) X-Google-Smtp-Source: AGHT+IHWPvljStWt1v/wDL6ALVo12m047VwOXN/Gs6N50J6fPY5P9KVeYGEqszZvsT6lgPY2rWdR3Q== X-Received: by 2002:a05:600c:1550:b0:40f:bd81:e760 with SMTP id f16-20020a05600c155000b0040fbd81e760mr5659757wmg.40.1707409731864; Thu, 08 Feb 2024 08:28:51 -0800 (PST) Received: from localhost (185.223.159.143.dyn.plus.net. [143.159.223.185]) by smtp.gmail.com with ESMTPSA id fc14-20020a05600c524e00b004102f16d937sm2116619wmb.17.2024.02.08.08.28.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 08 Feb 2024 08:28:51 -0800 (PST) From: Andrew Burgess To: gdb-patches@sourceware.org Cc: Andrew Burgess Subject: [PATCH 4/4] gdbsupport: rewrite gdb_argv to use extract_string_maybe_quoted Date: Thu, 8 Feb 2024 16:28:43 +0000 Message-Id: <666244da341dd7b2d8621da423868dc02afef2a5.1707409662.git.aburgess@redhat.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: References: MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org This commit rewrites the gdb_argv class to make use of the function extract_string_maybe_quoted. I've moved shell_extract_string_ctrl from the unittests directory into the gdbsupport directory to make it more widely available, and this is used within gdb_argv to split arguments and handle escaping correctly. The algorithm behind this rewrite is pretty simple, for an input string we skip white space, and then call extract_string_maybe_quoted to get the next argument. We then repeat this process to get each subsequent argument. The result is almost identical to the libibery buildargv function except for one edge case which I think is a bug in buildargv. The libibery buildargv function, if passed a string that only contains white space, will return a single empty argument. I think this is wrong, and after this rewrite we now return a zero length argument list. This edge case is tested in gdb_argv-selftests.c and also in gdb.python/py-cmd.exp. --- gdb/testsuite/gdb.python/py-cmd.exp | 4 ++ gdb/unittests/extract-string-selftests.c | 3 -- gdb/unittests/gdb_argv-selftests.c | 3 ++ gdbsupport/buildargv.h | 50 ++++++++++++++++-------- gdbsupport/common-utils.cc | 5 +++ gdbsupport/common-utils.h | 11 ++++++ 6 files changed, 57 insertions(+), 19 deletions(-) diff --git a/gdb/testsuite/gdb.python/py-cmd.exp b/gdb/testsuite/gdb.python/py-cmd.exp index 01af475dded..8b1175d44c8 100644 --- a/gdb/testsuite/gdb.python/py-cmd.exp +++ b/gdb/testsuite/gdb.python/py-cmd.exp @@ -146,6 +146,10 @@ gdb_test "python print (gdb.string_to_argv (''))" \ {\[\]} \ "string_to_argv ('')" +gdb_test "python print (gdb.string_to_argv (' '))" \ + {\[\]} \ + "string_to_argv (' ')" + # Test user-defined python commands. gdb_test_multiline "input simple user-defined command" \ "python" "" \ diff --git a/gdb/unittests/extract-string-selftests.c b/gdb/unittests/extract-string-selftests.c index 3b9a5bd104c..5fd8426ec22 100644 --- a/gdb/unittests/extract-string-selftests.c +++ b/gdb/unittests/extract-string-selftests.c @@ -23,9 +23,6 @@ namespace selftests { namespace extract_string { -static extract_string_ctrl shell_extract_string_ctrl - (nullptr, "", "\"$`\\", "\n", "", "\n"); - struct test_def { test_def (const char *input, diff --git a/gdb/unittests/gdb_argv-selftests.c b/gdb/unittests/gdb_argv-selftests.c index b6a87254c93..b374b9b32ec 100644 --- a/gdb/unittests/gdb_argv-selftests.c +++ b/gdb/unittests/gdb_argv-selftests.c @@ -83,6 +83,9 @@ struct test_def /* The array of all tests. */ test_def tests[] = { + { "", { } }, + { " ", { } }, + { "''", { "" } }, { "abc def", {"abc", "def"} }, { "one two 3", {"one", "two", "3"} }, { "one two\\ three", {"one", "two three"} }, diff --git a/gdbsupport/buildargv.h b/gdbsupport/buildargv.h index 8fdd085b386..8e84affed39 100644 --- a/gdbsupport/buildargv.h +++ b/gdbsupport/buildargv.h @@ -1,4 +1,4 @@ -/* RAII wrapper for buildargv +/* RAII wrapper to split an argument string into individual arguments. Copyright (C) 2021-2024 Free Software Foundation, Inc. @@ -21,9 +21,12 @@ #define GDBSUPPORT_BUILDARGV_H #include "libiberty.h" +#include "gdbsupport/common-utils.h" +#include -/* A wrapper for an array of char* that was allocated in the way that - 'buildargv' does, and should be freed with 'freeargv'. */ +/* A class for splitting a single char* string into an array of char* + arguments, each argument is extracted from the original string. The + resulting array will be freed by this classes destructor. */ class gdb_argv { @@ -32,15 +35,15 @@ class gdb_argv /* A constructor that initializes to NULL. */ gdb_argv () - : m_argv (NULL) + : m_argv (nullptr) { } - /* A constructor that calls buildargv on STR. STR may be NULL, in - which case this object is initialized with a NULL array. */ + /* A constructor that splits STR into an array of arguments. STR may be + NULL, in which case this object is initialized with a NULL array. */ explicit gdb_argv (const char *str) - : m_argv (NULL) + : m_argv (nullptr) { reset (str); } @@ -74,18 +77,34 @@ class gdb_argv freeargv (m_argv); } - /* Call buildargv on STR, storing the result in this object. Any - previous state is freed. STR may be NULL, in which case this - object is reset with a NULL array. If buildargv fails due to - out-of-memory, call malloc_failure. Therefore, the value is - guaranteed to be non-NULL, unless the parameter itself is - NULL. */ + /* Read arguments from STR by calling extract_string_maybe_quoted. + Leading and trailing white space in STR will be ignored. Any previous + argument state is freed. STR may be nullptr, in which case this + object is reset with a nullptr array. */ void reset (const char *str) { - char **argv = buildargv (str); freeargv (m_argv); - m_argv = argv; + m_argv = nullptr; + + if (str == nullptr) + return; + + std::vector tmp_argv; + + str = skip_spaces (str); + while (*str != '\0') + { + std::string arg + = extract_string_maybe_quoted (&str, shell_extract_string_ctrl); + tmp_argv.emplace_back (xstrdup (arg.c_str ())); + str = skip_spaces (str); + } + tmp_argv.emplace_back (nullptr); + + size_t sz = sizeof (decltype (tmp_argv)::value_type) * tmp_argv.size (); + m_argv = (char **) xmalloc (sz); + memcpy (m_argv, tmp_argv.data (), sz); } /* Return the underlying array. */ @@ -197,7 +216,6 @@ class gdb_argv private: /* The wrapped array. */ - char **m_argv; }; diff --git a/gdbsupport/common-utils.cc b/gdbsupport/common-utils.cc index 15ee2c43f83..0fb2b6f350a 100644 --- a/gdbsupport/common-utils.cc +++ b/gdbsupport/common-utils.cc @@ -168,6 +168,11 @@ savestring (const char *ptr, size_t len) extract_string_ctrl default_extract_string_ctrl (nullptr, nullptr, nullptr); +/* See common-utils.h. */ + +extract_string_ctrl shell_extract_string_ctrl (nullptr, "", "\"$`\\", + "\n", "", "\n"); + /* See documentation in common-utils.h. */ std::string diff --git a/gdbsupport/common-utils.h b/gdbsupport/common-utils.h index 60578cd7560..bf603ee04e3 100644 --- a/gdbsupport/common-utils.h +++ b/gdbsupport/common-utils.h @@ -203,6 +203,17 @@ struct extract_string_ctrl extern extract_string_ctrl default_extract_string_ctrl; +/* This control object for use with extract_string_maybe_quoted replicates + shell like behaviour. Every character can be escaped in an unquoted + context, nothing can be escaped within single quotes, and a limited set + of characters (see POSIX shell spec) can be escaped within a double + quoted context. + + An escaped newline will be removed in an unquoted and double quoted + context. */ + +extern extract_string_ctrl shell_extract_string_ctrl; + /* Extract the next word from ARG. The next word is defined as either, everything up to the next space, or, if the next word starts with either a single or double quote, then everything up to the closing quote. The