From patchwork Thu Feb 8 16:28:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Burgess X-Patchwork-Id: 85470 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9000238582A9 for ; Thu, 8 Feb 2024 16:30:11 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 894843858C60 for ; Thu, 8 Feb 2024 16:28:50 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 894843858C60 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 894843858C60 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1707409732; cv=none; b=mvvKW7mIJP88rysWUFsOJBnIL4iEyC3dmu4TpGftflVm4MuDAXQeGQ8r9ybJqTZR9AmgPZ4lwf7iPH/+lLMapCUDIbOv0rjvKrb+HwSEFvdOkXy4AzEDKt5AIEJpqdtloENIZ4C2TV1P+Gg6CQ6IbgDFN7+0CrUHge7f0BtXY60= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1707409732; c=relaxed/simple; bh=JSLBee/snA3v35gjrzIvcnGI8uhmPFprUPKp4HD0MYY=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=oon1YVIjUaSYWrysdLSSsyY1wCWfb/gmymadsaHtHmtXURT3c5GUtSCx7wlwHrAX0eNGDR2aypLd8aeYtLOtQPO4xhuuBJOTBLQRvWba3TtaW4XTSZP9+OHcqrStCqOMQeFT3s4yBjQMqiMFD6Tb+3gGxibwxSIzH/X3153n63s= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1707409730; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3mL+uqY7WrHvOwRvYyrUJRuxpuuXctCjM04uEnfdgqg=; b=UKJBeMqritTCKIixcO0Mbpwbve88JdAkmhX52jNhMJg6DQnU6ijUv3PJHW7p41HLDn5XCX xFJjhEoqcIEnsa9ZeFUfPOuitIXw99UhdaAHhUylOuMM/xiUAlG/xIBv3NBYf9xh/OBUBN v/6VrZaIxYy3ECjsm6pykRrdZHtoaF8= Received: from mail-wm1-f72.google.com (mail-wm1-f72.google.com [209.85.128.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-42-A__XPAzGMzCsWQq0-oD9hg-1; Thu, 08 Feb 2024 11:28:49 -0500 X-MC-Unique: A__XPAzGMzCsWQq0-oD9hg-1 Received: by mail-wm1-f72.google.com with SMTP id 5b1f17b1804b1-40e435a606aso115175e9.3 for ; Thu, 08 Feb 2024 08:28:48 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707409727; x=1708014527; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=3mL+uqY7WrHvOwRvYyrUJRuxpuuXctCjM04uEnfdgqg=; b=mkogjpP3gdxM1+JDj1oycdhF5NEy8YAZdj9ye96FeczgkiZEL7TSht0nVOFsq8mxub 9ECG2E2x2ZQhBRgk7opqi7V16PqulsIlXHCTz6ticY6piPxD4SLE37kS6XiT/Uply6Rq yT7bvfPPQWzV3NcNmxDXHFA7oBMbD8nolSLsOYoPsc1f0pHIl9MHtQx3bIAae4G/ah/4 s44+1IzN2ZQzB+9ATiZg9SpuxgKt7+rlpri10R+TyMtNmns6J408sG30MNuSAslnMPoF RewJUOHJKAdT+U1itsU+VS8e8ZQlFZCfZ56kE6mpacQ12vBX8FMvu+//ApBgc/JIpz/Y DwEA== X-Gm-Message-State: AOJu0Yz9DG1iiTIipdjR6G8GlKIbos8Sg0Lu5nQ6zkoszqHkzwRlYz9Z +CZEI9KdkqTCBeip7SWr+RnPvNTz2J64+3F4IdybRBB+zK8QRO3WEz3tM2Ak+xEjTXB/jI4VhOC +gKZYUx8vrqM70M8Y6vRUX/bejRuuyyBflRvIlmPNvK14CBOOmECY98+gvXmEPzSFw6SE1nzjV6 YwJNZXBcoQ/L0kPwFgBbrKKGDPYMu6lI+lLhSWjfncE7U= X-Received: by 2002:a05:600c:4750:b0:410:2793:4e91 with SMTP id w16-20020a05600c475000b0041027934e91mr2450433wmo.34.1707409727020; Thu, 08 Feb 2024 08:28:47 -0800 (PST) X-Google-Smtp-Source: AGHT+IH0HGjg6KDrtu6UudpQdJgQV1hVtYTMzqSHjI7eRjV1BBEK7Nlh8TSInj9AwxzDf/cPul+5sg== X-Received: by 2002:a05:600c:4750:b0:410:2793:4e91 with SMTP id w16-20020a05600c475000b0041027934e91mr2450413wmo.34.1707409726239; Thu, 08 Feb 2024 08:28:46 -0800 (PST) Received: from localhost (185.223.159.143.dyn.plus.net. [143.159.223.185]) by smtp.gmail.com with ESMTPSA id jr1-20020a05600c560100b0040ef718cf81sm2051007wmb.28.2024.02.08.08.28.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 08 Feb 2024 08:28:46 -0800 (PST) From: Andrew Burgess To: gdb-patches@sourceware.org Cc: Andrew Burgess Subject: [PATCH 1/4] gdb/unittests: add extract_string_maybe_quoted self tests Date: Thu, 8 Feb 2024 16:28:40 +0000 Message-Id: <694821ad9a09cabbc3fa3eccda73836fea550773.1707409662.git.aburgess@redhat.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: References: MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org This commit adds some self tests for the extract_string_maybe_quoted function. No functionality is changed in this commit. --- gdb/Makefile.in | 1 + gdb/unittests/extract-string-selftests.c | 85 ++++++++++++++++++++++++ 2 files changed, 86 insertions(+) create mode 100644 gdb/unittests/extract-string-selftests.c diff --git a/gdb/Makefile.in b/gdb/Makefile.in index 0e0f19c40c9..92fe06de489 100644 --- a/gdb/Makefile.in +++ b/gdb/Makefile.in @@ -457,6 +457,7 @@ SELFTESTS_SRCS = \ unittests/copy_bitwise-selftests.c \ unittests/enum-flags-selftests.c \ unittests/environ-selftests.c \ + unittests/extract-string-selftests.c \ unittests/filtered_iterator-selftests.c \ unittests/format_pieces-selftests.c \ unittests/frame_info_ptr-selftests.c \ diff --git a/gdb/unittests/extract-string-selftests.c b/gdb/unittests/extract-string-selftests.c new file mode 100644 index 00000000000..c2eedd75342 --- /dev/null +++ b/gdb/unittests/extract-string-selftests.c @@ -0,0 +1,85 @@ +/* Self tests for the function extract_string_maybe_quoted. + + Copyright (C) 2024 Free Software Foundation, Inc. + + This file is part of GDB. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +#include "defs.h" +#include "gdbsupport/selftest.h" + +namespace selftests { +namespace extract_string { + +struct test_def +{ + test_def (const char *input, + const char *output, + size_t offset) + : m_input (input), + m_output (output), + m_offset (offset) + { /* Nothing. */ } + + void run () const + { + const char *tmp = m_input; + std::string test_out = extract_string_maybe_quoted (&tmp); + + if (run_verbose ()) + { + debug_printf ("Input: %s\n", m_input); + debug_printf ("Output [Got]: %s\n", test_out.c_str ()); + debug_printf ("Output [Exp]: %s\n", m_output); + debug_printf ("Rest [Got]: %s\n", tmp); + debug_printf ("Rest [Exp]: %s\n", (m_input + m_offset)); + } + + SELF_CHECK (test_out == m_output); + SELF_CHECK (tmp == m_input + m_offset); + } + +private: + const char *m_input; + const char *m_output; + size_t m_offset; +}; + +test_def tests[] = { + { "abc def", "abc", 3 }, + { "'abc' def", "abc", 5 }, + { "\"abc\" def", "abc", 5 }, + { "ab\\ cd ef", "ab cd", 6 }, + { "\"abc\\\"def\" ghi", "abc\"def", 10 }, + { "\"'abc' 'def'\" ghi", "'abc' 'def'", 13 }, +}; + +static void +run_tests () +{ + for (const auto &test : tests) + test.run (); +} + +} /* namespace extract_string */ +} /* namespace selftests */ + +void _initialize_extract_string_selftest (); +void +_initialize_extract_string_selftest () +{ + selftests::register_test ("extract-string", + selftests::extract_string::run_tests); +} From patchwork Thu Feb 8 16:28:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Burgess X-Patchwork-Id: 85471 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1F275385841C for ; Thu, 8 Feb 2024 16:30:50 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 9BDD53858C78 for ; Thu, 8 Feb 2024 16:28:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 9BDD53858C78 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 9BDD53858C78 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1707409735; cv=none; b=qiq2j42mnIsEqVCoynrqaKC+hkfaK/AbzeeOSmr165hJsPjR2rs5Me1P4Z/0Kk+zbP0FE0FlOQpNOoEY8Nk2cnDRsv9kgrSlBVGVQ3nf770VayTSpoKl97SvjNPwh0Wbqiu1eISkg6LWwcZ7TTgi/A4g1XJxSoXRXVvhhGnDnyo= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1707409735; c=relaxed/simple; bh=rKBSgQjHV4YIgnB+D7BFmvHgI0l3KIWI3mq1uOwuaWc=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=BVdqWP38R5ihzamZwzko3uFr1p6nNpjhBmiX4aSysrkpBnbw78DZh0qrw1lpvIFHeCE+k2gXwF5fppOdmrVrsMAmcPDJPHa1XIUzKYx/Ee/4kZAynFlEyYGlfMAgzVsHbijX5YbmMSew9HaTntigsZcx763/2ro3cG0d19nsBjw= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1707409732; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=g3ZOoc6HZA4iXAcvats/dJ9TcUcIDJvc5UopEA6JBEs=; b=G0bjtLNf0PltDqynaTKXgoNxjiaSDgIVbGSbtKv6aaDuyUVBX72P0zJdYzGcZ3g5TCwNN+ iHQJ3qjejba8lzHBb1t6QEZozhYxs60gaPgcK68f+vVIsV2ogd1NRgUO8QHOnb2YG9xTbd L85k7Yee62jeZrxySYgoD+huFa7uHYM= Received: from mail-wm1-f72.google.com (mail-wm1-f72.google.com [209.85.128.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-213-zfUZEiWNNdqYhqFF2uYNig-1; Thu, 08 Feb 2024 11:28:51 -0500 X-MC-Unique: zfUZEiWNNdqYhqFF2uYNig-1 Received: by mail-wm1-f72.google.com with SMTP id 5b1f17b1804b1-410040e1fcdso368985e9.0 for ; Thu, 08 Feb 2024 08:28:51 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707409729; x=1708014529; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=g3ZOoc6HZA4iXAcvats/dJ9TcUcIDJvc5UopEA6JBEs=; b=mL9DR3v683vVLuj525XXU+9rxy8b2yfsqphddyOjUkrYJL5R/bJMU5TpVQMi/IJcby L9jz5Zn8NXm8ieGdJ5B9AiiA/pAFgAcyPkc6C9A5Lc4KAXw84VOoSBmy7Nalxw2b9aPi zfs63ShLWdeEPt0G5YBJvnRgNmtgi+g1kCp5fTZlPqoJbESnFfnJYB6EnHrrcPhm+aSf SUeNmnNGsFHEIkL9ThWcEuq8RJmqlxYRK1KGPyCWTAY/oCFRDEiu1AjOhPQ2e0Ehp8+k VZmq3t9RjNs0nMe6KQqyuE8OOjveZ2mi5kR/jmuPy0IlC6LXhKHx3SSGpygL764XVyZE wM2w== X-Gm-Message-State: AOJu0Yx5gTCdpVaP0lk713Wt6ggbrnC1v/R+meC7DR+wfg1Qdt/874/z bW5BJxvVb9CFnsseJ1Or9AxiEJFTah9FyX3nZVOLQEzsBcUgJgdI2yxkNdCCzy3lUZcUY9T5cUL DeMFLmGdDy1H8qWx5AgGFcjr4Woh419NcWUrn+blHZxrQ5OgzSj8Pzv/nzAKNcIOhTpRaHAhALv bnwSzGRqBB9MtVc660C3Vl+DRMnRya/0hthchQaDTJaeg= X-Received: by 2002:a05:600c:3153:b0:40f:dd72:d9f with SMTP id h19-20020a05600c315300b0040fdd720d9fmr6940064wmo.39.1707409728832; Thu, 08 Feb 2024 08:28:48 -0800 (PST) X-Google-Smtp-Source: AGHT+IEevmJKWWJwgRz5X7hvMxaA+hHAVIzy2Ru4WmJelhdWk9SOgJ/DxONjjHfniF3hOfwS9PtcmA== X-Received: by 2002:a05:600c:3153:b0:40f:dd72:d9f with SMTP id h19-20020a05600c315300b0040fdd720d9fmr6940041wmo.39.1707409728332; Thu, 08 Feb 2024 08:28:48 -0800 (PST) Received: from localhost (185.223.159.143.dyn.plus.net. [143.159.223.185]) by smtp.gmail.com with ESMTPSA id n38-20020a05600c502600b0040f035bebfcsm2124932wmr.12.2024.02.08.08.28.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 08 Feb 2024 08:28:47 -0800 (PST) From: Andrew Burgess To: gdb-patches@sourceware.org Cc: Andrew Burgess Subject: [PATCH 2/4] gdbsupport: allow for configuration of extract_string_maybe_quoted Date: Thu, 8 Feb 2024 16:28:41 +0000 Message-Id: <499135b7e99641c873df5c4ce419bf4e5a9751f9.1707409662.git.aburgess@redhat.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: References: MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org This function has the extract_string_maybe_quoted function accept an extra parameter, a control structure. This control structure can then be used to modify how extract_string_maybe_quoted operates. What can be controlled exactly? The new argument allows controls over which characters can and can't be escaped within each of three contexts: outside any quotes, within single quotes, and within double quotes. The escaping rules for the current extract_string_maybe_quoted make very little sense to me, we allow every character to be escaped within all three contexts. This doesn't make much sense to me, why do we need to allow so much escaping? In most places in GDB where extract_string_maybe_quoted is used no characters have special meaning, so if I call extract_string_maybe_quoted to extract from this: "Blah\@" Then I'll get back (Blah@) without the parenthesis. But why is the escaping supported? I could just as easily write: "Blah@" In which case I'll get the same (Blah@) result. If I want a result of (Blah\@) then I need to type: "Blah\\@" Now the argument could be used that this is consistent with shell escaping, which is what users will expect. Except it really isn't. Outside of any quotes then yes, any \ sequence will result in even when doesn't have a special meaning, but within double quotes only a limited set of characters need escaping, in all other cases the backslash will be preserved, yet for GDB within double quotes, our escaping is just like the unquoted escaping. And within single quotes no characters can be escaped at a shell, and yet, once again, GDB just uses the unquoted strategy. So I think GDB's approach neither makes sense from a necessity point of view, nor does it make sense from a "like a shell" point of view. My plan, in a later commit, is to reuse extract_string_maybe_quoted as the core of gdb_argv instead of calling into libiberty. To do this I need extract_string_maybe_quoted to behave more "shell like" just as libiberties buildargv function does. And so, by allowing control over which characters can, or can't be quoted, I can retain the current extract_string_maybe_quoted behaviour for the current users, but provide a different control structure for when (later on) I reuse extract_string_maybe_quoted in gdb_argv. I did consider changing the default behaviour of extract_string_maybe_quoted, and I might still propose that in the future. Unfortunately, there is one test (gdb.base/options.exp) that depends on the current extract_string_maybe_quoted behaviour, specifically is relies on being able to escape a single quote character within a single quote context -- at a POSIX shell it is not possible to embed a single quote within a single quoted string. So as this is tested behaviour I'm reluctant to just go changing this. In this commit I've added a new control structure extract_string_ctrl in gdbsupport/common-utils.h, defined a default structure in gdbsupport/common-utils.cc, and this default is used by extract_string_maybe_quoted to maintain the current behaviour. In gdb/unittests/extract-string-selftests.c I've defined a new extract_string_ctrl object which implements shell like behaviour, I've then added some new extract_string_maybe_quoted self tests which check the shell like behaviour. --- gdb/unittests/extract-string-selftests.c | 20 +++- gdbsupport/common-utils.cc | 75 +++++++------ gdbsupport/common-utils.h | 133 ++++++++++++++++++++++- 3 files changed, 191 insertions(+), 37 deletions(-) diff --git a/gdb/unittests/extract-string-selftests.c b/gdb/unittests/extract-string-selftests.c index c2eedd75342..3b9a5bd104c 100644 --- a/gdb/unittests/extract-string-selftests.c +++ b/gdb/unittests/extract-string-selftests.c @@ -23,20 +23,30 @@ namespace selftests { namespace extract_string { +static extract_string_ctrl shell_extract_string_ctrl + (nullptr, "", "\"$`\\", "\n", "", "\n"); + struct test_def { test_def (const char *input, const char *output, - size_t offset) + size_t offset, + extract_string_ctrl *ctrl = nullptr) : m_input (input), m_output (output), - m_offset (offset) + m_offset (offset), + m_ctrl (ctrl) { /* Nothing. */ } void run () const { const char *tmp = m_input; - std::string test_out = extract_string_maybe_quoted (&tmp); + std::string test_out; + + if (m_ctrl == nullptr) + test_out = extract_string_maybe_quoted (&tmp); + else + test_out = extract_string_maybe_quoted (&tmp, *m_ctrl); if (run_verbose ()) { @@ -55,6 +65,7 @@ struct test_def const char *m_input; const char *m_output; size_t m_offset; + extract_string_ctrl *m_ctrl; }; test_def tests[] = { @@ -64,6 +75,9 @@ test_def tests[] = { { "ab\\ cd ef", "ab cd", 6 }, { "\"abc\\\"def\" ghi", "abc\"def", 10 }, { "\"'abc' 'def'\" ghi", "'abc' 'def'", 13 }, + { "'ab\\ cd' ef", "ab\\ cd", 8, &shell_extract_string_ctrl }, + { "ab\\\ncd ef", "abcd", 6, &shell_extract_string_ctrl }, + { "\"ab\\\ncd\" ef", "abcd", 8, &shell_extract_string_ctrl }, }; static void diff --git a/gdbsupport/common-utils.cc b/gdbsupport/common-utils.cc index 99b9cb8609b..15ee2c43f83 100644 --- a/gdbsupport/common-utils.cc +++ b/gdbsupport/common-utils.cc @@ -164,58 +164,67 @@ savestring (const char *ptr, size_t len) return p; } +/* See common-utils.h. */ + +extract_string_ctrl default_extract_string_ctrl (nullptr, nullptr, nullptr); + /* See documentation in common-utils.h. */ std::string -extract_string_maybe_quoted (const char **arg) +extract_string_maybe_quoted (const char **arg, + const extract_string_ctrl &ctrl) { - bool squote = false; - bool dquote = false; - bool bsquote = false; + char quote = '\0'; std::string result; const char *p = *arg; /* Find the start of the argument. */ p = skip_spaces (p); - /* Parse p similarly to gdb_argv buildargv function. */ + /* Parse p similarly to the libiberty buildargv function, but respect + the escaping rules of CTRL. */ while (*p != '\0') { - if (ISSPACE (*p) && !squote && !dquote && !bsquote) + if (ISSPACE (*p) && quote == '\0') break; else { - if (bsquote) - { - bsquote = false; - result += *p; - } - else if (*p == '\\') - bsquote = true; - else if (squote) + if (*p == '\\') { - if (*p == '\'') - squote = false; + /* The character after the backslash. */ + char c = *(p + 1); + + if (ctrl.discard_escaped_char (c, quote)) + { + /* We are discarding the escape character (backslash) and + the character after the escape. This allows us to + emulate POSIX newline handling where an escaped + newline is discarded. */ + ++p; + } + else if (ctrl.is_escaped (c, quote)) + { + /* The character C is escaped. We discard the escape + character (backslash), but do include C. */ + ++p; + result += c; + } else - result += *p; - } - else if (dquote) - { - if (*p == '"') - dquote = false; - else - result += *p; + { + /* We are going to treat the backslash as a literal + character with no special association to the following + character. */ + result += *p; + } } + else if (*p == quote) + quote = '\0'; + else if (quote == '\0' && (*p == '\'' || *p == '"')) + quote = *p; else - { - if (*p == '\'') - squote = true; - else if (*p == '"') - dquote = true; - else - result += *p; - } - p++; + result += *p; + + ++p; } } diff --git a/gdbsupport/common-utils.h b/gdbsupport/common-utils.h index 23cd40c0207..60578cd7560 100644 --- a/gdbsupport/common-utils.h +++ b/gdbsupport/common-utils.h @@ -74,6 +74,135 @@ std::string &string_vappendf (std::string &dest, const char* fmt, va_list args) char *savestring (const char *ptr, size_t len); +/* A control data structure, instances of this class are passed to the + extract_string_maybe_quoted function in order to modify its behaviour. + + This class controls which characters can and can't be quoted within + different context (no quotes, single quotes, or double quotes). */ + +struct extract_string_ctrl +{ + /* Constructor. The CAN_ESCAPE_* arguments are the set of characters + which can be escaped within each context. If any character within + this set appears within the given context with a backslash before it, + then the backslash will be removed and the character retained in the + output. If any of these arguments are nullptr then every character + can be escaped within that context. If you want no character escaping + within a particular context then pass the empty string. + + The DISCARD_* arguments are the set of characters which should be + discarded when escaped within a given context. If any of these + characters appear within the given context with a backslash before + them, then both the backslash and the character will be removed from + the final output. If any of these arguments are nullptr then no + characters will be discarded when escaped within that context. + + During string extraction, if a backslash is encountered in the input, + and the next character is not in the relevant CAN_ESCAPE_* variable, + nor is in the relevant DISCARD_* variable, then the backslash will be + retained in the output. */ + + explicit extract_string_ctrl (const char *can_escape_unquoted, + const char *can_escape_single, + const char *can_escape_double, + const char *discard_unquoted = nullptr, + const char *discard_single = nullptr, + const char *discard_double = nullptr) + : m_can_escape_unquoted (can_escape_unquoted), + m_can_escape_single (can_escape_single), + m_can_escape_double (can_escape_double), + m_discard_unquoted (discard_unquoted), + m_discard_single (discard_single), + m_discard_double (discard_double) + { + /* Nothing. */ + } + + /* Return true if, when character C appears escaped within an input + string, both the backslash escape character, and the character C + should be discarded from the input stream. */ + + bool discard_escaped_char (const char c, const char quote) const + { + const char *char_set = nullptr; + switch (quote) + { + case '\0': + char_set = m_discard_unquoted; + break; + case '\'': + char_set = m_discard_single; + break; + case '"': + char_set = m_discard_double; + break; + default: + gdb_assert_not_reached ("unexpected quote character '%c'", quote); + } + + /* Discard nothing. */ + if (char_set == nullptr) + return false; + + /* If C is in CHAR_SET then discard. */ + return strchr (char_set, c) != nullptr; + } + + /* Return true if, when character C appears escaped within an input + string, we should discard the previous escape character and just + include character C. Otherwise return false. */ + + bool is_escaped (const char c, const char quote) const + { + const char *char_set = nullptr; + switch (quote) + { + case '\0': + char_set = m_can_escape_unquoted; + break; + case '\'': + char_set = m_can_escape_single; + break; + case '"': + char_set = m_can_escape_double; + break; + default: + gdb_assert_not_reached ("unexpected quote character '%c'", quote); + } + + /* When no character set is defined we always return true, this means + that every character can be escaped. */ + if (char_set == nullptr) + return true; + + /* Otherwise, return true if C is in CHAR_SET. */ + return strchr (char_set, c) != nullptr; + } + +private: + /* The set of characters that can be escaped within a particular + context. See the constructor for more information. */ + const char *m_can_escape_unquoted; + const char *m_can_escape_single; + const char *m_can_escape_double; + + /* The set of characters that can be discarded within a particular + context. See the constructor for more information. */ + const char *m_discard_unquoted; + const char *m_discard_single; + const char *m_discard_double; +}; + +/* The default control configuration for extract_string_maybe_quoted. + This is the way it is for backwards compatibility reasons, though it is + unclear if this is actually a good thing or not. Every character in + every context can be escaped. This really makes very little sense as + in most locations in GDB no characters, other than quotes and white + space, actually have any special meaning, so really very little + escaping should be supported. */ + +extern extract_string_ctrl default_extract_string_ctrl; + /* Extract the next word from ARG. The next word is defined as either, everything up to the next space, or, if the next word starts with either a single or double quote, then everything up to the closing quote. The @@ -82,7 +211,9 @@ char *savestring (const char *ptr, size_t len); word, or, for quoted words, the first character after the closing quote. */ -std::string extract_string_maybe_quoted (const char **arg); +std::string extract_string_maybe_quoted + (const char **arg, + const extract_string_ctrl &ctrl = default_extract_string_ctrl); /* The strerror() function can return NULL for errno values that are out of range. Provide a "safe" version that always returns a From patchwork Thu Feb 8 16:28:42 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Burgess X-Patchwork-Id: 85469 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 93B1B385841D for ; Thu, 8 Feb 2024 16:29:48 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 991963858283 for ; Thu, 8 Feb 2024 16:28:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 991963858283 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 991963858283 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1707409737; cv=none; b=SXQ58NWpMxeQ3w4Co9SINqLU1i3zrhQ/A6v/v2ta2z831m010i3haEnalN2yrf/AC054kcGa/QjXMWNEWJUUG5E+NC5KycP5wp0DiW6luaUn6BvcqGYB/MauZpR5xr60J7dGIs6YsZT+U7px4J1c0u9mNyfMP65rZbVsP0hP4KI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1707409737; c=relaxed/simple; bh=leJVwhwHf0QkrpCEkh4x+rBSD8zGONw9imS4+conGCs=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=kzbvoYZkIbu1bN//rfqEZk4bUl0/xU2cvBqVyZN4LkYzlnxdKRvCYjdMrHN4IGWkiRKnw4JDZRCkRsFWSu2GME+FhGOu9T2Id7nNsfKhqoyjZkMiqhu+rWMwxcGDGBBFZJrLV+IT/nkDwNxrspzy5ayYHfqfkK9EVSD98U+tt8o= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1707409735; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3dxKYTmzgWDhHhRE6Hm8TaOEsXB7xQLHg3z41fZrAgY=; b=Xy2ceGUUoB4UxqjxLKeodIHv/Lv6YJOspzXKJ/5AE0teC0kCP75XWtcVL4Nuadq5jOX16N 7nGSdd935+SxVGUm11VJf+St85XSenFju0MS9Tm/nYjMEyTW6Uy3fBl3z0sfivnFCBja0B P683DhwmswlbEzyXwCe2JO9tEoYbegA= Received: from mail-wr1-f69.google.com (mail-wr1-f69.google.com [209.85.221.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-161-xqSB_w5INkybPKi1WaR2bQ-1; Thu, 08 Feb 2024 11:28:52 -0500 X-MC-Unique: xqSB_w5INkybPKi1WaR2bQ-1 Received: by mail-wr1-f69.google.com with SMTP id ffacd0b85a97d-33b1ad7dfe1so1059452f8f.2 for ; Thu, 08 Feb 2024 08:28:52 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707409731; x=1708014531; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=3dxKYTmzgWDhHhRE6Hm8TaOEsXB7xQLHg3z41fZrAgY=; b=CDnRxfMfi3KdFuq+mBKMNRJKvbmddhHLArur+HZPPxZCJ8RX0hLFqWUcY3RxScOvSR yEuU+Okp4woYeH2YWmRO6ip8QiWC8K0VNnciWOZ3ZCsJcnIK/DpljrWaUDsynvPcp77T YKQhjl7WaZHcigCmekIgtFPP3vDJnDqIZKRhestjpvSPqsZjkYn4MjPT8nYsQHWNGMBr 3jNRNBkZ+Ho3fyj9k2uYfR9OqDECk1u47R3LzLfuVAzMH26KSaMv27+acamCX3ZSL0hl AU4uHPvyhGgZYq65ABH1f3tXM/oEd1g4fZTciIv5TfFTyVJY6S2FXll+CRkCU2rxmda/ JIIQ== X-Gm-Message-State: AOJu0Yzsr4AMbrOW3tkU+ZOqfz3XJ9MyFhAZKt6g6RPzAQ4MgDmsyT+I 9GWhmTtvzchGDkxP7w0JSoi1gyIheEg/Yv+sUTJ+QnF0XuiCKOqIlUAhf+zpn7wTa5pHx3L49j3 zzUosYL/wJi885lhX7+KeQuey8Lkh48P1B+GkqYaM72X0yHbhe5eWwadensumUu8NzmATB+DlkL HB04LqAwZqMUCQVrZLI7JMRVPwFgfVHPpgrEMrNDVySHY= X-Received: by 2002:adf:e5cb:0:b0:33b:2464:9d8f with SMTP id a11-20020adfe5cb000000b0033b24649d8fmr5921115wrn.58.1707409731021; Thu, 08 Feb 2024 08:28:51 -0800 (PST) X-Google-Smtp-Source: AGHT+IGTys/zK8QLNgSaoNWw1DKlPbCHDumYvvPEHoiD5zDMAJPOMCktNJOHWvhP7YVhM2WfVDvtmw== X-Received: by 2002:adf:e5cb:0:b0:33b:2464:9d8f with SMTP id a11-20020adfe5cb000000b0033b24649d8fmr5921108wrn.58.1707409730680; Thu, 08 Feb 2024 08:28:50 -0800 (PST) Received: from localhost (185.223.159.143.dyn.plus.net. [143.159.223.185]) by smtp.gmail.com with ESMTPSA id j8-20020a5d6e48000000b0033929310ae4sm3924449wrz.73.2024.02.08.08.28.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 08 Feb 2024 08:28:49 -0800 (PST) From: Andrew Burgess To: gdb-patches@sourceware.org Cc: Andrew Burgess Subject: [PATCH 3/4] gdb/unittests: additional tests for gdb_argv class Date: Thu, 8 Feb 2024 16:28:42 +0000 Message-Id: X-Mailer: git-send-email 2.25.4 In-Reply-To: References: MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org We already have some gdb_argv tests, but these only test the gdb_argv::as_array_view member function. This commit creates a new unittests/gdb_argv-selftests.c file and adds a wider range of gdb_argv tests, this includes covering the existing as_array_view tests, so the existing tests have been deleted from gdb/utils.c. I've also added an additional test to gdb.python/py-cmd.exp which tests calling gdb.string_to_argv with an empty string. The gdb.string_to_argv function is a wrapper around gdb_argv, and I thought the empty string case was worth covering. --- gdb/Makefile.in | 1 + gdb/testsuite/gdb.python/py-cmd.exp | 4 + gdb/unittests/gdb_argv-selftests.c | 131 ++++++++++++++++++++++++++++ gdb/utils.c | 26 ------ 4 files changed, 136 insertions(+), 26 deletions(-) create mode 100644 gdb/unittests/gdb_argv-selftests.c diff --git a/gdb/Makefile.in b/gdb/Makefile.in index 92fe06de489..18dec570389 100644 --- a/gdb/Makefile.in +++ b/gdb/Makefile.in @@ -462,6 +462,7 @@ SELFTESTS_SRCS = \ unittests/format_pieces-selftests.c \ unittests/frame_info_ptr-selftests.c \ unittests/function-view-selftests.c \ + unittests/gdb_argv-selftests.c \ unittests/gdb_tilde_expand-selftests.c \ unittests/gmp-utils-selftests.c \ unittests/intrusive_list-selftests.c \ diff --git a/gdb/testsuite/gdb.python/py-cmd.exp b/gdb/testsuite/gdb.python/py-cmd.exp index 82cb4cb557a..01af475dded 100644 --- a/gdb/testsuite/gdb.python/py-cmd.exp +++ b/gdb/testsuite/gdb.python/py-cmd.exp @@ -142,6 +142,10 @@ gdb_test "python print (gdb.string_to_argv ('1\\ 2 3'))" \ {\['1 2', '3'\]} \ "string_to_argv ('1\\ 2 3')" +gdb_test "python print (gdb.string_to_argv (''))" \ + {\[\]} \ + "string_to_argv ('')" + # Test user-defined python commands. gdb_test_multiline "input simple user-defined command" \ "python" "" \ diff --git a/gdb/unittests/gdb_argv-selftests.c b/gdb/unittests/gdb_argv-selftests.c new file mode 100644 index 00000000000..b6a87254c93 --- /dev/null +++ b/gdb/unittests/gdb_argv-selftests.c @@ -0,0 +1,131 @@ +/* Self tests for the gdb_argv class from gdbsupport. + + Copyright (C) 2024 Free Software Foundation, Inc. + + This file is part of GDB. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +#include "defs.h" +#include "gdbsupport/selftest.h" +#include "gdbsupport/buildargv.h" + +namespace selftests { +namespace gdb_argv_tests { + +static extract_string_ctrl shell_extract_string_ctrl + (nullptr, "", "\"$`\\", "\n", "", "\n"); + +struct test_def +{ + /* INPUT is the string to pass to the gdb_argv constructor. OUTPUT is + the vector of arguments that gdb_argv should split INPUT into. */ + + test_def (const char *input, + std::vector output) + : m_input (input), + m_output (output) + { /* Nothing. */ } + + /* Run this test and check the results. Throws an exception if the test + fails. */ + + void run () const + { + gdb_argv argv (m_input); + int count = argv.count (); + + if (run_verbose ()) + { + debug_printf ("Input: %s\n", m_input); + debug_printf ("Output [Count]: %d\n", count); + } + + SELF_CHECK (count == m_output.size ()); + + gdb::array_view view = argv.as_array_view (); + SELF_CHECK (view.size () == count); + + const char *const *data = argv.get (); + for (int i = 0; i < count; ++i) + { + const char *a = argv[i]; + SELF_CHECK (view[i] == a); + SELF_CHECK (strcmp (a, m_output[i]) == 0); + SELF_CHECK (a == *data); + ++data; + } + + SELF_CHECK (*data == nullptr); + } + +private: + /* What to pass to gdb_argv constructor. */ + + const char *m_input; + + /* The expected contents of gdb_argv after construction. */ + + std::vector m_output; +}; + +/* The array of all tests. */ + +test_def tests[] = { + { "abc def", {"abc", "def"} }, + { "one two 3", {"one", "two", "3"} }, + { "one two\\ three", {"one", "two three"} }, + { "one\\ two three", {"one two", "three"} }, + { "'one two' three", {"one two", "three"} }, + { "one \"two three\"", {"one", "two three"} }, + { "'\"' \"''\"", {"\"", "''"} }, +}; + +/* Test the creation of an empty gdb_argv object. */ + +static void +empty_argv_tests () +{ + { + gdb_argv argv; + + SELF_CHECK (argv.get () == nullptr); + SELF_CHECK (argv.count () == 0); + + gdb::array_view view = argv.as_array_view (); + + SELF_CHECK (view.data () == nullptr); + SELF_CHECK (view.size () == 0); + } +} + +static void +run_tests () +{ + empty_argv_tests (); + + for (const auto &test : tests) + test.run (); +} + +} /* namespace gdb_argv_tests */ +} /* namespace selftests */ + +void _initialize_gdb_argv_selftest (); +void +_initialize_gdb_argv_selftest () +{ + selftests::register_test ("gdb_argv", + selftests::gdb_argv_tests::run_tests); +} diff --git a/gdb/utils.c b/gdb/utils.c index 702c3f15826..05fbbdc2b63 100644 --- a/gdb/utils.c +++ b/gdb/utils.c @@ -3294,31 +3294,6 @@ gdb_realpath_tests () gdb_realpath_check_trailer ("", ""); } -/* Test the gdb_argv::as_array_view method. */ - -static void -gdb_argv_as_array_view_test () -{ - { - gdb_argv argv; - - gdb::array_view view = argv.as_array_view (); - - SELF_CHECK (view.data () == nullptr); - SELF_CHECK (view.size () == 0); - } - { - gdb_argv argv ("une bonne 50"); - - gdb::array_view view = argv.as_array_view (); - - SELF_CHECK (view.size () == 3); - SELF_CHECK (strcmp (view[0], "une") == 0); - SELF_CHECK (strcmp (view[1], "bonne") == 0); - SELF_CHECK (strcmp (view[2], "50") == 0); - } -} - #endif /* GDB_SELF_TEST */ /* Simple, portable version of dirname that does not modify its @@ -3794,7 +3769,6 @@ When set, debugging messages will be marked with seconds and microseconds."), #if GDB_SELF_TEST selftests::register_test ("gdb_realpath", gdb_realpath_tests); - selftests::register_test ("gdb_argv_array_view", gdb_argv_as_array_view_test); selftests::register_test ("strncmp_iw_with_mode", strncmp_iw_with_mode_tests); selftests::register_test ("pager", test_pager); From patchwork Thu Feb 8 16:28:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Burgess X-Patchwork-Id: 85472 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 821CB38582A5 for ; Thu, 8 Feb 2024 16:31:11 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 36ADE3858403 for ; Thu, 8 Feb 2024 16:28:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 36ADE3858403 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 36ADE3858403 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1707409741; cv=none; b=PtlbVk6j8ePR1uk0Qui34lBl+Lzi7hKoFH9C/els16fE/UtTKeprFTRpQJDl+3uI99KUJtvud4duLlTEfecglwV59e/wqMVlqJlMaxJwPPfG16h8jOg/Msl9B26av0gmJXphIRLKJGwXUHDl8Ay3cxr421m6jHD1jttoZ/cFUlA= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1707409741; c=relaxed/simple; bh=INMBhBzHZort4AIjqMgA5bPjfhLuB3pCHBOVRVQxj9s=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=F7iwVY0iel2oqR7Pi3KfITHFn/CRgrPsa7meO3YFWki7MagojBT5+fAAItRN/ynJt67fXimvsjKzBJjnxLsdauJpvCnOAOFj6fL0GVP+LlUonuSEWzu4ABOxQ7rnUzGftg2CA/qFn3tjSmuBG7aMmi5lqzlwo3lvhxIXv+WEcLs= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1707409736; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Rb2FADYk9rpUiHhqBp/o0q+cQlRawv2GQzUW8ssJl/k=; b=fySWAlWH+NnSq78O4sLac4EmEAwANc7c1nPxVDe/4cZg7y7YD6J7xR6euApbGzLB+gBedo mZKwP7UGWhbWNjiw4GNwNlcUtNRJGR7dv385OaA+XsKsrEh4RvYhiRvQeibt12BePsNwf0 qz/9M1dM4kJ8ZZmHeO4ZzKQRbXebzZU= Received: from mail-wm1-f71.google.com (mail-wm1-f71.google.com [209.85.128.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-440-rjhgXl9_P2i8xvcnfCHOKg-1; Thu, 08 Feb 2024 11:28:54 -0500 X-MC-Unique: rjhgXl9_P2i8xvcnfCHOKg-1 Received: by mail-wm1-f71.google.com with SMTP id 5b1f17b1804b1-410040e1fcdso369145e9.0 for ; Thu, 08 Feb 2024 08:28:54 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707409732; x=1708014532; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Rb2FADYk9rpUiHhqBp/o0q+cQlRawv2GQzUW8ssJl/k=; b=Ij5hD+5BNmrMWwMhGRQI8KgLKBPuB3WQeS88YEyojWKC3Q5tpP5W/sFciR0w1aRO9j Jtfu466pCIKDK+RA/jRHWPYwoThLUTVozkEyn8ySqSlC7wzNCSaAjJiz90P7lX/nPs2z sVSB4UHXVeL5D0e6SqNDIsbNmtPWgfYpEyvarOt93eKgnNvmsvUq+BliIy6cMgUgYJAK tQbl6okCtAVFigM9Db/qeCoh5MXSGb7WMDjWBLKD4H/v5jX/SfvAnefMjHBlBbJTplLC YA+7DThaHqqfHCAvpTkZT+jr7YHxeQl6nV8VoOJX6KoUUp+rIUNOxQyRS2zJjd0ZPyV7 au4w== X-Gm-Message-State: AOJu0YwgjJBUFf75wjE+3XZFu/qmVm/wtzvUDfABdV+Kp7Yd9Ka0nS2X PLAZIY2aTPoeIfQqsHG+EMuJIEkF1b6TiK7IAcIpJoXxOFJpHEi6ulU6upamBfc+mkYVLiWchiE uFmv3Zppy3DoDWiyRaUq67VLSoy85KsWAnE5JUVBzyqq9kyWDcn5tbu9+OKzKEMygdCpXVocQkE hHZ4fZUXf4qjmofWg/HzAB8+WDt7gH+Nq84pla/znmAzs= X-Received: by 2002:a05:600c:1550:b0:40f:bd81:e760 with SMTP id f16-20020a05600c155000b0040fbd81e760mr5659771wmg.40.1707409732419; Thu, 08 Feb 2024 08:28:52 -0800 (PST) X-Google-Smtp-Source: AGHT+IHWPvljStWt1v/wDL6ALVo12m047VwOXN/Gs6N50J6fPY5P9KVeYGEqszZvsT6lgPY2rWdR3Q== X-Received: by 2002:a05:600c:1550:b0:40f:bd81:e760 with SMTP id f16-20020a05600c155000b0040fbd81e760mr5659757wmg.40.1707409731864; Thu, 08 Feb 2024 08:28:51 -0800 (PST) Received: from localhost (185.223.159.143.dyn.plus.net. [143.159.223.185]) by smtp.gmail.com with ESMTPSA id fc14-20020a05600c524e00b004102f16d937sm2116619wmb.17.2024.02.08.08.28.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 08 Feb 2024 08:28:51 -0800 (PST) From: Andrew Burgess To: gdb-patches@sourceware.org Cc: Andrew Burgess Subject: [PATCH 4/4] gdbsupport: rewrite gdb_argv to use extract_string_maybe_quoted Date: Thu, 8 Feb 2024 16:28:43 +0000 Message-Id: <666244da341dd7b2d8621da423868dc02afef2a5.1707409662.git.aburgess@redhat.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: References: MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org This commit rewrites the gdb_argv class to make use of the function extract_string_maybe_quoted. I've moved shell_extract_string_ctrl from the unittests directory into the gdbsupport directory to make it more widely available, and this is used within gdb_argv to split arguments and handle escaping correctly. The algorithm behind this rewrite is pretty simple, for an input string we skip white space, and then call extract_string_maybe_quoted to get the next argument. We then repeat this process to get each subsequent argument. The result is almost identical to the libibery buildargv function except for one edge case which I think is a bug in buildargv. The libibery buildargv function, if passed a string that only contains white space, will return a single empty argument. I think this is wrong, and after this rewrite we now return a zero length argument list. This edge case is tested in gdb_argv-selftests.c and also in gdb.python/py-cmd.exp. --- gdb/testsuite/gdb.python/py-cmd.exp | 4 ++ gdb/unittests/extract-string-selftests.c | 3 -- gdb/unittests/gdb_argv-selftests.c | 3 ++ gdbsupport/buildargv.h | 50 ++++++++++++++++-------- gdbsupport/common-utils.cc | 5 +++ gdbsupport/common-utils.h | 11 ++++++ 6 files changed, 57 insertions(+), 19 deletions(-) diff --git a/gdb/testsuite/gdb.python/py-cmd.exp b/gdb/testsuite/gdb.python/py-cmd.exp index 01af475dded..8b1175d44c8 100644 --- a/gdb/testsuite/gdb.python/py-cmd.exp +++ b/gdb/testsuite/gdb.python/py-cmd.exp @@ -146,6 +146,10 @@ gdb_test "python print (gdb.string_to_argv (''))" \ {\[\]} \ "string_to_argv ('')" +gdb_test "python print (gdb.string_to_argv (' '))" \ + {\[\]} \ + "string_to_argv (' ')" + # Test user-defined python commands. gdb_test_multiline "input simple user-defined command" \ "python" "" \ diff --git a/gdb/unittests/extract-string-selftests.c b/gdb/unittests/extract-string-selftests.c index 3b9a5bd104c..5fd8426ec22 100644 --- a/gdb/unittests/extract-string-selftests.c +++ b/gdb/unittests/extract-string-selftests.c @@ -23,9 +23,6 @@ namespace selftests { namespace extract_string { -static extract_string_ctrl shell_extract_string_ctrl - (nullptr, "", "\"$`\\", "\n", "", "\n"); - struct test_def { test_def (const char *input, diff --git a/gdb/unittests/gdb_argv-selftests.c b/gdb/unittests/gdb_argv-selftests.c index b6a87254c93..b374b9b32ec 100644 --- a/gdb/unittests/gdb_argv-selftests.c +++ b/gdb/unittests/gdb_argv-selftests.c @@ -83,6 +83,9 @@ struct test_def /* The array of all tests. */ test_def tests[] = { + { "", { } }, + { " ", { } }, + { "''", { "" } }, { "abc def", {"abc", "def"} }, { "one two 3", {"one", "two", "3"} }, { "one two\\ three", {"one", "two three"} }, diff --git a/gdbsupport/buildargv.h b/gdbsupport/buildargv.h index 8fdd085b386..8e84affed39 100644 --- a/gdbsupport/buildargv.h +++ b/gdbsupport/buildargv.h @@ -1,4 +1,4 @@ -/* RAII wrapper for buildargv +/* RAII wrapper to split an argument string into individual arguments. Copyright (C) 2021-2024 Free Software Foundation, Inc. @@ -21,9 +21,12 @@ #define GDBSUPPORT_BUILDARGV_H #include "libiberty.h" +#include "gdbsupport/common-utils.h" +#include -/* A wrapper for an array of char* that was allocated in the way that - 'buildargv' does, and should be freed with 'freeargv'. */ +/* A class for splitting a single char* string into an array of char* + arguments, each argument is extracted from the original string. The + resulting array will be freed by this classes destructor. */ class gdb_argv { @@ -32,15 +35,15 @@ class gdb_argv /* A constructor that initializes to NULL. */ gdb_argv () - : m_argv (NULL) + : m_argv (nullptr) { } - /* A constructor that calls buildargv on STR. STR may be NULL, in - which case this object is initialized with a NULL array. */ + /* A constructor that splits STR into an array of arguments. STR may be + NULL, in which case this object is initialized with a NULL array. */ explicit gdb_argv (const char *str) - : m_argv (NULL) + : m_argv (nullptr) { reset (str); } @@ -74,18 +77,34 @@ class gdb_argv freeargv (m_argv); } - /* Call buildargv on STR, storing the result in this object. Any - previous state is freed. STR may be NULL, in which case this - object is reset with a NULL array. If buildargv fails due to - out-of-memory, call malloc_failure. Therefore, the value is - guaranteed to be non-NULL, unless the parameter itself is - NULL. */ + /* Read arguments from STR by calling extract_string_maybe_quoted. + Leading and trailing white space in STR will be ignored. Any previous + argument state is freed. STR may be nullptr, in which case this + object is reset with a nullptr array. */ void reset (const char *str) { - char **argv = buildargv (str); freeargv (m_argv); - m_argv = argv; + m_argv = nullptr; + + if (str == nullptr) + return; + + std::vector tmp_argv; + + str = skip_spaces (str); + while (*str != '\0') + { + std::string arg + = extract_string_maybe_quoted (&str, shell_extract_string_ctrl); + tmp_argv.emplace_back (xstrdup (arg.c_str ())); + str = skip_spaces (str); + } + tmp_argv.emplace_back (nullptr); + + size_t sz = sizeof (decltype (tmp_argv)::value_type) * tmp_argv.size (); + m_argv = (char **) xmalloc (sz); + memcpy (m_argv, tmp_argv.data (), sz); } /* Return the underlying array. */ @@ -197,7 +216,6 @@ class gdb_argv private: /* The wrapped array. */ - char **m_argv; }; diff --git a/gdbsupport/common-utils.cc b/gdbsupport/common-utils.cc index 15ee2c43f83..0fb2b6f350a 100644 --- a/gdbsupport/common-utils.cc +++ b/gdbsupport/common-utils.cc @@ -168,6 +168,11 @@ savestring (const char *ptr, size_t len) extract_string_ctrl default_extract_string_ctrl (nullptr, nullptr, nullptr); +/* See common-utils.h. */ + +extract_string_ctrl shell_extract_string_ctrl (nullptr, "", "\"$`\\", + "\n", "", "\n"); + /* See documentation in common-utils.h. */ std::string diff --git a/gdbsupport/common-utils.h b/gdbsupport/common-utils.h index 60578cd7560..bf603ee04e3 100644 --- a/gdbsupport/common-utils.h +++ b/gdbsupport/common-utils.h @@ -203,6 +203,17 @@ struct extract_string_ctrl extern extract_string_ctrl default_extract_string_ctrl; +/* This control object for use with extract_string_maybe_quoted replicates + shell like behaviour. Every character can be escaped in an unquoted + context, nothing can be escaped within single quotes, and a limited set + of characters (see POSIX shell spec) can be escaped within a double + quoted context. + + An escaped newline will be removed in an unquoted and double quoted + context. */ + +extern extract_string_ctrl shell_extract_string_ctrl; + /* Extract the next word from ARG. The next word is defined as either, everything up to the next space, or, if the next word starts with either a single or double quote, then everything up to the closing quote. The