From patchwork Sat Apr 20 09:10:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Burgess X-Patchwork-Id: 88799 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 8E4BC3858C39 for ; Sat, 20 Apr 2024 09:13:11 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 6805F3858D37 for ; Sat, 20 Apr 2024 09:10:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6805F3858D37 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 6805F3858D37 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1713604242; cv=none; b=nIWJVjQffyy5cXd/CjQSvc8GL/J1CQUUNwWu+d4TUGcpCn4dbspXoqp62alBjM1bzbZvxy16on0usI4pKbxJjrrCeXrwlrGAVB+ZWUn/L9BKaa7X9wXpm6F5sm0LCv1KFTTfdokYX8Jus+VCol8ib04hAZAa7OPUcy3sFdCl8ks= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1713604242; c=relaxed/simple; bh=jRSapzDqKfB+oEWdyfP7MkVZTwI7CgRUtWJMUvCmUAc=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=CAFJ3bXiG3GBXuHubPDVjJ5xBio+PFRH4jvV3Dc4cToNXfoGk5nBAHyZ/quGn4lN7YRVfDKVpvS+Zynakvug9/F2c9fUjwKRyuDrfdjq2SFdXcMrbnBxalzLswFwc8Q/UJXkWEhvu12dXBO1wfyx2wRvWHOtTqGf3PTuzZ0WOcY= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1713604239; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kPHHusjWe4fPJdDnWbLtdxDQqBUn1QVm+ToeOG9RzzU=; b=WEzXCiLM3SEoFaFrVAsFbdb2aOSgaUgJ1GgXD1Jak3+6ezQ3QZZkqC2r/04BV8gP+/yH4b Y4layshfooNsOb+C0SVpzCGUL8+11YiNoiXGzvuT5XOZuAwEWe3HcYFca2FACQlvDpnT+V 9Izfd56aT/WJvST2IjmQYfWxTrEaRWE= Received: from mail-wr1-f72.google.com (mail-wr1-f72.google.com [209.85.221.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-624-U4f5t-jAOdSlRUM6-nBhvQ-1; Sat, 20 Apr 2024 05:10:37 -0400 X-MC-Unique: U4f5t-jAOdSlRUM6-nBhvQ-1 Received: by mail-wr1-f72.google.com with SMTP id ffacd0b85a97d-34954c03af1so1195223f8f.2 for ; Sat, 20 Apr 2024 02:10:37 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1713604235; x=1714209035; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=kPHHusjWe4fPJdDnWbLtdxDQqBUn1QVm+ToeOG9RzzU=; b=fSNBFi6/jPRjEwMyUFcGh7h5v5UwM5Xj4b+kAhp3nU52VbZbGTWIVByqMXvbR6+sA2 v0HKXswwUnM28xVPIWOQNqVmLvIJtLs+vssIu2vrMiYdbPOBA1fqxrqoAz7sC9j2AWL9 +5EPEfSJOPa7hsIygl/7epWDQ/J45yKe2pMelNxLVyDzNvo78ShEWleYcZNiQJgRukxH pYTBplTVb/efR0hMUgn/bbNB5lhdJLxz5fE6nGUqWtZGjgxOJ/yW4IHua7fgyL67iFqA 0TbFr7OxkMbZwkZ8I82qEk/c0P/UVMLJUHl484cyhW3bWeAlrURjYS4HQBQaVQWo0dkD ZMTw== X-Gm-Message-State: AOJu0YxUZ4DD4sOFYItPpPoMQqukz8w4/EmEk3saV5Yvf9qpHxWpM3ze gsL4qR4ZzQwafTjEj0KoCl9VLNeZg2KjAT05YEO5Yig9PoKvXyJ/s8acNTRtvGMIjvXmCpYdxyr MNbjp4PqqVmJfc17+pJlgTXBqfL+9ZGouvexkjcYDiIfDpN1zCOfBYaiqSwCmGoA0YSH3gil5Dt dNRjnpx7oSeBJXBm7LPGm0WYax2LS2e+knLdx+2u/gKh8= X-Received: by 2002:adf:f78f:0:b0:347:d167:1097 with SMTP id q15-20020adff78f000000b00347d1671097mr2638643wrp.31.1713604235134; Sat, 20 Apr 2024 02:10:35 -0700 (PDT) X-Google-Smtp-Source: AGHT+IF7KrdxXVQz9huzMQHD+UvEEKif0LvoaPNaA3LQqybebHwkge66Lak/L4baNzoPqr6M45WjrA== X-Received: by 2002:adf:f78f:0:b0:347:d167:1097 with SMTP id q15-20020adff78f000000b00347d1671097mr2638621wrp.31.1713604234200; Sat, 20 Apr 2024 02:10:34 -0700 (PDT) Received: from localhost (92.40.185.25.threembb.co.uk. [92.40.185.25]) by smtp.gmail.com with ESMTPSA id z2-20020adff742000000b00343ca138924sm6416988wrp.39.2024.04.20.02.10.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 20 Apr 2024 02:10:33 -0700 (PDT) From: Andrew Burgess To: gdb-patches@sourceware.org Cc: Andrew Burgess , Lancelot SIX , Eli Zaretskii Subject: [PATCHv2 8/8] gdb: improve gdb_rl_find_completion_word for quoted words Date: Sat, 20 Apr 2024 10:10:08 +0100 Message-Id: <1b691af28cbfb31ce744d99f2f571cf8a492a620.1713603416.git.aburgess@redhat.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: References: MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_ABUSEAT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, RCVD_IN_SBL_CSS, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org The function gdb_rl_find_completion_word is very similar to the readline function _rl_find_completion_word, but was either an older version of that function, or was trimmed when copying to remove code which was considered unnecessary. We maintain this copy because the _rl_find_completion_word function is not part of the public readline API, and we need to replicate the functionality of that function as part of the 'complete' command. Within gdb_rl_find_completion_word when looking for the completion word, if we don't find a unclosed quoted string (which would become the completion word) then we scan backwards looking for a word break character. For example, given: (gdb) complete file /tmp/foo There is no unclosed quoted string so we end up scanning backwards from the end looking for a word break character. In this case the space after 'file' and before '/tmp/foo' is found, so '/tmp/foo' becomes the completion word. However, given this: (gdb) complete file /tmp/foo\" There is still no unclosed quoted string, however, when we can backwards the '"' (double quotes) are treated as a word break character, and so we end up using the empty string as the completion word. The readline function _rl_find_completion_word avoids this mistake by using the rl_char_is_quoted_p hook. This function will return true for the double quote character as it is preceded by a backslash. An earlier commit in this series supplied a rl_char_is_quoted_p function for the filename completion case, however, gdb_rl_find_completion_word doesn't call rl_char_is_quoted_p so this doesn't help for the 'complete' case. In this commit I've copied the code to call rl_char_is_quoted_p from _rl_find_completion_word into gdb_rl_find_completion_word. This half solves the problem. In the case: (gdb) complete file /tmp/foo\" We do now try to complete on the string '/tmp/foo\"', however, when we reach filename_completer we call back into readline to actually perform filename completion. However, at this point the WORD variable points to a string that still contains the backslash. The backslash isn't part of the actual filename, that's just an escape character. Our expectation is that readline will remove the backslash when looking for matching filenames. However, readline contains an optimisation to avoid unnecessary work trying to remove escape characters. The readline variable rl_completion_found_quote is set in the readline function gen_completion_matches before the generation of completion matches. This variable is set to true (non-zero) if there is (or might be) escape characters within the completion word. The function rl_filename_completion_function, which generates the filename matches, only removes escape characters when rl_completion_found_quote is true. When GDB generates completions through readline (e.g. tab completion) then rl_completion_found_quote is set correctly. But when we use the 'complete' command we don't pass through readline, and so gen_completion_matches is never called and rl_completion_found_quote is not set. In this case when we call rl_filename_completion_function readline doesn't remove the escapes from the completion word, and so in our case above, readline looks for completions of the exact filename '/tmp/foo\"', that is, the filename including the backslash. To work around this problem I've added a new flag to our function gdb_rl_find_completion_word which is set true when we find any quoting or escaping. This matches what readline does. Then in the 'complete' function we can set rl_completion_found_quote prior to generating completion matches. With this done the 'complete' command now works correctly when trying to complete filenames that contain escaped word break characters. The tests have been updated accordingly. --- gdb/completer.c | 60 +++++++++++++++---- .../gdb.base/filename-completion.exp | 12 ++-- 2 files changed, 54 insertions(+), 18 deletions(-) diff --git a/gdb/completer.c b/gdb/completer.c index 9c14e6a10ce..6a3428e2bbd 100644 --- a/gdb/completer.c +++ b/gdb/completer.c @@ -50,7 +50,8 @@ /* Forward declarations. */ static const char *completion_find_completion_word (completion_tracker &tracker, const char *text, - int *quote_char); + int *quote_char, + bool *found_any_quoting); static void set_rl_completer_word_break_characters (const char *break_chars); @@ -528,7 +529,9 @@ filename_completer_handle_brkchars boundaries of the current word. QC, if non-null, is set to the opening quote character if we found an unclosed quoted substring, '\0' otherwise. DP, if non-null, is set to the value of the - delimiter character that caused a word break. */ + delimiter character that caused a word break. FOUND_ANY_QUOTING, if + non-null, is set to true if we found any quote characters (single or + double quotes, or a backslash) while finding the completion word. */ struct gdb_rl_completion_word_info { @@ -539,7 +542,7 @@ struct gdb_rl_completion_word_info static const char * gdb_rl_find_completion_word (struct gdb_rl_completion_word_info *info, - int *qc, int *dp, + int *qc, int *dp, bool *found_any_quoting, const char *line_buffer) { int scan, end, delimiter, pass_next, isbrk; @@ -561,6 +564,7 @@ gdb_rl_find_completion_word (struct gdb_rl_completion_word_info *info, end = point; delimiter = 0; quote_char = '\0'; + bool found_quote = false; brkchars = info->word_break_characters; @@ -586,6 +590,7 @@ gdb_rl_find_completion_word (struct gdb_rl_completion_word_info *info, if (quote_char != '\'' && line_buffer[scan] == '\\') { pass_next = 1; + found_quote = true; continue; } @@ -606,6 +611,7 @@ gdb_rl_find_completion_word (struct gdb_rl_completion_word_info *info, /* Found start of a quoted substring. */ quote_char = line_buffer[scan]; point = scan + 1; + found_quote = true; } } } @@ -619,8 +625,22 @@ gdb_rl_find_completion_word (struct gdb_rl_completion_word_info *info, { scan = line_buffer[point]; - if (strchr (brkchars, scan) != 0) - break; + if (strchr (brkchars, scan) == 0) + continue; + + /* Call the application-specific function to tell us whether + this word break character is quoted and should be skipped. + The const_cast is needed here to comply with the readline + API. The only function we register for rl_char_is_quoted_p + treats the input buffer as 'const', so we're OK. */ + if (rl_char_is_quoted_p != nullptr && found_quote + && (*rl_char_is_quoted_p) (const_cast (line_buffer), + point)) + continue; + + /* Convoluted code, but it avoids an n^2 algorithm with calls + to char_is_quoted. */ + break; } } @@ -644,6 +664,8 @@ gdb_rl_find_completion_word (struct gdb_rl_completion_word_info *info, } } + if (found_any_quoting != nullptr) + *found_any_quoting = found_quote; if (qc != NULL) *qc = quote_char; if (dp != NULL) @@ -670,7 +692,7 @@ advance_to_completion_word (completion_tracker &tracker, int delimiter; const char *start - = gdb_rl_find_completion_word (&info, NULL, &delimiter, text); + = gdb_rl_find_completion_word (&info, nullptr, &delimiter, nullptr, text); tracker.advance_custom_word_point_by (start - text); @@ -732,7 +754,8 @@ complete_nested_command_line (completion_tracker &tracker, const char *text) int quote_char = '\0'; const char *word = completion_find_completion_word (tracker, text, - "e_char); + "e_char, + nullptr); if (tracker.use_custom_word_point ()) { @@ -1950,8 +1973,11 @@ complete (const char *line, char const **word, int *quote_char) try { + bool found_any_quoting = false; + *word = completion_find_completion_word (tracker_handle_brkchars, - line, quote_char); + line, quote_char, + &found_any_quoting); /* Completers that provide a custom word point in the handle_brkchars phase also compute their completions then. @@ -1961,6 +1987,12 @@ complete (const char *line, char const **word, int *quote_char) tracker = &tracker_handle_brkchars; else { + /* Setting this global matches what readline does within + gen_completion_matches. We need this set correctly in case + our completion function calls back into readline to perform + completion (e.g. filename_completer does this). */ + rl_completion_found_quote = found_any_quoting; + complete_line (tracker_handle_completions, *word, line, strlen (line)); tracker = &tracker_handle_completions; } @@ -2235,7 +2267,7 @@ gdb_completion_word_break_characters () noexcept static const char * completion_find_completion_word (completion_tracker &tracker, const char *text, - int *quote_char) + int *quote_char, bool *found_any_quoting) { size_t point = strlen (text); @@ -2245,6 +2277,13 @@ completion_find_completion_word (completion_tracker &tracker, const char *text, { gdb_assert (tracker.custom_word_point () > 0); *quote_char = tracker.quote_char (); + /* This isn't really correct, we're ignoring the case where we found + a backslash escaping a character. However, this isn't an issue + right now as we only rely on *FOUND_ANY_QUOTING being set when + performing filename completion, which doesn't go through this + path. */ + if (found_any_quoting != nullptr) + *found_any_quoting = *quote_char != '\0'; return text + tracker.custom_word_point (); } @@ -2254,7 +2293,8 @@ completion_find_completion_word (completion_tracker &tracker, const char *text, info.quote_characters = rl_completer_quote_characters; info.basic_quote_characters = rl_basic_quote_characters; - return gdb_rl_find_completion_word (&info, quote_char, NULL, text); + return gdb_rl_find_completion_word (&info, quote_char, nullptr, + found_any_quoting, text); } /* See completer.h. */ diff --git a/gdb/testsuite/gdb.base/filename-completion.exp b/gdb/testsuite/gdb.base/filename-completion.exp index 3ded82431c8..78f85ace8a1 100644 --- a/gdb/testsuite/gdb.base/filename-completion.exp +++ b/gdb/testsuite/gdb.base/filename-completion.exp @@ -167,10 +167,8 @@ proc run_quoting_and_escaping_tests { root } { if { $qc eq "'" } { set dq "\"" - set dq_re "\"" } else { set dq "\\\"" - set dq_re "\\\\\"" } test_gdb_complete_filename_multiple "file ${qc}${root}/bb2/" \ @@ -194,8 +192,8 @@ proc run_quoting_and_escaping_tests { root } { } "" "${qc}" false \ "expand filenames containing quotes" - test_gdb_complete_tab_unique "file ${qc}${root}/bb1/aa${dq}" \ - "file ${qc}${root}/bb1/aa${dq_re}bb${qc}" " " \ + test_gdb_complete_unique "file ${qc}${root}/bb1/aa${dq}" \ + "file ${qc}${root}/bb1/aa${dq}bb${qc}" " " false \ "expand unique filename containing double quotes" # It is not possible to include a single quote character @@ -206,14 +204,12 @@ proc run_quoting_and_escaping_tests { root } { if { $qc ne "'" } { if { $qc eq "" } { set sq "\\'" - set sq_re "\\\\'" } else { set sq "'" - set sq_re "'" } - test_gdb_complete_tab_unique "file ${qc}${root}/bb1/aa${sq}" \ - "file ${qc}${root}/bb1/aa${sq_re}bb${qc}" " " \ + test_gdb_complete_unique "file ${qc}${root}/bb1/aa${sq}" \ + "file ${qc}${root}/bb1/aa${sq}bb${qc}" " " false \ "expand unique filename containing single quote" } }