From patchwork Wed May 24 14:10:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Burgess X-Patchwork-Id: 69957 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A8F1D3839069 for ; Wed, 24 May 2023 14:11:50 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org A8F1D3839069 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1684937510; bh=jpEyo0g0fgc7RI3BXQbBQcXg5idh0gi8gfJNNeau7vw=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=KXb0341+9dndG1pyBdg9jnmvS6Ad20GJ60eD2m159JcByYpdNNzAmab+Q0N1MpLwg F2+3MNUo1e68FxVhqy/I2VYSNHjinQhUC29HFRQLAWtNq8NuWyRey6Npxsn3RibB/+ sw8ZRlvzn5Vl+mTXnY3czQ+1VUW2uMkmoMQqaEkg= X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id EA95E385734A for ; Wed, 24 May 2023 14:10:41 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org EA95E385734A Received: from mail-wm1-f71.google.com (mail-wm1-f71.google.com [209.85.128.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-611-i7rDbK-CPwGkf9gAaSD_Ag-1; Wed, 24 May 2023 10:10:40 -0400 X-MC-Unique: i7rDbK-CPwGkf9gAaSD_Ag-1 Received: by mail-wm1-f71.google.com with SMTP id 5b1f17b1804b1-3f612a1b0fdso5176945e9.2 for ; Wed, 24 May 2023 07:10:39 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684937438; x=1687529438; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=jpEyo0g0fgc7RI3BXQbBQcXg5idh0gi8gfJNNeau7vw=; b=AmkNc2S8AluTx7lxKiyW/1sSpgFE88MXOr2of6cLdB3tXMBMHb9vBRLJNzonG+BWES 4UTRl7OtOCsFhOyOCTvkxW/58y1QtA48S72pibt48Fe5AyQpEb7Jd3fKizjyDQxnPLpj RD+elItk0x8z1qtxYch4ZhPAE96fmSs4W1OyMeyO04+3zJG6G9C7FEU8nY/IvEJWVSAy XmHglSkmiC2d5n8uWcn9j3qpbWGKqqFji45Vck47LI6MtL2RNHbZY76mC92FBImOSi6R 6mYwjGTnTZIBK/an5b4fN34+MSQqHhECBRusQNNVkPDBYxllvwJHJnHKh/AnPRMlE1kW Dofw== X-Gm-Message-State: AC+VfDzpBBW52iLLwwOUMiPr4idgji3FkxJMug3FQmKpvtVqpmEjps97 mOShZQxIieQRmV0Zs9r0yLBH3qcxktsBCLTAPO3HVJojvu6Gper8ZgvDmTvKxwGGV9icbjGL2oD Zr7gXguc3kI/gucsI4HNhQuLqZKrq7PzJ628tDgTHuZMe6ZKV8rvM62fiFbjXoM4faJfdp/htEO j9UPKgbg== X-Received: by 2002:a05:600c:364b:b0:3f6:c7b:d3c8 with SMTP id y11-20020a05600c364b00b003f60c7bd3c8mr4324186wmq.16.1684937438132; Wed, 24 May 2023 07:10:38 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7Ija/8R44NdcUDNwmbCuAgVOBYPADIBkhSAOeoYdbiFm3dFHh21FRPbR/D17OTX46iQKpbZQ== X-Received: by 2002:a05:600c:364b:b0:3f6:c7b:d3c8 with SMTP id y11-20020a05600c364b00b003f60c7bd3c8mr4324146wmq.16.1684937437197; Wed, 24 May 2023 07:10:37 -0700 (PDT) Received: from localhost (11.72.115.87.dyn.plus.net. [87.115.72.11]) by smtp.gmail.com with ESMTPSA id v18-20020a05600c215200b003f42ceb3bf4sm2486214wml.32.2023.05.24.07.10.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 24 May 2023 07:10:36 -0700 (PDT) To: gdb-patches@sourceware.org Cc: Pedro Alves , Simon Marchi , Andrew Burgess , Simon Marchi Subject: [PATCHv3] gdb: building inferior strings from within GDB Date: Wed, 24 May 2023 15:10:32 +0100 Message-Id: X-Mailer: git-send-email 2.25.4 In-Reply-To: <75628df1fbd72166504c9b2a368170121b86e072.1680849242.git.aburgess@redhat.com> References: <75628df1fbd72166504c9b2a368170121b86e072.1680849242.git.aburgess@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Andrew Burgess via Gdb-patches From: Andrew Burgess Reply-To: Andrew Burgess Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org Sender: "Gdb-patches" History Of This Patch ===================== This commit aims to address PR gdb/21699. There have now been a couple of attempts to fix this issue. Simon originally posted two patches back in 2021: https://sourceware.org/pipermail/gdb-patches/2021-July/180894.html https://sourceware.org/pipermail/gdb-patches/2021-July/180896.html Before Pedro then posted a version of his own: https://sourceware.org/pipermail/gdb-patches/2021-July/180970.html After this the conversation halted. Then in 2023 I (Andrew) also took a look at this bug and posted two versions: https://sourceware.org/pipermail/gdb-patches/2023-April/198570.html https://sourceware.org/pipermail/gdb-patches/2023-April/198680.html The approach taken in my first patch was pretty similar to what Simon originally posted back in 2021. My second attempt was only a slight variation on the first. Pedro then pointed out his older patch, and so we arrive at this patch. The GDB changes here are mostly Pedro's work, but updated by me (Andrew), any mistakes are mine. The tests here are a combinations of everyone's work, and the commit message is new, but copies bits from everyone's earlier work. Problem Description =================== Bug PR gdb/21699 makes the observation that using $_as_string with GDB's printf can cause GDB to print unexpected data from the inferior. The reproducer is pretty simple: #include static char arena[100]; /* Override malloc() so value_coerce_to_target() gets a known pointer, and we know we"ll see an error if $_as_string() gives a string that isn't null terminated. */ void *malloc (size_t size) { memset (arena, 'x', sizeof (arena)); if (size > sizeof (arena)) return NULL; return arena; } int main () { return 0; } And then in a GDB session: $ gdb -q test Reading symbols from /tmp/test... (gdb) start Temporary breakpoint 1 at 0x4004c8: file test.c, line 17. Starting program: /tmp/test Temporary breakpoint 1, main () at test.c:17 17 return 0; (gdb) printf "%s\n", $_as_string("hello") "hello"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx (gdb) quit The problem above is caused by how value_cstring is used within py-value.c, but once we understand the issue then it turns out that value_cstring is used in an unexpected way in many places within GDB. Within py-value.c we have a null-terminated C-style string. We then pass a pointer to this string, along with the length of this string (so not including the null-character) to value_cstring. In value_cstring GDB allocates an array value of the given character type, and copies in requested number of characters. However value_cstring does not add a null-character of its own. This means that the value created by calling value_cstring is only null-terminated if the null-character is included in the passed in length. In py-value.c this is not the case, and indeed, in most uses of value_cstring, this is not the case. When GDB tries to print one of these strings the value contents are pushed to the inferior, and then read back as a C-style string, that is, GDB reads inferior memory until it finds a null-terminator. For the py-value.c case, no null-terminator is pushed into the inferior, so GDB will continue reading inferior memory until a null-terminator is found, with unpredictable results. Patch Description ================= The first thing this patch does is better define what the arguments for the two function value_cstring and value_string should represent. The comments in the header file are updated to describe whether the length argument should, or should not, include a null-character. Also, the data argument is changed to type gdb_byte. The functions as they currently exist will handle wide-characters, in which case more than one 'char' would be needed for each character. As such using gdb_byte seems to make more sense. To avoid adding casts throughout GDB, I've also added an overload that still takes a 'char *', but asserts that the character type being used is of size '1'. The value_cstring function is now responsible for adding a null character at the end of the string value it creates. However, once we start looking at how value_cstring is used, we realise there's another, related, problem. Not every language's strings are null terminated. Fortran and Ada strings, for example, are just an array of characters, GDB already has the function value_string which can be used to create such values. Consider this example using current GDB: (gdb) set language ada (gdb) p $_gdb_setting("arch") $1 = (97, 117, 116, 111) (gdb) ptype $ type = array (1 .. 4) of char (gdb) p $_gdb_maint_setting("test-settings string") $2 = (0) (gdb) ptype $ type = array (1 .. 1) of char This shows two problems, first, the $_gdb_setting and $_gdb_maint_setting functions are calling value_cstring using the builtin_char character, rather than a language appropriate type. In the first call, the 'arch' case, the value_cstring call doesn't include the null character, so the returned array only contains the expected characters. But, in the $_gdb_maint_setting example we do end up including the null-character, even though this is not expected for Ada strings. This commit adds a new language method language_defn::value_string, this function takes a pointer and length and creates a language appropriate value that represents the string. For C, C++, etc this will be a null-terminated string (by calling value_cstring), and for Fortran and Ada this can be a bounded array of characters with no null terminator. Additionally, this new language_defn::value_string function is responsible for selecting a language appropriate character type. After this commit the only calls to value_cstring are from the C expression evaluator and from the default language_defn::value_string. And the only calls to value_string are from Fortan, Ada, and ObjectC related code. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=21699 Co-Authored-By: Simon Marchi Co-Authored-By: Andrew Burgess Co-Authored-By: Pedro Alves --- gdb/ada-lang.c | 13 + gdb/c-lang.c | 14 +- gdb/cli/cli-cmds.c | 16 +- gdb/f-lang.c | 10 + gdb/f-lang.h | 5 + gdb/guile/scm-math.c | 4 +- gdb/language.c | 10 + gdb/language.h | 6 + gdb/python/py-value.c | 8 +- .../gdb.base/internal-string-values.c | 32 ++ .../gdb.base/internal-string-values.exp | 279 ++++++++++++++++++ .../gdb.base/print-internal-string.c | 56 ++++ .../gdb.base/print-internal-string.exp | 64 ++++ gdb/testsuite/gdb.base/settings.exp | 2 +- gdb/testsuite/gdb.python/py-mi.exp | 2 +- gdb/valops.c | 23 +- gdb/value.c | 5 +- gdb/value.h | 41 ++- 18 files changed, 546 insertions(+), 44 deletions(-) create mode 100644 gdb/testsuite/gdb.base/internal-string-values.c create mode 100644 gdb/testsuite/gdb.base/internal-string-values.exp create mode 100644 gdb/testsuite/gdb.base/print-internal-string.c create mode 100644 gdb/testsuite/gdb.base/print-internal-string.exp base-commit: e84060b489746d031ed1ec9e7b6b39fdf4b6cfe3 diff --git a/gdb/ada-lang.c b/gdb/ada-lang.c index 21f3348a161..567db6ddab7 100644 --- a/gdb/ada-lang.c +++ b/gdb/ada-lang.c @@ -13526,6 +13526,19 @@ class ada_language : public language_defn return symbol->is_artificial (); } + /* See language.h. */ + struct value *value_string (struct gdbarch *gdbarch, + const char *ptr, ssize_t len) const override + { + struct type *type = language_string_char_type (this, gdbarch); + value *val = ::value_string (ptr, len, type); + /* VAL will be a TYPE_CODE_STRING, but Ada only knows how to print + strings that are arrays of characters, so fix the type now. */ + gdb_assert (val->type ()->code () == TYPE_CODE_STRING); + val->type ()->set_code (TYPE_CODE_ARRAY); + return val; + } + /* See language.h. */ void language_arch_info (struct gdbarch *gdbarch, struct language_arch_info *lai) const override diff --git a/gdb/c-lang.c b/gdb/c-lang.c index 484f81e455b..b71457bd25f 100644 --- a/gdb/c-lang.c +++ b/gdb/c-lang.c @@ -652,16 +652,11 @@ c_string_operation::evaluate (struct type *expect_type, } else { - int i; - - /* Write the terminating character. */ - for (i = 0; i < type->length (); ++i) - obstack_1grow (&output, 0); + int element_size = type->length (); if (satisfy_expected) { LONGEST low_bound, high_bound; - int element_size = type->length (); if (!get_discrete_bounds (expect_type->index_type (), &low_bound, &high_bound)) @@ -676,10 +671,13 @@ c_string_operation::evaluate (struct type *expect_type, result = value::allocate (expect_type); memcpy (result->contents_raw ().data (), obstack_base (&output), obstack_object_size (&output)); + /* Write the terminating character. */ + memset (result->contents_raw ().data () + obstack_object_size (&output), + 0, element_size); } else - result = value_cstring ((const char *) obstack_base (&output), - obstack_object_size (&output), + result = value_cstring ((const gdb_byte *) obstack_base (&output), + obstack_object_size (&output) / element_size, type); } return result; diff --git a/gdb/cli/cli-cmds.c b/gdb/cli/cli-cmds.c index b7b65303a0b..3fa833d596c 100644 --- a/gdb/cli/cli-cmds.c +++ b/gdb/cli/cli-cmds.c @@ -2316,11 +2316,9 @@ value_from_setting (const setting &var, struct gdbarch *gdbarch) } if (len > 0) - return value_cstring (value, len, - builtin_type (gdbarch)->builtin_char); + return current_language->value_string (gdbarch, value, len); else - return value_cstring ("", 1, - builtin_type (gdbarch)->builtin_char); + return current_language->value_string (gdbarch, "", 0); } default: gdb_assert_not_reached ("bad var_type"); @@ -2372,8 +2370,8 @@ str_value_from_setting (const setting &var, struct gdbarch *gdbarch) { std::string cmd_val = get_setshow_command_value_string (var); - return value_cstring (cmd_val.c_str (), cmd_val.size (), - builtin_type (gdbarch)->builtin_char); + return current_language->value_string (gdbarch, cmd_val.c_str (), + cmd_val.size ()); } case var_string: @@ -2401,11 +2399,9 @@ str_value_from_setting (const setting &var, struct gdbarch *gdbarch) } if (len > 0) - return value_cstring (value, len, - builtin_type (gdbarch)->builtin_char); + return current_language->value_string (gdbarch, value, len); else - return value_cstring ("", 1, - builtin_type (gdbarch)->builtin_char); + return current_language->value_string (gdbarch, "", 0); } default: gdb_assert_not_reached ("bad var_type"); diff --git a/gdb/f-lang.c b/gdb/f-lang.c index 365e0c0489b..7ab2a7b0688 100644 --- a/gdb/f-lang.c +++ b/gdb/f-lang.c @@ -101,6 +101,16 @@ f_language::get_encoding (struct type *type) return encoding; } +/* See language.h. */ + +struct value * +f_language::value_string (struct gdbarch *gdbarch, + const char *ptr, ssize_t len) const +{ + struct type *type = language_string_char_type (this, gdbarch); + return ::value_string (ptr, len, type); +} + /* A helper function for the "bound" intrinsics that checks that TYPE is an array. LBOUND_P is true for lower bound; this is used for the error message, if any. */ diff --git a/gdb/f-lang.h b/gdb/f-lang.h index 673e273d31a..0a29380d31d 100644 --- a/gdb/f-lang.h +++ b/gdb/f-lang.h @@ -203,6 +203,11 @@ class f_language : public language_defn /* See language.h. */ + struct value *value_string (struct gdbarch *gdbarch, + const char *ptr, ssize_t len) const override; + + /* See language.h. */ + const char *struct_too_deep_ellipsis () const override { return "(...)"; } diff --git a/gdb/guile/scm-math.c b/gdb/guile/scm-math.c index 49fe93b97f8..32595cf0a68 100644 --- a/gdb/guile/scm-math.c +++ b/gdb/guile/scm-math.c @@ -803,9 +803,7 @@ vlscm_convert_typed_value_from_scheme (const char *func_name, 0 /*non-strict*/, &except_scm); if (s != NULL) - value = value_cstring (s.get (), len, - language_string_char_type (language, - gdbarch)); + value = language->value_string (gdbarch, s.get (), len); else value = NULL; } diff --git a/gdb/language.c b/gdb/language.c index f82c5b173b3..c768971be42 100644 --- a/gdb/language.c +++ b/gdb/language.c @@ -874,6 +874,16 @@ language_string_char_type (const struct language_defn *la, /* See language.h. */ +struct value * +language_defn::value_string (struct gdbarch *gdbarch, + const char *ptr, ssize_t len) const +{ + struct type *type = language_string_char_type (this, gdbarch); + return value_cstring (ptr, len, type); +} + +/* See language.h. */ + struct type * language_bool_type (const struct language_defn *la, struct gdbarch *gdbarch) diff --git a/gdb/language.h b/gdb/language.h index 0ebd4c1d957..e4b1e53b279 100644 --- a/gdb/language.h +++ b/gdb/language.h @@ -602,6 +602,12 @@ struct language_defn virtual char string_lower_bound () const { return c_style_arrays_p () ? 0 : 1; } + /* Return the LEN characters long string at PTR as a value suitable for + this language. GDBARCH is used to infer the character type. The + default implementation returns a null-terminated C string. */ + virtual struct value *value_string (struct gdbarch *gdbarch, + const char *ptr, ssize_t len) const; + /* Returns true if the symbols names should be stored in GDB's data structures for minimal/partial/full symbols using their linkage (aka mangled) form; false if the symbol names should be demangled first. diff --git a/gdb/python/py-value.c b/gdb/python/py-value.c index 6c62820c63b..a518965ae46 100644 --- a/gdb/python/py-value.c +++ b/gdb/python/py-value.c @@ -54,9 +54,6 @@ #define builtin_type_pybool \ language_bool_type (current_language, gdbpy_enter::get_gdbarch ()) -#define builtin_type_pychar \ - language_string_char_type (current_language, gdbpy_enter::get_gdbarch ()) - struct value_object { PyObject_HEAD struct value_object *next; @@ -1881,8 +1878,9 @@ convert_value_from_python (PyObject *obj) gdb::unique_xmalloc_ptr s = python_string_to_target_string (obj); if (s != NULL) - value = value_cstring (s.get (), strlen (s.get ()), - builtin_type_pychar); + value + = current_language->value_string (gdbpy_enter::get_gdbarch (), + s.get (), strlen (s.get ())); } else if (PyObject_TypeCheck (obj, &value_object_type)) value = ((value_object *) obj)->value->copy (); diff --git a/gdb/testsuite/gdb.base/internal-string-values.c b/gdb/testsuite/gdb.base/internal-string-values.c new file mode 100644 index 00000000000..e89f71707f7 --- /dev/null +++ b/gdb/testsuite/gdb.base/internal-string-values.c @@ -0,0 +1,32 @@ +/* This testcase is part of GDB, the GNU debugger. + + Copyright 2021-2023 Free Software Foundation, Inc. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +static void +trace_me (void) +{} + +static void +end (void) +{} + +int +main (void) +{ + trace_me (); + end (); + return 0; +} diff --git a/gdb/testsuite/gdb.base/internal-string-values.exp b/gdb/testsuite/gdb.base/internal-string-values.exp new file mode 100644 index 00000000000..ddf38a6f482 --- /dev/null +++ b/gdb/testsuite/gdb.base/internal-string-values.exp @@ -0,0 +1,279 @@ +# Copyright 2021-2023 Free Software Foundation, Inc. + +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +# Test that string values are correctly allocated inside GDB when doing +# various operations that yield strings. +# +# The issue that lead to this test was a missing NULL terminator in the +# C-string values. We verify that we can print the null terminator of these +# strings. + +load_lib "trace-support.exp" +load_lib "gdb-guile.exp" + +standard_testfile + +if {[build_executable "failed to prepare" $testfile $srcfile ]} { + return +} + +set user_conv_funcs {$_gdb_setting $_gdb_setting_str} +set maint_conv_funcs {$_gdb_maint_setting $_gdb_maint_setting_str} + +# Add language (LANG) appropriate quotation marks around string STR. +proc quote_for_lang {lang str} { + if {$lang == "fortran"} { + return "'$str'" + } else { + return "\"$str\"" + } +} + +# Check that the string contained in the convenienced variable $v is +# EXPECTED_STR. +# +# In particular, check that the null terminator is there and that we can't +# access a character past the end of the string. + +proc check_v_string { expected_str } { + set len [string length $expected_str] + + for { set i 0 } { $i < $len } { incr i } { + set c [string index $expected_str $i] + gdb_test "print \$v\[$i\]" "= $::decimal '$c'" + } + + # Check that the string ends with a null terminator. + gdb_test "print \$v\[$i\]" {= 0 '\\000'} + + # Check that we can't access a character after the end of the string. + incr i + gdb_test "print \$v\[$i\]" "no such vector element" +} + +# Test with string values made by $_gdb_setting & co. + +proc_with_prefix test_setting { } { + clean_restart + + # This is an internal GDB implementation detail, but the variable backing + # a string setting starts as nullptr (unless explicitly initialized at + # startup). When assigning an empty value, the variable then points to an + # empty string. Test both cases, as it triggers different code paths (in + # addition to a non-empty value). + # + # Use "set trace-user" and "maintenance set test-settings string" as they + # are both not initialized at startup. + with_test_prefix "user setting" { + with_test_prefix "not set" { + foreach_with_prefix conv_func $::user_conv_funcs { + gdb_test_no_output "set \$v = ${conv_func}(\"trace-user\")" + check_v_string "" + } + } + + with_test_prefix "set to empty" { + gdb_test "set trace-user" + foreach_with_prefix conv_func $::user_conv_funcs { + gdb_test_no_output "set \$v = ${conv_func}(\"trace-user\")" + check_v_string "" + } + } + + with_test_prefix "set" { + gdb_test "set trace-user poulet" + foreach_with_prefix conv_func $::user_conv_funcs { + gdb_test_no_output {set $v = $_gdb_setting("trace-user")} + check_v_string "poulet" + } + } + } + + with_test_prefix "maintenance setting" { + with_test_prefix "not set" { + foreach_with_prefix conv_func $::maint_conv_funcs { + gdb_test_no_output \ + "set \$v = ${conv_func}(\"test-settings string\")" + check_v_string "" + } + } + + with_test_prefix "set to empty" { + gdb_test "maintenance set test-settings string" + foreach_with_prefix conv_func $::maint_conv_funcs { + gdb_test_no_output \ + "set \$v = ${conv_func}(\"test-settings string\")" + check_v_string "" + } + } + + with_test_prefix "set" { + gdb_test "maintenance set test-settings string perchaude" + foreach_with_prefix conv_func $::maint_conv_funcs { + gdb_test_no_output \ + "set \$v = ${conv_func}(\"test-settings string\")" + check_v_string "perchaude" + } + } + } + + # Test with a non-string setting, this tests yet another code path. + with_test_prefix "integer setting" { + gdb_test_no_output {set $v = $_gdb_setting_str("remotetimeout")} + check_v_string "2" + } + + # Test string values made by $_gdb_setting & co. in all languages. + with_test_prefix "all langs" { + # Get list of supported languages. + set langs [gdb_supported_languages] + + gdb_test "maintenance set test-settings string foo" + foreach_with_prefix lang $langs { + gdb_test_no_output "set language $lang" + + if {$lang == "modula-2"} { + # The Modula-2 parser doesn't know how to build a + # suitable string expression. + gdb_test "print \"foo\"" "strings are not implemented" + continue + } + + if {$lang == "rust"} { + # Rust strings are actually structs, without a running + # inferior into which the string data can be pushed + # GDB can't print anything. + gdb_test "print \"foo\"" \ + "evaluation of this expression requires the target program to be active" + gdb_test "print \$_gdb_maint_setting(\"test-settings string\")" \ + "evaluation of this expression requires the target program to be active" + continue + } + + if {$lang == "unknown"} { + # Skipped because expression parsing is not supported + # for the "unknown" language. See gdb/28093 for more + # details. + continue + } + + set print_output "" + set ptype_output "" + + set foo_str [quote_for_lang $lang foo] + gdb_test_multiple "print $foo_str" "" { + -wrap -re " = (.*)" { + set print_output $expect_out(1,string) + pass $gdb_test_name + } + } + + gdb_test_multiple "ptype $foo_str" "" { + -wrap -re " = (.*)" { + set ptype_output $expect_out(1,string) + pass $gdb_test_name + } + } + + set cmd_str [quote_for_lang $lang "test-settings string"] + set ptype_output_re [string_to_regexp $ptype_output] + set print_output_re [string_to_regexp $print_output] + + foreach_with_prefix conv_func $::maint_conv_funcs { + gdb_test "print ${conv_func}($cmd_str)" \ + " = $print_output_re" + gdb_test "ptype \$" \ + " = $ptype_output_re" + } + } + } +} + +# Test with a string value created by gdb.Value in Python. + +proc_with_prefix test_python_value { } { + clean_restart + + if {![allow_python_tests]} { + untested "skipping test_python_value" + return + } + + gdb_test_no_output "python gdb.set_convenience_variable(\"v\", \"bar\")" \ + "set convenience var" + check_v_string "bar" +} + +# Test with a string value created by make-value in Guile. + +proc_with_prefix test_guile_value { } { + clean_restart + + if {![allow_guile_tests]} { + untested "skipping test_guile_value" + return + } + + # We can't set a convenience var from Guile, but we can append to history. + # Do that, then transfer to a convenience var with a CLI command. + gdb_test_no_output "guile (use-modules (gdb))" + gdb_test_multiple "guile (history-append! (make-value \"foo\"))" "make value" { + -re -wrap "($::decimal)" { + set histnum $expect_out(1,string) + } + } + + gdb_test_no_output "set \$v = \$$histnum" + check_v_string "foo" +} + +# Test with a string value coming from a string internal var. The only internal +# vars of this type, at the time of writing, are $trace_func and $trace_file. +# They both require inspecting a trace frame. So if the target is capable start +# tracing, record one trace frame, and use $trace_func. + +proc_with_prefix test_internal_var { } { + if {![gdb_trace_common_supports_arch]} { + unsupported "arch does not support trace" + return + } + + clean_restart $::binfile + + if {![runto_main]} { + fail "could not run to main" + return + } + + if {![gdb_target_supports_trace]} { + unsupported "target does not support trace" + return + } + + gdb_breakpoint "end" + gdb_test "trace trace_me" "Tracepoint $::decimal at $::hex.*" + gdb_test_no_output "tstart" + gdb_continue_to_breakpoint "breakpoint at end" + gdb_test_no_output "tstop" + gdb_test "tfind" "Found trace frame 0, tracepoint $::decimal.*" + gdb_test_no_output "set \$v = \$trace_func" + gdb_test "tfind none" "No longer looking at any trace frame.*" + check_v_string "trace_me" +} + +test_setting +test_python_value +test_guile_value +test_internal_var diff --git a/gdb/testsuite/gdb.base/print-internal-string.c b/gdb/testsuite/gdb.base/print-internal-string.c new file mode 100644 index 00000000000..09dab882ae1 --- /dev/null +++ b/gdb/testsuite/gdb.base/print-internal-string.c @@ -0,0 +1,56 @@ +/* This testcase is part of GDB, the GNU debugger. + + Copyright 2023 Free Software Foundation, Inc. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +#include +#include + +/* A memory area used as the malloc memory buffer. */ + +static char arena[256]; + +/* Override malloc(). When GDB tries to push strings into the inferior we + will always return the same pointer (to arena). This does mean we can't + have multiple strings in use at the same time, but that's fine for this + simple test. On each malloc call the contents of arena are reset, which + should make it more obvious if GDB tried to print memory that it + shouldn't. */ + +void * +malloc (size_t size) +{ + /* Reset the contents of arena, and ensure there's a null-character at + the end just in case GDB tries to print memory that it shouldn't. */ + memset (arena, 'X', sizeof (arena)); + arena [sizeof (arena) - 1] = '\0'; + if (size > sizeof (arena)) + return NULL; + return arena; +} + +/* This function is called from GDB. */ + +void +take_string (const char *str) +{ + /* Nothing. */ +} + +int +main (void) +{ + return 0; +} diff --git a/gdb/testsuite/gdb.base/print-internal-string.exp b/gdb/testsuite/gdb.base/print-internal-string.exp new file mode 100644 index 00000000000..be5b2f38206 --- /dev/null +++ b/gdb/testsuite/gdb.base/print-internal-string.exp @@ -0,0 +1,64 @@ +# Copyright 2023 Free Software Foundation, Inc. + +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +# Test different ways in which GDB can create a string and then push +# that string into the inferior before reading it back. Check that +# the thing that is read back is correctly interpreted as a string. + +standard_testfile + +if { [prepare_for_testing "failed to prepare" $testfile $srcfile] } { + return -1 +} + +if {![runto_main]} { + return 0 +} + +if [allow_python_tests] { + # The $_as_string convenience function is implemented in Python. + gdb_test {printf "%s\n", $_as_string("aabbcc")} "\"aabbcc\"" + + # Create a gdb.Value object for a string. Take its address (which + # forces it into the inferior), and then print the address as a + # string. + gdb_test_no_output {python addr = gdb.Value("ccbbaa").address} + gdb_test {python gdb.execute("x/1s 0x%x" % addr)} \ + "$hex :\\s+\"ccbbaa\"" + + # Call an inferior function through a gdb.Value object, pass a + # string to the inferior function and ensure it arrives correctly. + gdb_test "p/x take_string" " = $hex.*" + gdb_test_no_output "python func_ptr = gdb.history (0)" \ + "place address of take_string into Python variable" + gdb_test "python func_value = func_ptr.dereference()" "" + gdb_breakpoint "take_string" + gdb_test {python result = func_value("qqaazz")} \ + "Breakpoint $decimal, take_string \\(str=$hex \"qqaazz\"\\) at .*" + gdb_test "continue" "Continuing\\." +} + +# Use printf on a string parsed by the C expression parser. +gdb_test {printf "%s\n", "ddeeff"} "ddeeff" + +# Parse a string in the C expression parser, force it into the +# inferior by taking its address, then print it as a string. +gdb_test {x/1s &"gghhii"} "$hex :\\s+\"gghhii\"" + +# Use $_gdb_setting_str and $_gdb_maint_setting_str to create a string +# value, and then print using printf, which forces the string into the +# inferior. +gdb_test {printf "%s\n", $_gdb_setting_str("arch")} "auto" +gdb_test {printf "%s\n", $_gdb_maint_setting_str("bfd-sharing")} "on" diff --git a/gdb/testsuite/gdb.base/settings.exp b/gdb/testsuite/gdb.base/settings.exp index ac885d838a1..dc96f85c1bb 100644 --- a/gdb/testsuite/gdb.base/settings.exp +++ b/gdb/testsuite/gdb.base/settings.exp @@ -504,7 +504,7 @@ proc_with_prefix test-enum {} { gdb_test_no_output "$set_cmd zzz" show_setting "$show_cmd" "zzz" 0 "yyy" - check_type "test-settings enum" "type = char \\\[3\\\]" + check_type "test-settings enum" "type = char \\\[4\\\]" test_gdb_complete_multiple "$set_cmd " "" "" { "xxx" diff --git a/gdb/testsuite/gdb.python/py-mi.exp b/gdb/testsuite/gdb.python/py-mi.exp index 97b8a42f715..c65c83553f1 100644 --- a/gdb/testsuite/gdb.python/py-mi.exp +++ b/gdb/testsuite/gdb.python/py-mi.exp @@ -288,7 +288,7 @@ mi_create_dynamic_varobj nstype2 nstype2 ".*" 1 \ "create nstype2 varobj" mi_list_varobj_children nstype2 { - { {nstype2.} {} 6 {char \[6\]} } + { {nstype2.} {} 7 {char \[7\]} } } "list children after setting exception flag" mi_create_varobj me me \ diff --git a/gdb/valops.c b/gdb/valops.c index 22be4805a52..5806003c3b6 100644 --- a/gdb/valops.c +++ b/gdb/valops.c @@ -1738,39 +1738,38 @@ value_array (int lowbound, int highbound, struct value **elemvec) return val; } +/* See value.h. */ + struct value * -value_cstring (const char *ptr, ssize_t len, struct type *char_type) +value_cstring (const gdb_byte *ptr, ssize_t count, struct type *char_type) { struct value *val; int lowbound = current_language->string_lower_bound (); - ssize_t highbound = len / char_type->length (); + ssize_t highbound = count + 1; struct type *stringtype = lookup_array_range_type (char_type, lowbound, highbound + lowbound - 1); val = value::allocate (stringtype); + ssize_t len = count * char_type->length (); memcpy (val->contents_raw ().data (), ptr, len); + /* Write the terminating null-character. */ + memset (val->contents_raw ().data () + len, 0, char_type->length ()); return val; } -/* Create a value for a string constant by allocating space in the - inferior, copying the data into that space, and returning the - address with type TYPE_CODE_STRING. PTR points to the string - constant data; LEN is number of characters. - - Note that string types are like array of char types with a lower - bound of zero and an upper bound of LEN - 1. Also note that the - string may contain embedded null bytes. */ +/* See value.h. */ struct value * -value_string (const char *ptr, ssize_t len, struct type *char_type) +value_string (const gdb_byte *ptr, ssize_t count, struct type *char_type) { struct value *val; int lowbound = current_language->string_lower_bound (); - ssize_t highbound = len / char_type->length (); + ssize_t highbound = count; struct type *stringtype = lookup_string_range_type (char_type, lowbound, highbound + lowbound - 1); val = value::allocate (stringtype); + ssize_t len = count * char_type->length (); memcpy (val->contents_raw ().data (), ptr, len); return val; } diff --git a/gdb/value.c b/gdb/value.c index 980722a6dd7..95178682284 100644 --- a/gdb/value.c +++ b/gdb/value.c @@ -2044,8 +2044,9 @@ value_of_internalvar (struct gdbarch *gdbarch, struct internalvar *var) break; case INTERNALVAR_STRING: - val = value_cstring (var->u.string, strlen (var->u.string), - builtin_type (gdbarch)->builtin_char); + val = current_language->value_string (gdbarch, + var->u.string, + strlen (var->u.string)); break; case INTERNALVAR_VALUE: diff --git a/gdb/value.h b/gdb/value.h index d042d816409..7a0b74ad6e9 100644 --- a/gdb/value.h +++ b/gdb/value.h @@ -1182,11 +1182,48 @@ class scoped_value_mark const struct value *m_value; }; -extern struct value *value_cstring (const char *ptr, ssize_t len, +/* Create not_lval value representing a NULL-terminated C string. The + resulting value has type TYPE_CODE_ARRAY. The string passed in should + not include embedded null characters. + + PTR points to the string data; COUNT is number of characters (does + not include the NULL terminator) pointed to by PTR, each character is of + type (and size of) CHAR_TYPE. */ + +extern struct value *value_cstring (const gdb_byte *ptr, ssize_t count, struct type *char_type); -extern struct value *value_string (const char *ptr, ssize_t len, + +/* Specialisation of value_cstring above. In this case PTR points to + single byte characters. CHAR_TYPE must have a length of 1. */ +inline struct value *value_cstring (const char *ptr, ssize_t count, + struct type *char_type) +{ + gdb_assert (char_type->length () == 1); + return value_cstring ((const gdb_byte *) ptr, count, char_type); +} + +/* Create a not_lval value with type TYPE_CODE_STRING, the resulting value + has type TYPE_CODE_STRING. + + PTR points to the string data; COUNT is number of characters pointed to + by PTR, each character has the type (and size of) CHAR_TYPE. + + Note that string types are like array of char types with a lower bound + defined by the language (usually zero or one). Also the string may + contain embedded null characters. */ + +extern struct value *value_string (const gdb_byte *ptr, ssize_t count, struct type *char_type); +/* Specialisation of value_string above. In this case PTR points to + single byte characters. CHAR_TYPE must have a length of 1. */ +inline struct value *value_string (const char *ptr, ssize_t count, + struct type *char_type) +{ + gdb_assert (char_type->length () == 1); + return value_string ((const gdb_byte *) ptr, count, char_type); +} + extern struct value *value_array (int lowbound, int highbound, struct value **elemvec);