Make ada_decode not use a static buffer
Commit Message
This makes it safer to use in general, and also allows using it on a
background thread in the future.
Inspired by tromey's patch at:
https://github.com/tromey/gdb/commit/1226cbdfa436297a5dec054d94592c45891afa93
(however, implemented in a different way)
gdb/ChangeLog:
2019-08-28 Christian Biesinger <cbiesinger@google.com>
* ada-lang.c (ada_decode): Return a std::string instead of a char*
and eliminate the static buffer.
(ada_decode_symbol): Update.
(ada_la_decode): Update.
(ada_sniff_from_mangled_name): Update.
(is_valid_name_for_wild_match): Update.
(ada_lookup_name_info::matches): Update and simplify.
(name_matches_regex): Update.
(ada_add_global_exceptions): Update.
* ada-lang.h (ada_decode): Update signature.
* ada-varobj.c (ada_varobj_describe_simple_array_child): Update.
---
gdb/ada-lang.c | 61 ++++++++++++++++++------------------------------
gdb/ada-lang.h | 2 +-
gdb/ada-varobj.c | 6 ++++-
3 files changed, 29 insertions(+), 40 deletions(-)
Comments
>>>>> "Christian" == Christian Biesinger via gdb-patches <gdb-patches@sourceware.org> writes:
Christian> This makes it safer to use in general, and also allows using it on a
Christian> background thread in the future.
Christian> Inspired by tromey's patch at:
Christian> https://github.com/tromey/gdb/commit/1226cbdfa436297a5dec054d94592c45891afa93
Christian> (however, implemented in a different way)
Sorry about the long delay on this.
Christian> /* If ENCODED follows the GNAT entity encoding conventions, then return
Christian> the decoded form of ENCODED. Otherwise, return "<%s>" where "%s" is
Christian> - replaced by ENCODED.
Christian> + replaced by ENCODED. */
Christian> - The resulting string is valid until the next call of ada_decode.
Christian> - If the string is unchanged by decoding, the original string pointer
Christian> - is returned. */
I wonder if the "unchanged" part is a useful optimization.
If so, then maybe my approach is preferred, though perhaps with
a new type -- something Pedro pointed out in review.
Something else he mentioned is that he came across a test case where
this function had a performance impact. To test this, next week I'll
try out your patch on a large Ada program we have here to measure the
impact. Hopefully it will show nothing, and we can just move forward.
On the whole I'm happy with your approach.
Tom
On Fri, Sep 20, 2019 at 3:58 PM Tom Tromey <tom@tromey.com> wrote:
>
> >>>>> "Christian" == Christian Biesinger via gdb-patches <gdb-patches@sourceware.org> writes:
>
> Christian> This makes it safer to use in general, and also allows using it on a
> Christian> background thread in the future.
>
> Christian> Inspired by tromey's patch at:
> Christian> https://github.com/tromey/gdb/commit/1226cbdfa436297a5dec054d94592c45891afa93
> Christian> (however, implemented in a different way)
>
> Sorry about the long delay on this.
>
> Christian> /* If ENCODED follows the GNAT entity encoding conventions, then return
> Christian> the decoded form of ENCODED. Otherwise, return "<%s>" where "%s" is
> Christian> - replaced by ENCODED.
> Christian> + replaced by ENCODED. */
>
> Christian> - The resulting string is valid until the next call of ada_decode.
> Christian> - If the string is unchanged by decoding, the original string pointer
> Christian> - is returned. */
>
> I wonder if the "unchanged" part is a useful optimization.
> If so, then maybe my approach is preferred, though perhaps with
> a new type -- something Pedro pointed out in review.
>
> Something else he mentioned is that he came across a test case where
> this function had a performance impact. To test this, next week I'll
> try out your patch on a large Ada program we have here to measure the
> impact. Hopefully it will show nothing, and we can just move forward.
Thank you! For reference, this is Pedro's email where he pointed that out:
https://sourceware.org/ml/gdb-patches/2019-05/msg00673.html
It sounds like one concern was over sniffing the minsym language. It
seems like that shouldn't need allocations at all.
BTW, another option would be to have a thread-local std::string in the
function and reuse that. (or an std::string in a class like Pedro
described)
Christian
>>>>> "Christian" == Christian Biesinger via gdb-patches <gdb-patches@sourceware.org> writes:
Christian> This makes it safer to use in general, and also allows using it on a
Christian> background thread in the future.
Thanks for doing this.
I tried it out on a large Ada program we have internally, and I couldn't
detect a difference. I tested it by doing:
time gdb -nx -batch the-program
... both before and after applying the patch.
I think this is a reasonable enough test, because it tests minsym
reading and psymbol reading -- IMO, CU expansion isn't usually
significant enough to matter.
When I applied it, I needed a couple of extra changes to get it to build
-- one in ada-exp.y and one in dwarf-index-write.c. This patch is OK
with these minor build issues fixed up.
Tom
@@ -1105,22 +1105,16 @@ ada_remove_po_subprogram_suffix (const char *encoded, int *len)
/* If ENCODED follows the GNAT entity encoding conventions, then return
the decoded form of ENCODED. Otherwise, return "<%s>" where "%s" is
- replaced by ENCODED.
+ replaced by ENCODED. */
- The resulting string is valid until the next call of ada_decode.
- If the string is unchanged by decoding, the original string pointer
- is returned. */
-
-const char *
+std::string
ada_decode (const char *encoded)
{
int i, j;
int len0;
const char *p;
- char *decoded;
int at_start_name;
- static char *decoding_buffer = NULL;
- static size_t decoding_buffer_size = 0;
+ std::string decoded;
/* With function descriptors on PPC64, the value of a symbol named
".FN", if it exists, is the entry point of the function "FN". */
@@ -1179,8 +1173,7 @@ ada_decode (const char *encoded)
/* Make decoded big enough for possible expansion by operator name. */
- GROW_VECT (decoding_buffer, decoding_buffer_size, 2 * len0 + 1);
- decoded = decoding_buffer;
+ decoded.resize (2 * len0 + 1, 'X');
/* Remove trailing __{digit}+ or trailing ${digit}+. */
@@ -1217,7 +1210,7 @@ ada_decode (const char *encoded)
op_len - 1) == 0)
&& !isalnum (encoded[i + op_len]))
{
- strcpy (decoded + j, ada_opname_table[k].decoded);
+ strcpy (&decoded.front() + j, ada_opname_table[k].decoded);
at_start_name = 0;
i += op_len;
j += strlen (ada_opname_table[k].decoded);
@@ -1338,7 +1331,7 @@ ada_decode (const char *encoded)
j += 1;
}
}
- decoded[j] = '\000';
+ decoded.resize (j);
/* Decoded names should never contain any uppercase character.
Double-check this, and abort the decoding if we find one. */
@@ -1347,18 +1340,13 @@ ada_decode (const char *encoded)
if (isupper (decoded[i]) || decoded[i] == ' ')
goto Suppress;
- if (strcmp (decoded, encoded) == 0)
- return encoded;
- else
- return decoded;
+ return decoded;
Suppress:
- GROW_VECT (decoding_buffer, decoding_buffer_size, strlen (encoded) + 3);
- decoded = decoding_buffer;
if (encoded[0] == '<')
- strcpy (decoded, encoded);
+ decoded = encoded;
else
- xsnprintf (decoded, decoding_buffer_size, "<%s>", encoded);
+ decoded = '<' + encoded + '>';
return decoded;
}
@@ -1389,13 +1377,13 @@ ada_decode_symbol (const struct general_symbol_info *arg)
if (!gsymbol->ada_mangled)
{
- const char *decoded = ada_decode (gsymbol->name);
+ std::string decoded = ada_decode (gsymbol->name);
struct obstack *obstack = gsymbol->language_specific.obstack;
gsymbol->ada_mangled = 1;
if (obstack != NULL)
- *resultp = obstack_strdup (obstack, decoded);
+ *resultp = obstack_strdup (obstack, decoded.c_str ());
else
{
/* Sometimes, we can't find a corresponding objfile, in
@@ -1404,10 +1392,10 @@ ada_decode_symbol (const struct general_symbol_info *arg)
significant memory leak (FIXME). */
char **slot = (char **) htab_find_slot (decoded_names_store,
- decoded, INSERT);
+ decoded.c_str (), INSERT);
if (*slot == NULL)
- *slot = xstrdup (decoded);
+ *slot = xstrdup (decoded.c_str ());
*resultp = *slot;
}
}
@@ -1418,7 +1406,7 @@ ada_decode_symbol (const struct general_symbol_info *arg)
static char *
ada_la_decode (const char *encoded, int options)
{
- return xstrdup (ada_decode (encoded));
+ return xstrdup (ada_decode (encoded).c_str ());
}
/* Implement la_sniff_from_mangled_name for Ada. */
@@ -1426,11 +1414,11 @@ ada_la_decode (const char *encoded, int options)
static int
ada_sniff_from_mangled_name (const char *mangled, char **out)
{
- const char *demangled = ada_decode (mangled);
+ std::string demangled = ada_decode (mangled);
*out = NULL;
- if (demangled != mangled && demangled != NULL && demangled[0] != '<')
+ if (demangled != mangled && demangled[0] != '<')
{
/* Set the gsymbol language to Ada, but still return 0.
Two reasons for that:
@@ -5995,7 +5983,7 @@ is_name_suffix (const char *str)
static int
is_valid_name_for_wild_match (const char *name0)
{
- const char *decoded_name = ada_decode (name0);
+ std::string decoded_name = ada_decode (name0);
int i;
/* If the decoded name starts with an angle bracket, it means that
@@ -6241,13 +6229,10 @@ ada_lookup_name_info::matches
that iff we are doing a verbatim match, the decoded version
of the symbol name starts with '<'. Otherwise, this symbol name
is not a suitable completion. */
- const char *sym_name_copy = sym_name;
- bool has_angle_bracket;
- sym_name = ada_decode (sym_name);
- has_angle_bracket = (sym_name[0] == '<');
+ std::string decoded = ada_decode (sym_name);
+ bool has_angle_bracket = (decoded[0] == '<');
match = (has_angle_bracket == m_verbatim_p);
- sym_name = sym_name_copy;
}
if (match && !m_verbatim_p)
@@ -6271,7 +6256,7 @@ ada_lookup_name_info::matches
/* Since we are doing wild matching, this means that TEXT
may represent an unqualified symbol name. We therefore must
also compare TEXT against the unqualified name of the symbol. */
- sym_name = ada_unqualified_name (ada_decode (sym_name));
+ sym_name = ada_unqualified_name (ada_decode (sym_name).c_str ());
if (strncmp (sym_name, text, text_len) == 0)
match = true;
@@ -13496,7 +13481,7 @@ static bool
name_matches_regex (const char *name, compiled_regex *preg)
{
return (preg == NULL
- || preg->exec (ada_decode (name), 0, NULL, 0) == 0);
+ || preg->exec (ada_decode (name).c_str (), 0, NULL, 0) == 0);
}
/* Add all exceptions defined globally whose name name match
@@ -13529,8 +13514,8 @@ ada_add_global_exceptions (compiled_regex *preg,
lookup_name_info::match_any (),
[&] (const char *search_name)
{
- const char *decoded = ada_decode (search_name);
- return name_matches_regex (decoded, preg);
+ std::string decoded = ada_decode (search_name);
+ return name_matches_regex (decoded.c_str (), preg);
},
NULL,
VARIABLES_DOMAIN);
@@ -227,7 +227,7 @@ extern struct type *ada_get_decoded_type (struct type *type);
extern const char *ada_decode_symbol (const struct general_symbol_info *);
-extern const char *ada_decode (const char*);
+extern std::string ada_decode (const char*);
extern enum language ada_update_initial_language (enum language);
@@ -624,6 +624,7 @@ ada_varobj_describe_simple_array_child (struct value *parent_value,
of the array index type when such type qualification is
needed. */
const char *index_type_name = NULL;
+ std::string decoded;
/* If the index type is a range type, find the base type. */
while (TYPE_CODE (index_type) == TYPE_CODE_RANGE)
@@ -634,7 +635,10 @@ ada_varobj_describe_simple_array_child (struct value *parent_value,
{
index_type_name = ada_type_name (index_type);
if (index_type_name)
- index_type_name = ada_decode (index_type_name);
+ {
+ decoded = ada_decode (index_type_name);
+ index_type_name = decoded.c_str ();
+ }
}
if (index_type_name != NULL)