Make ada_decode not use a static buffer

Message ID 20190828204536.251120-1-cbiesinger@google.com
State New, archived
Headers

Commit Message

Terekhov, Mikhail via Gdb-patches Aug. 28, 2019, 8:45 p.m. UTC
  This makes it safer to use in general, and also allows using it on a
background thread in the future.

Inspired by tromey's patch at:
https://github.com/tromey/gdb/commit/1226cbdfa436297a5dec054d94592c45891afa93
(however, implemented in a different way)

gdb/ChangeLog:

2019-08-28  Christian Biesinger  <cbiesinger@google.com>

	* ada-lang.c (ada_decode): Return a std::string instead of a char*
	and eliminate the static buffer.
	(ada_decode_symbol): Update.
	(ada_la_decode): Update.
	(ada_sniff_from_mangled_name): Update.
	(is_valid_name_for_wild_match): Update.
	(ada_lookup_name_info::matches): Update and simplify.
	(name_matches_regex): Update.
	(ada_add_global_exceptions): Update.
	* ada-lang.h (ada_decode): Update signature.
	* ada-varobj.c (ada_varobj_describe_simple_array_child): Update.
---
 gdb/ada-lang.c   | 61 ++++++++++++++++++------------------------------
 gdb/ada-lang.h   |  2 +-
 gdb/ada-varobj.c |  6 ++++-
 3 files changed, 29 insertions(+), 40 deletions(-)
  

Comments

Tom Tromey Sept. 20, 2019, 7:58 p.m. UTC | #1
>>>>> "Christian" == Christian Biesinger via gdb-patches <gdb-patches@sourceware.org> writes:

Christian> This makes it safer to use in general, and also allows using it on a
Christian> background thread in the future.

Christian> Inspired by tromey's patch at:
Christian> https://github.com/tromey/gdb/commit/1226cbdfa436297a5dec054d94592c45891afa93
Christian> (however, implemented in a different way)

Sorry about the long delay on this.

Christian>  /* If ENCODED follows the GNAT entity encoding conventions, then return
Christian>     the decoded form of ENCODED.  Otherwise, return "<%s>" where "%s" is
Christian> -   replaced by ENCODED.
Christian> +   replaced by ENCODED.  */
 
Christian> -   The resulting string is valid until the next call of ada_decode.
Christian> -   If the string is unchanged by decoding, the original string pointer
Christian> -   is returned.  */

I wonder if the "unchanged" part is a useful optimization.
If so, then maybe my approach is preferred, though perhaps with
a new type -- something Pedro pointed out in review.

Something else he mentioned is that he came across a test case where
this function had a performance impact.  To test this, next week I'll
try out your patch on a large Ada program we have here to measure the
impact.  Hopefully it will show nothing, and we can just move forward.

On the whole I'm happy with your approach.

Tom
  
Terekhov, Mikhail via Gdb-patches Sept. 22, 2019, 7:49 a.m. UTC | #2
On Fri, Sep 20, 2019 at 3:58 PM Tom Tromey <tom@tromey.com> wrote:
>
> >>>>> "Christian" == Christian Biesinger via gdb-patches <gdb-patches@sourceware.org> writes:
>
> Christian> This makes it safer to use in general, and also allows using it on a
> Christian> background thread in the future.
>
> Christian> Inspired by tromey's patch at:
> Christian> https://github.com/tromey/gdb/commit/1226cbdfa436297a5dec054d94592c45891afa93
> Christian> (however, implemented in a different way)
>
> Sorry about the long delay on this.
>
> Christian>  /* If ENCODED follows the GNAT entity encoding conventions, then return
> Christian>     the decoded form of ENCODED.  Otherwise, return "<%s>" where "%s" is
> Christian> -   replaced by ENCODED.
> Christian> +   replaced by ENCODED.  */
>
> Christian> -   The resulting string is valid until the next call of ada_decode.
> Christian> -   If the string is unchanged by decoding, the original string pointer
> Christian> -   is returned.  */
>
> I wonder if the "unchanged" part is a useful optimization.
> If so, then maybe my approach is preferred, though perhaps with
> a new type -- something Pedro pointed out in review.
>
> Something else he mentioned is that he came across a test case where
> this function had a performance impact.  To test this, next week I'll
> try out your patch on a large Ada program we have here to measure the
> impact.  Hopefully it will show nothing, and we can just move forward.

Thank you! For reference, this is Pedro's email where he pointed that out:
https://sourceware.org/ml/gdb-patches/2019-05/msg00673.html

It sounds like one concern was over sniffing the minsym language. It
seems like that shouldn't need allocations at all.

BTW, another option would be to have a thread-local std::string in the
function and reuse that. (or an std::string in a class like Pedro
described)

Christian
  
Tom Tromey Sept. 23, 2019, 5:41 p.m. UTC | #3
>>>>> "Christian" == Christian Biesinger via gdb-patches <gdb-patches@sourceware.org> writes:

Christian> This makes it safer to use in general, and also allows using it on a
Christian> background thread in the future.

Thanks for doing this.

I tried it out on a large Ada program we have internally, and I couldn't
detect a difference.  I tested it by doing:

    time gdb -nx -batch the-program

... both before and after applying the patch.

I think this is a reasonable enough test, because it tests minsym
reading and psymbol reading -- IMO, CU expansion isn't usually
significant enough to matter.

When I applied it, I needed a couple of extra changes to get it to build
-- one in ada-exp.y and one in dwarf-index-write.c.  This patch is OK
with these minor build issues fixed up.

Tom
  

Patch

diff --git a/gdb/ada-lang.c b/gdb/ada-lang.c
index 609f2d4391..168834cec6 100644
--- a/gdb/ada-lang.c
+++ b/gdb/ada-lang.c
@@ -1105,22 +1105,16 @@  ada_remove_po_subprogram_suffix (const char *encoded, int *len)
 
 /* If ENCODED follows the GNAT entity encoding conventions, then return
    the decoded form of ENCODED.  Otherwise, return "<%s>" where "%s" is
-   replaced by ENCODED.
+   replaced by ENCODED.  */
 
-   The resulting string is valid until the next call of ada_decode.
-   If the string is unchanged by decoding, the original string pointer
-   is returned.  */
-
-const char *
+std::string
 ada_decode (const char *encoded)
 {
   int i, j;
   int len0;
   const char *p;
-  char *decoded;
   int at_start_name;
-  static char *decoding_buffer = NULL;
-  static size_t decoding_buffer_size = 0;
+  std::string decoded;
 
   /* With function descriptors on PPC64, the value of a symbol named
      ".FN", if it exists, is the entry point of the function "FN".  */
@@ -1179,8 +1173,7 @@  ada_decode (const char *encoded)
 
   /* Make decoded big enough for possible expansion by operator name.  */
 
-  GROW_VECT (decoding_buffer, decoding_buffer_size, 2 * len0 + 1);
-  decoded = decoding_buffer;
+  decoded.resize (2 * len0 + 1, 'X');
 
   /* Remove trailing __{digit}+ or trailing ${digit}+.  */
 
@@ -1217,7 +1210,7 @@  ada_decode (const char *encoded)
                             op_len - 1) == 0)
                   && !isalnum (encoded[i + op_len]))
                 {
-                  strcpy (decoded + j, ada_opname_table[k].decoded);
+                  strcpy (&decoded.front() + j, ada_opname_table[k].decoded);
                   at_start_name = 0;
                   i += op_len;
                   j += strlen (ada_opname_table[k].decoded);
@@ -1338,7 +1331,7 @@  ada_decode (const char *encoded)
           j += 1;
         }
     }
-  decoded[j] = '\000';
+  decoded.resize (j);
 
   /* Decoded names should never contain any uppercase character.
      Double-check this, and abort the decoding if we find one.  */
@@ -1347,18 +1340,13 @@  ada_decode (const char *encoded)
     if (isupper (decoded[i]) || decoded[i] == ' ')
       goto Suppress;
 
-  if (strcmp (decoded, encoded) == 0)
-    return encoded;
-  else
-    return decoded;
+  return decoded;
 
 Suppress:
-  GROW_VECT (decoding_buffer, decoding_buffer_size, strlen (encoded) + 3);
-  decoded = decoding_buffer;
   if (encoded[0] == '<')
-    strcpy (decoded, encoded);
+    decoded = encoded;
   else
-    xsnprintf (decoded, decoding_buffer_size, "<%s>", encoded);
+    decoded = '<' + encoded + '>';
   return decoded;
 
 }
@@ -1389,13 +1377,13 @@  ada_decode_symbol (const struct general_symbol_info *arg)
 
   if (!gsymbol->ada_mangled)
     {
-      const char *decoded = ada_decode (gsymbol->name);
+      std::string decoded = ada_decode (gsymbol->name);
       struct obstack *obstack = gsymbol->language_specific.obstack;
 
       gsymbol->ada_mangled = 1;
 
       if (obstack != NULL)
-	*resultp = obstack_strdup (obstack, decoded);
+	*resultp = obstack_strdup (obstack, decoded.c_str ());
       else
         {
 	  /* Sometimes, we can't find a corresponding objfile, in
@@ -1404,10 +1392,10 @@  ada_decode_symbol (const struct general_symbol_info *arg)
 	     significant memory leak (FIXME).  */
 
           char **slot = (char **) htab_find_slot (decoded_names_store,
-                                                  decoded, INSERT);
+                                                  decoded.c_str (), INSERT);
 
           if (*slot == NULL)
-            *slot = xstrdup (decoded);
+            *slot = xstrdup (decoded.c_str ());
           *resultp = *slot;
         }
     }
@@ -1418,7 +1406,7 @@  ada_decode_symbol (const struct general_symbol_info *arg)
 static char *
 ada_la_decode (const char *encoded, int options)
 {
-  return xstrdup (ada_decode (encoded));
+  return xstrdup (ada_decode (encoded).c_str ());
 }
 
 /* Implement la_sniff_from_mangled_name for Ada.  */
@@ -1426,11 +1414,11 @@  ada_la_decode (const char *encoded, int options)
 static int
 ada_sniff_from_mangled_name (const char *mangled, char **out)
 {
-  const char *demangled = ada_decode (mangled);
+  std::string demangled = ada_decode (mangled);
 
   *out = NULL;
 
-  if (demangled != mangled && demangled != NULL && demangled[0] != '<')
+  if (demangled != mangled && demangled[0] != '<')
     {
       /* Set the gsymbol language to Ada, but still return 0.
 	 Two reasons for that:
@@ -5995,7 +5983,7 @@  is_name_suffix (const char *str)
 static int
 is_valid_name_for_wild_match (const char *name0)
 {
-  const char *decoded_name = ada_decode (name0);
+  std::string decoded_name = ada_decode (name0);
   int i;
 
   /* If the decoded name starts with an angle bracket, it means that
@@ -6241,13 +6229,10 @@  ada_lookup_name_info::matches
          that iff we are doing a verbatim match, the decoded version
          of the symbol name starts with '<'.  Otherwise, this symbol name
          is not a suitable completion.  */
-      const char *sym_name_copy = sym_name;
-      bool has_angle_bracket;
 
-      sym_name = ada_decode (sym_name);
-      has_angle_bracket = (sym_name[0] == '<');
+      std::string decoded = ada_decode (sym_name);
+      bool has_angle_bracket = (decoded[0] == '<');
       match = (has_angle_bracket == m_verbatim_p);
-      sym_name = sym_name_copy;
     }
 
   if (match && !m_verbatim_p)
@@ -6271,7 +6256,7 @@  ada_lookup_name_info::matches
       /* Since we are doing wild matching, this means that TEXT
          may represent an unqualified symbol name.  We therefore must
          also compare TEXT against the unqualified name of the symbol.  */
-      sym_name = ada_unqualified_name (ada_decode (sym_name));
+      sym_name = ada_unqualified_name (ada_decode (sym_name).c_str ());
 
       if (strncmp (sym_name, text, text_len) == 0)
 	match = true;
@@ -13496,7 +13481,7 @@  static bool
 name_matches_regex (const char *name, compiled_regex *preg)
 {
   return (preg == NULL
-	  || preg->exec (ada_decode (name), 0, NULL, 0) == 0);
+	  || preg->exec (ada_decode (name).c_str (), 0, NULL, 0) == 0);
 }
 
 /* Add all exceptions defined globally whose name name match
@@ -13529,8 +13514,8 @@  ada_add_global_exceptions (compiled_regex *preg,
 			   lookup_name_info::match_any (),
 			   [&] (const char *search_name)
 			   {
-			     const char *decoded = ada_decode (search_name);
-			     return name_matches_regex (decoded, preg);
+			     std::string decoded = ada_decode (search_name);
+			     return name_matches_regex (decoded.c_str (), preg);
 			   },
 			   NULL,
 			   VARIABLES_DOMAIN);
diff --git a/gdb/ada-lang.h b/gdb/ada-lang.h
index 2fc3f523ca..c7279d5ac3 100644
--- a/gdb/ada-lang.h
+++ b/gdb/ada-lang.h
@@ -227,7 +227,7 @@  extern struct type *ada_get_decoded_type (struct type *type);
 
 extern const char *ada_decode_symbol (const struct general_symbol_info *);
 
-extern const char *ada_decode (const char*);
+extern std::string ada_decode (const char*);
 
 extern enum language ada_update_initial_language (enum language);
 
diff --git a/gdb/ada-varobj.c b/gdb/ada-varobj.c
index a4d553d378..1a5d0ac239 100644
--- a/gdb/ada-varobj.c
+++ b/gdb/ada-varobj.c
@@ -624,6 +624,7 @@  ada_varobj_describe_simple_array_child (struct value *parent_value,
 	 of the array index type when such type qualification is
 	 needed.  */
       const char *index_type_name = NULL;
+      std::string decoded;
 
       /* If the index type is a range type, find the base type.  */
       while (TYPE_CODE (index_type) == TYPE_CODE_RANGE)
@@ -634,7 +635,10 @@  ada_varobj_describe_simple_array_child (struct value *parent_value,
 	{
 	  index_type_name = ada_type_name (index_type);
 	  if (index_type_name)
-	    index_type_name = ada_decode (index_type_name);
+	    {
+	      decoded = ada_decode (index_type_name);
+	      index_type_name = decoded.c_str ();
+	    }
 	}
 
       if (index_type_name != NULL)