Memory leak: d_growable_string_resize..cp_canonicalize_string

Message ID 3941f94c-920d-5074-49da-5a913d1585ea@gmail.com
State New, archived
Headers

Commit Message

Alex Lindsay Aug. 7, 2017, 3:09 p.m. UTC
  This is my first attempt at contributing, so please be patient with 
me...I've been encountering some large memory usage with gdb, so I've 
been running it through valgrind. Some discussion from the gdb-users 
list is below. One leak occurs through this stack-trace:

==21225== 32 bytes in 1 blocks are definitely lost in loss record 6,063 of 10,949^M
==21225==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)^M
==21225==    by 0x4C2FDEF: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)^M
==21225==    by 0x76CB31: d_growable_string_resize (cp-demangle.c:3963)^M
==21225==    by 0x76CB31: d_growable_string_init (cp-demangle.c:3942)^M
==21225==    by 0x76CB31: cplus_demangle_print (cp-demangle.c:4308)^M
==21225==    by 0x4C9535: cp_comp_to_string(demangle_component*, int) (cp-name-parser.y:1972)^M
==21225==    by 0x53EF14: cp_canonicalize_string[abi:cxx11](char const*) (cp-support.c:569)^M
==21225==    by 0x561B75: dwarf2_canonicalize_name(char const*, dwarf2_cu*, obstack*) [clone .isra.210] (dwarf2read.c:20159)^M
==21225==    by 0x566B77: read_partial_die (dwarf2read.c:16264)


AFAICT, this occurs because we return a (char *) with 
`cp_comp_to_string` and immediately copy its contents to a std::string, 
never freeing the space pointed to by (char *). So my patch for this is:

index df9a563508..f11d94f56c 100644

Let me know what else to do. In CONTRIBUTE, I read about ChangeLog 
entries, so I can definitely make an addition there if deemed appropriate.

-------- Forwarded Message --------
Subject: 	Re: Large memory usage by gdb
Date: 	Mon, 07 Aug 2017 10:14:57 +0100
From: 	Yao Qi <qiyaoltc@gmail.com>
To: 	Alex Lindsay <alexlindsay239@gmail.com>
CC: 	gdb@sourceware.org



Alex Lindsay<alexlindsay239@gmail.com>  writes:
> and new call-graph. So my question is, is what I'm doing valuable? I
Oh, definitely yes!  Thanks a lot for the investigation.
> haven't done any profiling yet to see how these changes affect my real
> use case where I'm debugging an executable with lots of shared
> libraries. Nevertheless, these leaks do seem to be very real. I know
> that GDB developers are way better programmers than I am, so the fact
> that these leaks haven't been found yet makes me wonder whether they
> matter in real use cases or not. I am using a gdb built from the git
> repository (GNU gdb (GDB) 8.0.50.20170803-git).
leaks are bugs, and we should fix them.  I can find these leaks in
valgrind too,
==21225== 463 (336 direct, 127 indirect) bytes in 1 blocks are definitely lost in loss record 10,770 of 10,949^M
==21225==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)^M
==21225==    by 0x6C6DA2: bfd_malloc (libbfd.c:193)^M
==21225==    by 0x6C6F4D: bfd_zmalloc (libbfd.c:278)^M
==21225==    by 0x6D252E: elf_x86_64_get_synthetic_symtab (elf64-x86-64.c:6846)^M
==21225==    by 0x4B397A: elf_read_minimal_symbols (elfread.c:1124)^M
==21225==    by 0x4B397A: elf_symfile_read(objfile*, enum_flags<symfile_add_flag>) (elfread.c:1182)^M
==21225==    by 0x63AC94: read_symbols(objfile*, enum_flags<symfile_add_flag>) (symfile.c:861)^M
==21225==    by 0x63A773: syms_from_objfile_1 (symfile.c:1062)
and
==21225== 32 bytes in 1 blocks are definitely lost in loss record 6,063 of 10,949^M
==21225==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)^M
==21225==    by 0x4C2FDEF: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)^M
==21225==    by 0x76CB31: d_growable_string_resize (cp-demangle.c:3963)^M
==21225==    by 0x76CB31: d_growable_string_init (cp-demangle.c:3942)^M
==21225==    by 0x76CB31: cplus_demangle_print (cp-demangle.c:4308)^M
==21225==    by 0x4C9535: cp_comp_to_string(demangle_component*, int) (cp-name-parser.y:1972)^M
==21225==    by 0x53EF14: cp_canonicalize_string[abi:cxx11](char const*) (cp-support.c:569)^M
==21225==    by 0x561B75: dwarf2_canonicalize_name(char const*, dwarf2_cu*, obstack*) [clone .isra.210] (dwarf2read.c:20159)^M
==21225==    by 0x566B77: read_partial_die (dwarf2read.c:16264)
Can you post your two patches
https://github.com/lindsayad/gdb/pull/1/files  separately to
gdb-patches@sourceware.org?
  

Comments

Simon Marchi Aug. 7, 2017, 3:53 p.m. UTC | #1
On 2017-08-07 17:09, Alex Lindsay wrote:
> This is my first attempt at contributing, so please be patient with
> me...I've been encountering some large memory usage with gdb, so I've
> been running it through valgrind.

Hi Alex,

Thanks for doing this!

For your future patches, I suggest you look into "git send-email", as it 
will send the patch neatly formatted in a standard way.

> Some discussion from the gdb-users
> list is below. One leak occurs through this stack-trace:
> 
> ==21225== 32 bytes in 1 blocks are definitely lost in loss record
> 6,063 of 10,949^M
> ==21225==    at 0x4C2DB8F: malloc (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)^M
> ==21225==    by 0x4C2FDEF: realloc (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)^M
> ==21225==    by 0x76CB31: d_growable_string_resize 
> (cp-demangle.c:3963)^M
> ==21225==    by 0x76CB31: d_growable_string_init (cp-demangle.c:3942)^M
> ==21225==    by 0x76CB31: cplus_demangle_print (cp-demangle.c:4308)^M
> ==21225==    by 0x4C9535: cp_comp_to_string(demangle_component*, int)
> (cp-name-parser.y:1972)^M
> ==21225==    by 0x53EF14: cp_canonicalize_string[abi:cxx11](char
> const*) (cp-support.c:569)^M
> ==21225==    by 0x561B75: dwarf2_canonicalize_name(char const*,
> dwarf2_cu*, obstack*) [clone .isra.210] (dwarf2read.c:20159)^M
> ==21225==    by 0x566B77: read_partial_die (dwarf2read.c:16264)
> 
> 
> AFAICT, this occurs because we return a (char *) with
> `cp_comp_to_string` and immediately copy its contents to a
> std::string, never freeing the space pointed to by (char *). So my
> patch for this is:
> 
> index df9a563508..f11d94f56c 100644
> --- a/gdb/cp-support.c
> +++ b/gdb/cp-support.c
> @@ -566,7 +566,9 @@ cp_canonicalize_string (const char *string)
>      return std::string ();
> 
>    estimated_len = strlen (string) * 2;
> -  std::string ret = cp_comp_to_string (info->tree, estimated_len);
> +  char * ret_char = cp_comp_to_string (info->tree, estimated_len);

Remove the space after the *.

> +  std::string ret = ret_char;
> +  xfree (ret_char);
> 
>    if (ret.empty ())
>      {
> 
> Let me know what else to do. In CONTRIBUTE, I read about ChangeLog
> entries, so I can definitely make an addition there if deemed
> appropriate.

Indeed, an appropriate ChangeLog entry for this patch could look like 
this:

gdb/ChangeLog:

	* cp-support.c (cp_canonicalize_string): Free return value of
	cp_comp_to_string.

Note that the ChangeLog entries should describe _what_ changed and not 
_why_.  The why is explained is the commit message.

As for the patch itself, I think it's good in that it does fix the 
problem, but there is a bit more refactoring that could be done to avoid 
this kind of problem with this function in the future.  Instead of 
returning an allocated char*, it should probably return a 
gdb::unique_xmalloc_ptr<char>.  This will make sure the content gets 
free'd without requiring the caller to free it explicitly.  However, 
doing such a change will probably have many ramifications, so if you are 
not ready to do this it's perfectly fine.  But if you are, feel free to 
ask any questions you may have.  The #gdb IRC channel is also good if 
you need some more hands-on help.

Thanks,

Simon
  
Alex Lindsay Aug. 7, 2017, 8:21 p.m. UTC | #2
Simon,

Thanks for the help and suggestions! I think when I have a bit more time 
I would be very interested in learning about and implementing the longer 
term fix. However, for now I hope I can start with this band-aid. Just 
sent the patch with the style correction via `git send-email`.

Alex

On 08/07/2017 10:53 AM, Simon Marchi wrote:
> On 2017-08-07 17:09, Alex Lindsay wrote:
>> This is my first attempt at contributing, so please be patient with
>> me...I've been encountering some large memory usage with gdb, so I've
>> been running it through valgrind.
>
> Hi Alex,
>
> Thanks for doing this!
>
> For your future patches, I suggest you look into "git send-email", as 
> it will send the patch neatly formatted in a standard way.
>
>> Some discussion from the gdb-users
>> list is below. One leak occurs through this stack-trace:
>>
>> ==21225== 32 bytes in 1 blocks are definitely lost in loss record
>> 6,063 of 10,949^M
>> ==21225==    at 0x4C2DB8F: malloc (in
>> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)^M
>> ==21225==    by 0x4C2FDEF: realloc (in
>> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)^M
>> ==21225==    by 0x76CB31: d_growable_string_resize 
>> (cp-demangle.c:3963)^M
>> ==21225==    by 0x76CB31: d_growable_string_init (cp-demangle.c:3942)^M
>> ==21225==    by 0x76CB31: cplus_demangle_print (cp-demangle.c:4308)^M
>> ==21225==    by 0x4C9535: cp_comp_to_string(demangle_component*, int)
>> (cp-name-parser.y:1972)^M
>> ==21225==    by 0x53EF14: cp_canonicalize_string[abi:cxx11](char
>> const*) (cp-support.c:569)^M
>> ==21225==    by 0x561B75: dwarf2_canonicalize_name(char const*,
>> dwarf2_cu*, obstack*) [clone .isra.210] (dwarf2read.c:20159)^M
>> ==21225==    by 0x566B77: read_partial_die (dwarf2read.c:16264)
>>
>>
>> AFAICT, this occurs because we return a (char *) with
>> `cp_comp_to_string` and immediately copy its contents to a
>> std::string, never freeing the space pointed to by (char *). So my
>> patch for this is:
>>
>> index df9a563508..f11d94f56c 100644
>> --- a/gdb/cp-support.c
>> +++ b/gdb/cp-support.c
>> @@ -566,7 +566,9 @@ cp_canonicalize_string (const char *string)
>>      return std::string ();
>>
>>    estimated_len = strlen (string) * 2;
>> -  std::string ret = cp_comp_to_string (info->tree, estimated_len);
>> +  char * ret_char = cp_comp_to_string (info->tree, estimated_len);
>
> Remove the space after the *.
>
>> +  std::string ret = ret_char;
>> +  xfree (ret_char);
>>
>>    if (ret.empty ())
>>      {
>>
>> Let me know what else to do. In CONTRIBUTE, I read about ChangeLog
>> entries, so I can definitely make an addition there if deemed
>> appropriate.
>
> Indeed, an appropriate ChangeLog entry for this patch could look like 
> this:
>
> gdb/ChangeLog:
>
>     * cp-support.c (cp_canonicalize_string): Free return value of
>     cp_comp_to_string.
>
> Note that the ChangeLog entries should describe _what_ changed and not 
> _why_.  The why is explained is the commit message.
>
> As for the patch itself, I think it's good in that it does fix the 
> problem, but there is a bit more refactoring that could be done to 
> avoid this kind of problem with this function in the future. Instead 
> of returning an allocated char*, it should probably return a 
> gdb::unique_xmalloc_ptr<char>.  This will make sure the content gets 
> free'd without requiring the caller to free it explicitly.  However, 
> doing such a change will probably have many ramifications, so if you 
> are not ready to do this it's perfectly fine.  But if you are, feel 
> free to ask any questions you may have.  The #gdb IRC channel is also 
> good if you need some more hands-on help.
>
> Thanks,
>
> Simon
  

Patch

--- a/gdb/cp-support.c
+++ b/gdb/cp-support.c
@@ -566,7 +566,9 @@  cp_canonicalize_string (const char *string)
      return std::string ();

    estimated_len = strlen (string) * 2;
-  std::string ret = cp_comp_to_string (info->tree, estimated_len);
+  char * ret_char = cp_comp_to_string (info->tree, estimated_len);
+  std::string ret = ret_char;
+  xfree (ret_char);

    if (ret.empty ())
      {