Synthetic symbol leak in in elf_x86_64_get_synthetic_symtab and elf_read_minimal_symbols

Message ID ce856f01-6e4b-5e74-fe8a-8e4aa0cb89e6@gmail.com
State New, archived
Headers

Commit Message

Alex Lindsay Aug. 7, 2017, 3:19 p.m. UTC
  Detected this leak with valgrind memcheck:

==21225== 463 (336 direct, 127 indirect) bytes in 1 blocks are 
definitely lost in loss record 10,770 of 10,949^M
==21225==    at 0x4C2DB8F: malloc (in 
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)^M
==21225==    by 0x6C6DA2: bfd_malloc (libbfd.c:193)^M
==21225==    by 0x6C6F4D: bfd_zmalloc (libbfd.c:278)^M
==21225==    by 0x6D252E: elf_x86_64_get_synthetic_symtab 
(elf64-x86-64.c:6846)^M
==21225==    by 0x4B397A: elf_read_minimal_symbols (elfread.c:1124)^M
==21225==    by 0x4B397A: elf_symfile_read(objfile*, 
enum_flags<symfile_add_flag>) (elfread.c:1182)^M
==21225==    by 0x63AC94: read_symbols(objfile*, 
enum_flags<symfile_add_flag>) (symfile.c:861)^M
==21225==    by 0x63A773: syms_from_objfile_1 (symfile.c:1062)

We perform a couple of  dynamic allocations in 
elf_x86_64_get_synthetic_symtab (elf64-x86-64.c):

   s = *ret = (asymbol *) bfd_zmalloc (size);

   names = (char *) bfd_malloc (size);

that appear to never get freed. My patch for this:

  /* Scan and build partial symbols for a symbol file.
  

Comments

Yao Qi Aug. 11, 2017, 9:27 a.m. UTC | #1
On 17-08-07 10:19:15, Alex Lindsay wrote:
> 
> We perform a couple of  dynamic allocations in
> elf_x86_64_get_synthetic_symtab (elf64-x86-64.c):
> 
>   s = *ret = (asymbol *) bfd_zmalloc (size);
> 
>   names = (char *) bfd_malloc (size);
> 
> that appear to never get freed. My patch for this:

Good catch!  It is more complicated that other bfd targets allocate
memory for asymbol in a different way as if asymbol.name is defined
as an zero-length array, so memory allocated for both asymbol and .name
in one bfd_malloc call, like,

	  sym = *r->sym_ptr_ptr;
	  if (!sym_exists_at (syms, opdsymend, symcount,
			      sym->section->id, sym->value + r->addend))
	    {
	      ++count;
	      size += sizeof (asymbol);
	      size += strlen (syms[i]->name) + 2;
	    }
	}

      if (size == 0)
	goto done;
      s = *ret = bfd_malloc (size);

or

  size = count * sizeof (asymbol);
  p = relplt->relocation;
  for (i = 0; i < count; i++, p += elf32_arm_size_info.int_rels_per_ext_rel)
    {
      size += strlen ((*p->sym_ptr_ptr)->name) + sizeof ("@plt");
      if (p->addend != 0)
	size += sizeof ("+0x") - 1 + 8;
    }

  s = *ret = (asymbol *) bfd_malloc (size);

> 
> diff --git a/gdb/elfread.c b/gdb/elfread.c
> index ece704ca7c..5ed8a6f957 100644
> --- a/gdb/elfread.c
> +++ b/gdb/elfread.c
> @@ -1144,6 +1144,9 @@ elf_read_minimal_symbols (struct objfile *objfile, int
> symfile_flags,
> 
>    if (symtab_create_debug)
>      fprintf_unfiltered (gdb_stdlog, "Done reading minimal symbols.\n");
> +  if (synthcount > 0)
> +    xfree ((char *) synthsyms->name);

We can't do this for some bfd targets.

> +  xfree (synthsyms);

We can only safely do this, but .name is leaked for x86_64.  Other
tools using bfd, like objdump, nm, and gprof may have this issue too.
I'll ask binutils people on asymbol allocation and de-allocation.
  
Alex Lindsay Aug. 11, 2017, 3:06 p.m. UTC | #2
Thanks for looking into it!

On 08/11/2017 04:27 AM, Yao Qi wrote:
> On 17-08-07 10:19:15, Alex Lindsay wrote:
>> We perform a couple of  dynamic allocations in
>> elf_x86_64_get_synthetic_symtab (elf64-x86-64.c):
>>
>>    s = *ret = (asymbol *) bfd_zmalloc (size);
>>
>>    names = (char *) bfd_malloc (size);
>>
>> that appear to never get freed. My patch for this:
> Good catch!  It is more complicated that other bfd targets allocate
> memory for asymbol in a different way as if asymbol.name is defined
> as an zero-length array, so memory allocated for both asymbol and .name
> in one bfd_malloc call, like,
>
> 	  sym = *r->sym_ptr_ptr;
> 	  if (!sym_exists_at (syms, opdsymend, symcount,
> 			      sym->section->id, sym->value + r->addend))
> 	    {
> 	      ++count;
> 	      size += sizeof (asymbol);
> 	      size += strlen (syms[i]->name) + 2;
> 	    }
> 	}
>
>        if (size == 0)
> 	goto done;
>        s = *ret = bfd_malloc (size);
>
> or
>
>    size = count * sizeof (asymbol);
>    p = relplt->relocation;
>    for (i = 0; i < count; i++, p += elf32_arm_size_info.int_rels_per_ext_rel)
>      {
>        size += strlen ((*p->sym_ptr_ptr)->name) + sizeof ("@plt");
>        if (p->addend != 0)
> 	size += sizeof ("+0x") - 1 + 8;
>      }
>
>    s = *ret = (asymbol *) bfd_malloc (size);
>
>> diff --git a/gdb/elfread.c b/gdb/elfread.c
>> index ece704ca7c..5ed8a6f957 100644
>> --- a/gdb/elfread.c
>> +++ b/gdb/elfread.c
>> @@ -1144,6 +1144,9 @@ elf_read_minimal_symbols (struct objfile *objfile, int
>> symfile_flags,
>>
>>     if (symtab_create_debug)
>>       fprintf_unfiltered (gdb_stdlog, "Done reading minimal symbols.\n");
>> +  if (synthcount > 0)
>> +    xfree ((char *) synthsyms->name);
> We can't do this for some bfd targets.
>
>> +  xfree (synthsyms);
> We can only safely do this, but .name is leaked for x86_64.  Other
> tools using bfd, like objdump, nm, and gprof may have this issue too.
> I'll ask binutils people on asymbol allocation and de-allocation.
>
  
H.J. Lu Aug. 11, 2017, 3:30 p.m. UTC | #3
On Fri, Aug 11, 2017 at 2:27 AM, Yao Qi <qiyaoltc@gmail.com> wrote:
> On 17-08-07 10:19:15, Alex Lindsay wrote:
>>
>> We perform a couple of  dynamic allocations in
>> elf_x86_64_get_synthetic_symtab (elf64-x86-64.c):
>>
>>   s = *ret = (asymbol *) bfd_zmalloc (size);
>>
>>   names = (char *) bfd_malloc (size);
>>
>> that appear to never get freed. My patch for this:
>
> Good catch!  It is more complicated that other bfd targets allocate
> memory for asymbol in a different way as if asymbol.name is defined
> as an zero-length array, so memory allocated for both asymbol and .name
> in one bfd_malloc call, like,
>
>           sym = *r->sym_ptr_ptr;
>           if (!sym_exists_at (syms, opdsymend, symcount,
>                               sym->section->id, sym->value + r->addend))
>             {
>               ++count;
>               size += sizeof (asymbol);
>               size += strlen (syms[i]->name) + 2;
>             }
>         }
>
>       if (size == 0)
>         goto done;
>       s = *ret = bfd_malloc (size);
>
> or
>
>   size = count * sizeof (asymbol);
>   p = relplt->relocation;
>   for (i = 0; i < count; i++, p += elf32_arm_size_info.int_rels_per_ext_rel)
>     {
>       size += strlen ((*p->sym_ptr_ptr)->name) + sizeof ("@plt");
>       if (p->addend != 0)
>         size += sizeof ("+0x") - 1 + 8;
>     }
>
>   s = *ret = (asymbol *) bfd_malloc (size);
>
>>
>> diff --git a/gdb/elfread.c b/gdb/elfread.c
>> index ece704ca7c..5ed8a6f957 100644
>> --- a/gdb/elfread.c
>> +++ b/gdb/elfread.c
>> @@ -1144,6 +1144,9 @@ elf_read_minimal_symbols (struct objfile *objfile, int
>> symfile_flags,
>>
>>    if (symtab_create_debug)
>>      fprintf_unfiltered (gdb_stdlog, "Done reading minimal symbols.\n");
>> +  if (synthcount > 0)
>> +    xfree ((char *) synthsyms->name);
>
> We can't do this for some bfd targets.
>
>> +  xfree (synthsyms);
>
> We can only safely do this, but .name is leaked for x86_64.  Other
> tools using bfd, like objdump, nm, and gprof may have this issue too.
> I'll ask binutils people on asymbol allocation and de-allocation.
>

This is:

https://sourceware.org/bugzilla/show_bug.cgi?id=21943

i386 and x86-64 get_synthetic_symtab don't know if @plt should
be added to symbol name for a PLT entry.  The first pass checks
if @plt is needed and extra space is allocated in the second pass.
We can assume @plt is needed and waste some space if it isn't.
  
Yao Qi Aug. 11, 2017, 3:45 p.m. UTC | #4
On 17-08-11 08:30:21, H.J. Lu wrote:
> >
> > We can only safely do this, but .name is leaked for x86_64.  Other
> > tools using bfd, like objdump, nm, and gprof may have this issue too.
> > I'll ask binutils people on asymbol allocation and de-allocation.
> >
> 
> This is:
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=21943
> 

I opened it :)

> i386 and x86-64 get_synthetic_symtab don't know if @plt should
> be added to symbol name for a PLT entry.  The first pass checks
> if @plt is needed and extra space is allocated in the second pass.
> We can assume @plt is needed and waste some space if it isn't.

Do you plan to fix it?
  
H.J. Lu Aug. 11, 2017, 4:44 p.m. UTC | #5
On Fri, Aug 11, 2017 at 8:45 AM, Yao Qi <qiyaoltc@gmail.com> wrote:
> On 17-08-11 08:30:21, H.J. Lu wrote:
>> >
>> > We can only safely do this, but .name is leaked for x86_64.  Other
>> > tools using bfd, like objdump, nm, and gprof may have this issue too.
>> > I'll ask binutils people on asymbol allocation and de-allocation.
>> >
>>
>> This is:
>>
>> https://sourceware.org/bugzilla/show_bug.cgi?id=21943
>>
>
> I opened it :)
>
>> i386 and x86-64 get_synthetic_symtab don't know if @plt should
>> be added to symbol name for a PLT entry.  The first pass checks
>> if @plt is needed and extra space is allocated in the second pass.
>> We can assume @plt is needed and waste some space if it isn't.
>
> Do you plan to fix it?
>

Done.
  
Alex Lindsay Aug. 11, 2017, 9:20 p.m. UTC | #6
I can verify that the objdump example is fixed in HEAD, but I still get 
this leak with `valgrind --leak-check=full --show-leak-kinds=definite 
gdb ./hello`:

==18127== 300,438 bytes in 5 blocks are definitely lost in loss record 
11,404 of 11,407
==18127==    at 0x4C2DE31: malloc (vg_replace_malloc.c:299)
==18127==    by 0x62F747: bfd_malloc (libbfd.c:193)
==18127==    by 0x62F941: bfd_zmalloc (libbfd.c:278)
==18127==    by 0x649E14: elf_x86_64_get_synthetic_symtab 
(elf64-x86-64.c:6835)
==18127==    by 0x2F1B9E: elf_read_minimal_symbols(objfile*, int, 
elfinfo const*) (elfread.c:1124)
==18127==    by 0x2F1D7C: elf_symfile_read(objfile*, 
enum_flags<symfile_add_flag>) (elfread.c:1182)
==18127==    by 0x563738: read_symbols(objfile*, 
enum_flags<symfile_add_flag>) (symfile.c:861)
==18127==    by 0x563E55: syms_from_objfile_1(objfile*, 
section_addr_info*, enum_flags<symfile_add_flag>) (symfile.c:1062)
==18127==    by 0x563EAC: syms_from_objfile(objfile*, 
section_addr_info*, enum_flags<symfile_add_flag>) (symfile.c:1078)
==18127==    by 0x5641F7: symbol_file_add_with_addrs(bfd*, char const*, 
enum_flags<symfile_add_flag>, section_addr_info*, 
enum_flags<objfile_flag>, objfile*) (symfile.c:1177)
==18127==    by 0x5644C1: symbol_file_add_from_bfd(bfd*, char const*, 
enum_flags<symfile_add_flag>, section_addr_info*, 
enum_flags<objfile_flag>, objfile*) (symfile.c:1268)
==18127==    by 0x547B32: solib_read_symbols(so_list*, 
enum_flags<symfile_add_flag>) (solib.c:707)

On 08/11/2017 11:44 AM, H.J. Lu wrote:
> On Fri, Aug 11, 2017 at 8:45 AM, Yao Qi <qiyaoltc@gmail.com> wrote:
>> On 17-08-11 08:30:21, H.J. Lu wrote:
>>>> We can only safely do this, but .name is leaked for x86_64.  Other
>>>> tools using bfd, like objdump, nm, and gprof may have this issue too.
>>>> I'll ask binutils people on asymbol allocation and de-allocation.
>>>>
>>> This is:
>>>
>>> https://sourceware.org/bugzilla/show_bug.cgi?id=21943
>>>
>> I opened it :)
>>
>>> i386 and x86-64 get_synthetic_symtab don't know if @plt should
>>> be added to symbol name for a PLT entry.  The first pass checks
>>> if @plt is needed and extra space is allocated in the second pass.
>>> We can assume @plt is needed and waste some space if it isn't.
>> Do you plan to fix it?
>>
> Done.
>
  

Patch

diff --git a/gdb/elfread.c b/gdb/elfread.c
index ece704ca7c..5ed8a6f957 100644
--- a/gdb/elfread.c
+++ b/gdb/elfread.c
@@ -1144,6 +1144,9 @@  elf_read_minimal_symbols (struct objfile *objfile, 
int symfile_flags,

    if (symtab_create_debug)
      fprintf_unfiltered (gdb_stdlog, "Done reading minimal symbols.\n");
+  if (synthcount > 0)
+    xfree ((char *) synthsyms->name);
+  xfree (synthsyms);
  }