[RFAv2] Fix buffer overflow regression due to minsym malloc-ed instead of obstack-ed.

Message ID 87h8brj7ie.fsf@tromey.com
State New, archived

Commit Message

Tom Tromey March 25, 2019, 3:31 p.m. UTC
  >>>>> "Philippe" == Philippe Waroquiers <philippe.waroquiers@skynet.be> writes:

Philippe> Before this commit, the array of 'struct minimal_symbol'
Philippe> contained a last element that was a "null symbol".  The comment in
Philippe> minimal_symbol_reader::install was:

Sorry about this.

Philippe> Note that a bunch of comments in minimal_symbol_reader::install
Philippe> are still referring to allocations being done in obstack.  These
Philippe> comments seem obsolete.  I have not fixed them, as I have not
Philippe> understood what they are explaining (e.g. related to language
Philippe> auto, demangling, etc : I have not seen where all this is done).

The comment about language_auto is mildly incorrect, and I think
probably has been for quite some time.

There are some other incorrect comments in there.  I'll send a patch.

Philippe> +  int n_after_msymbol = minsym.objfile->per_bfd->minimal_symbol_count
Philippe> +    - (msymbol - minsym.objfile->per_bfd->msymbols.get ())
Philippe> +    - 1;

What do you think of the appended instead?
The idea is to make the last element more explicit.

Tom
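
The quoted `n_after_msymbol` expression counts the array elements that follow `msymbol`. A minimal stand-alone sketch of that arithmetic follows, next to the equivalent form based on a one-past-the-end pointer. The `int` array and both function names are hypothetical stand-ins, not GDB code.

```cpp
#include <cassert>
#include <cstddef>

// Number of elements that follow ELEM in the array [BASE, BASE + COUNT).
// Mirrors the shape of the quoted expression: count - index - 1.
std::ptrdiff_t
n_after (const int *base, std::ptrdiff_t count, const int *elem)
{
  std::ptrdiff_t index = elem - base;
  assert (index >= 0 && index < count);
  return count - index - 1;
}

// The same quantity computed from a one-past-the-end pointer.
std::ptrdiff_t
n_after_from_end (const int *base, std::ptrdiff_t count, const int *elem)
{
  const int *end = base + count;   // one past the final element
  return end - (elem + 1);
}
```

Both forms agree for any element inside the array. The pointer form avoids carrying a separate counter through the loop.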
  

Comments

Philippe Waroquiers March 25, 2019, 7:54 p.m. UTC | #1
On Mon, 2019-03-25 at 09:31 -0600, Tom Tromey wrote:
> Philippe> +  int n_after_msymbol = minsym.objfile->per_bfd->minimal_symbol_count
> Philippe> +    - (msymbol - minsym.objfile->per_bfd->msymbols.get ())
> Philippe> +    - 1;
> 
> What do you think of the appended instead?
> The idea is to make the last element more explicit.
Yes, that looks better, 2 minor comments below.
Thanks
Philippe
> 
> Tom
> 
> diff --git a/gdb/minsyms.c b/gdb/minsyms.c
> index b95e9ef6e8b..03743e3062b 100644
> --- a/gdb/minsyms.c
> +++ b/gdb/minsyms.c
> @@ -1480,11 +1480,10 @@ find_solib_trampoline_target (struct frame_info *frame, CORE_ADDR pc)
>  CORE_ADDR
>  minimal_symbol_upper_bound (struct bound_minimal_symbol minsym)
>  {
> -  int i;
>    short section;
>    struct obj_section *obj_section;
>    CORE_ADDR result;
> -  struct minimal_symbol *msymbol;
> +  struct minimal_symbol *iter, *msymbol;
>  
>    gdb_assert (minsym.minsym != NULL);
>  
> @@ -1499,21 +1498,24 @@ minimal_symbol_upper_bound (struct bound_minimal_symbol minsym)
>       other sections, to find the next symbol in this section with a
>       different address.  */
>  
> +  struct minimal_symbol *last
> +    = (minsym.objfile->per_bfd->msymbols.get ()
> +       + minsym.objfile->per_bfd->minimal_symbol_count);
Are the parentheses needed here?

Also, I find the name 'last' a little confusing,
because the loop below never handles 'last' itself.
Maybe last could be the 'real' last, i.e.:
   minsym.objfile->per_bfd->msymbols.get ()
    + minsym.objfile->per_bfd->minimal_symbol_count - 1;

and have the '< last' conditions below then be '<= last'.

That makes it clearer to me that we handle the last
element of the array.
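
To compare the two bound conventions discussed here, the following sketch walks the elements after a starting index once with a one-past-the-end bound and `<`, and once with an inclusive bound and `<=`. The `int` array and the function names are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Sums the elements after START using a one-past-the-end bound and '<'.
long
sum_after_exclusive (const int *base, std::size_t count, std::size_t start)
{
  assert (start < count);
  const int *end = base + count;          // not dereferenceable
  long sum = 0;
  for (const int *iter = base + start + 1; iter < end; ++iter)
    sum += *iter;
  return sum;
}

// Sums the same elements using an inclusive bound and '<='.
long
sum_after_inclusive (const int *base, std::size_t count, std::size_t start)
{
  assert (start < count);                 // implies count >= 1
  const int *last = base + count - 1;     // the real final element
  long sum = 0;
  for (const int *iter = base + start + 1; iter <= last; ++iter)
    sum += *iter;
  return sum;
}
```

Both loops visit the same elements. One difference is that the inclusive form computes `base + count - 1`, which is not a valid pointer when `count` is zero, so it relies on the array being non-empty. The one-past-the-end form has no such precondition.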


>    msymbol = minsym.minsym;
>    section = MSYMBOL_SECTION (msymbol);
> -  for (i = 1; MSYMBOL_LINKAGE_NAME (msymbol + i) != NULL; i++)
> +  for (iter = msymbol + 1; iter < last; ++iter)
>      {
> -      if ((MSYMBOL_VALUE_RAW_ADDRESS (msymbol + i)
> +      if ((MSYMBOL_VALUE_RAW_ADDRESS (iter)
>  	   != MSYMBOL_VALUE_RAW_ADDRESS (msymbol))
> -	  && MSYMBOL_SECTION (msymbol + i) == section)
> +	  && MSYMBOL_SECTION (iter) == section)
>  	break;
>      }
>  
>    obj_section = MSYMBOL_OBJ_SECTION (minsym.objfile, minsym.minsym);
> -  if (MSYMBOL_LINKAGE_NAME (msymbol + i) != NULL
> -      && (MSYMBOL_VALUE_ADDRESS (minsym.objfile, msymbol + i)
> +  if (iter < last
> +      && (MSYMBOL_VALUE_ADDRESS (minsym.objfile, iter)
>  	  < obj_section_endaddr (obj_section)))
> -    result = MSYMBOL_VALUE_ADDRESS (minsym.objfile, msymbol + i);
> +    result = MSYMBOL_VALUE_ADDRESS (minsym.objfile, iter);
>    else
>      /* We got the start address from the last msymbol in the objfile.
>         So the end address is the end of the section.  */
  
Simon Marchi March 26, 2019, 6:46 p.m. UTC | #2
On 2019-03-25 3:54 p.m., Philippe Waroquiers wrote:
> On Mon, 2019-03-25 at 09:31 -0600, Tom Tromey wrote:
>> Philippe> +  int n_after_msymbol = minsym.objfile->per_bfd->minimal_symbol_count
>> Philippe> +    - (msymbol - minsym.objfile->per_bfd->msymbols.get ())
>> Philippe> +    - 1;
>>
>> What do you think of the appended instead?
>> The idea is to make the last element more explicit.
> Yes, that looks better, 2 minor comments below.

I just wanted to mention that I just hit this bug, and that Tom's patch fixes it for me.

>> @@ -1499,21 +1498,24 @@ minimal_symbol_upper_bound (struct bound_minimal_symbol minsym)
>>       other sections, to find the next symbol in this section with a
>>       different address.  */
>>  
>> +  struct minimal_symbol *last
>> +    = (minsym.objfile->per_bfd->msymbols.get ()
>> +       + minsym.objfile->per_bfd->minimal_symbol_count);
> Are the parentheses needed here?

It is mentioned here, search for "extra paren":

https://www.gnu.org/prep/standards/html_node/Formatting.html#Formatting

It's just there to please people who use Emacs :).

> Also, I find the name 'last' a little confusing,
> because the loop below never handles 'last' itself.
> Maybe last could be the 'real' last, i.e.:
>    minsym.objfile->per_bfd->msymbols.get ()
>     + minsym.objfile->per_bfd->minimal_symbol_count - 1;
> 
> and have the '< last' conditions below then be '<= last'.
> 
> That makes it clearer to me that we handle the last
> element of the array.

This, or name the variable "past_the_end" or something like that.

Simon
  
Tom Tromey March 26, 2019, 7:20 p.m. UTC | #3
>>>>> "Simon" == Simon Marchi <simark@simark.ca> writes:

>> and have the '< last' conditions below then be '<= last'.
>> 
>> That makes it clearer to me that we handle the last
>> element of the array.

Simon> This, or name the variable "past_the_end" or something like that.

Perhaps I'll use past_the_end and then use !=, since that seems to be
the C++ iterator style.

Tom
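
A rough sketch of what the `past_the_end` plus `!=` variant could look like, written against stand-in types so it compiles on its own. `Sym`, `upper_bound_sketch`, and the `section_end` parameter are hypothetical. The real function uses `struct minimal_symbol`, the `MSYMBOL_*` accessors, and `obj_section_endaddr`, and the committed patch may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for struct minimal_symbol.
struct Sym
{
  std::uint64_t address;
  short section;
};

// Returns the address of the next symbol after SYM that is in the same
// section and has a different address, provided that address is below
// SECTION_END.  Otherwise returns SECTION_END.
std::uint64_t
upper_bound_sketch (const Sym *syms, std::size_t count, const Sym *sym,
                    std::uint64_t section_end)
{
  assert (sym >= syms && sym < syms + count);

  const Sym *past_the_end = syms + count;
  const Sym *iter;
  for (iter = sym + 1; iter != past_the_end; ++iter)
    {
      if (iter->address != sym->address && iter->section == sym->section)
        break;
    }

  if (iter != past_the_end && iter->address < section_end)
    return iter->address;

  // No later symbol qualified, so the bound is the end of the section.
  return section_end;
}
```

Using `!=` is safe here because `iter` starts inside the array and advances one element at a time, so it cannot step over `past_the_end`.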
  

Patch

diff --git a/gdb/minsyms.c b/gdb/minsyms.c
index b95e9ef6e8b..03743e3062b 100644
--- a/gdb/minsyms.c
+++ b/gdb/minsyms.c
@@ -1480,11 +1480,10 @@  find_solib_trampoline_target (struct frame_info *frame, CORE_ADDR pc)
 CORE_ADDR
 minimal_symbol_upper_bound (struct bound_minimal_symbol minsym)
 {
-  int i;
   short section;
   struct obj_section *obj_section;
   CORE_ADDR result;
-  struct minimal_symbol *msymbol;
+  struct minimal_symbol *iter, *msymbol;
 
   gdb_assert (minsym.minsym != NULL);
 
@@ -1499,21 +1498,24 @@  minimal_symbol_upper_bound (struct bound_minimal_symbol minsym)
      other sections, to find the next symbol in this section with a
      different address.  */
 
+  struct minimal_symbol *last
+    = (minsym.objfile->per_bfd->msymbols.get ()
+       + minsym.objfile->per_bfd->minimal_symbol_count);
   msymbol = minsym.minsym;
   section = MSYMBOL_SECTION (msymbol);
-  for (i = 1; MSYMBOL_LINKAGE_NAME (msymbol + i) != NULL; i++)
+  for (iter = msymbol + 1; iter < last; ++iter)
     {
-      if ((MSYMBOL_VALUE_RAW_ADDRESS (msymbol + i)
+      if ((MSYMBOL_VALUE_RAW_ADDRESS (iter)
 	   != MSYMBOL_VALUE_RAW_ADDRESS (msymbol))
-	  && MSYMBOL_SECTION (msymbol + i) == section)
+	  && MSYMBOL_SECTION (iter) == section)
 	break;
     }
 
   obj_section = MSYMBOL_OBJ_SECTION (minsym.objfile, minsym.minsym);
-  if (MSYMBOL_LINKAGE_NAME (msymbol + i) != NULL
-      && (MSYMBOL_VALUE_ADDRESS (minsym.objfile, msymbol + i)
+  if (iter < last
+      && (MSYMBOL_VALUE_ADDRESS (minsym.objfile, iter)
 	  < obj_section_endaddr (obj_section)))
-    result = MSYMBOL_VALUE_ADDRESS (minsym.objfile, msymbol + i);
+    result = MSYMBOL_VALUE_ADDRESS (minsym.objfile, iter);
   else
     /* We got the start address from the last msymbol in the objfile.
        So the end address is the end of the section.  */