Fix infinite recursion bug at get_msymbol_address.

Message ID 20191107040541.208168-1-tamur@google.com
State New, archived
Headers

Commit Message

Terekhov, Mikhail via Gdb-patches Nov. 7, 2019, 4:05 a.m. UTC
  The patch 4b610737f0 seems to have introduced the possibility of infinite
recursion. I have encountered the problem while debugging a failing in-house
test. I am sorry, it is fairly difficult to reduce the test case (and I don't
understand most of what is going on) but the stack trace shows a call to
objfpy_add_separate_debug_file, which eventually causes
lookup_minimal_symbol_by_pc_name to be invoked, which calls get_msymbol_address.
Somehow lookup_minimal_symbol_linkage finds the same symbol and the function
calls itself with the same parameters. I don't know whether this should be
classified as 'it should never happen', but this simple patch makes the test
pass and should be harmless, I think.

gdb/ChangeLog

	* symtab.c (get_msymbol_address): Guard against infinite recursion.
---
 gdb/symtab.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Terekhov, Mikhail via Gdb-patches Nov. 13, 2019, 6:34 p.m. UTC | #1
Friendly ping?

On Wed, Nov 6, 2019 at 8:05 PM Ali Tamur <tamur@google.com> wrote:

> The patch 4b610737f0 seems to have introduced the possibility of infinite
> recursion. I have encountered the problem while debugging a failing
> in-house
> test. I am sorry, it is fairly difficult to reduce the test case (and I
> don't
> understand most of what is going on) but the stack trace shows a call to
> objfpy_add_separate_debug_file, which eventually causes
> lookup_minimal_symbol_by_pc_name to be invoked, which calls
> get_msymbol_address.
> Somehow lookup_minimal_symbol_linkage finds the same symbol and the
> function
> calls itself with the same parameters. I don't know whether this should be
> classified as 'it should never happen', but this simple patch makes the
> test
> pass and should be harmless, I think.
>
> gdb/ChangeLog
>
>         * symtab.c (get_msymbol_address): Guard against infinite recursion.
> ---
>  gdb/symtab.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gdb/symtab.c b/gdb/symtab.c
> index 2c934b9c22..b231cc6e84 100644
> --- a/gdb/symtab.c
> +++ b/gdb/symtab.c
> @@ -6328,7 +6328,7 @@ get_msymbol_address (struct objfile *objf, const
> struct minimal_symbol *minsym)
>         {
>           bound_minimal_symbol found
>             = lookup_minimal_symbol_linkage (linkage_name, objfile);
> -         if (found.minsym != nullptr)
> +         if (found.minsym != nullptr && found.minsym != minsym)
>             return BMSYMBOL_VALUE_ADDRESS (found);
>         }
>      }
> --
> 2.24.0.rc1.363.gb1bccd3e3d-goog
>
>
  
Terekhov, Mikhail via Gdb-patches Nov. 19, 2019, 3:46 a.m. UTC | #2
If there are no objections, I am planning to check this in a couple days.
Thank you.
Ali

On Wed, Nov 13, 2019 at 10:34 AM Ali Tamur <tamur@google.com> wrote:

> Friendly ping?
>
> On Wed, Nov 6, 2019 at 8:05 PM Ali Tamur <tamur@google.com> wrote:
>
>> The patch 4b610737f0 seems to have introduced the possibility of infinite
>> recursion. I have encountered the problem while debugging a failing
>> in-house
>> test. I am sorry, it is fairly difficult to reduce the test case (and I
>> don't
>> understand most of what is going on) but the stack trace shows a call to
>> objfpy_add_separate_debug_file, which eventually causes
>> lookup_minimal_symbol_by_pc_name to be invoked, which calls
>> get_msymbol_address.
>> Somehow lookup_minimal_symbol_linkage finds the same symbol and the
>> function
>> calls itself with the same parameters. I don't know whether this should be
>> classified as 'it should never happen', but this simple patch makes the
>> test
>> pass and should be harmless, I think.
>>
>> gdb/ChangeLog
>>
>>         * symtab.c (get_msymbol_address): Guard against infinite
>> recursion.
>> ---
>>  gdb/symtab.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/gdb/symtab.c b/gdb/symtab.c
>> index 2c934b9c22..b231cc6e84 100644
>> --- a/gdb/symtab.c
>> +++ b/gdb/symtab.c
>> @@ -6328,7 +6328,7 @@ get_msymbol_address (struct objfile *objf, const
>> struct minimal_symbol *minsym)
>>         {
>>           bound_minimal_symbol found
>>             = lookup_minimal_symbol_linkage (linkage_name, objfile);
>> -         if (found.minsym != nullptr)
>> +         if (found.minsym != nullptr && found.minsym != minsym)
>>             return BMSYMBOL_VALUE_ADDRESS (found);
>>         }
>>      }
>> --
>> 2.24.0.rc1.363.gb1bccd3e3d-goog
>>
>>
  
Simon Marchi Nov. 19, 2019, 5:15 a.m. UTC | #3
On 2019-11-06 11:05 p.m., Ali Tamur via gdb-patches wrote:
> The patch 4b610737f0 seems to have introduced the possibility of infinite
> recursion. I have encountered the problem while debugging a failing in-house
> test. I am sorry, it is fairly difficult to reduce the test case (and I don't
> understand most of what is going on) but the stack trace shows a call to
> objfpy_add_separate_debug_file, which eventually causes
> lookup_minimal_symbol_by_pc_name to be invoked, which calls get_msymbol_address.
> Somehow lookup_minimal_symbol_linkage finds the same symbol and the function
> calls itself with the same parameters. I don't know whether this should be
> classified as 'it should never happen', but this simple patch makes the test
> pass and should be harmless, I think.
> 
> gdb/ChangeLog
> 
> 	* symtab.c (get_msymbol_address): Guard against infinite recursion.
> ---
>  gdb/symtab.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gdb/symtab.c b/gdb/symtab.c
> index 2c934b9c22..b231cc6e84 100644
> --- a/gdb/symtab.c
> +++ b/gdb/symtab.c
> @@ -6328,7 +6328,7 @@ get_msymbol_address (struct objfile *objf, const struct minimal_symbol *minsym)
>  	{
>  	  bound_minimal_symbol found
>  	    = lookup_minimal_symbol_linkage (linkage_name, objfile);
> -	  if (found.minsym != nullptr)
> +	  if (found.minsym != nullptr && found.minsym != minsym)
>  	    return BMSYMBOL_VALUE_ADDRESS (found);
>  	}
>      }

Hi Ali,

At the top of the function, we assert that the objfile associated to the minsym
is not the main one:

  gdb_assert ((objf->flags & OBJF_MAINLINE) == 0);

We then iterate on all objfiles, looking only for main objfiles (presumably
there's only one).  We look up a minsym with the same name as our original minsym in
the main objfile.

How could our original minsym (which is not supposed to come from the main objfile,
I believe) could be the same one as the minsym found when looking up in the main
objfile?

If this is indeed not supposed to happen, I think a fix like this would
just paper over the real problem.  Or maybe you have a legitimate use case that
wasn't expected.  You've talked about objfpy_add_separate_debug_file, so clearly your
use case involves separate debug files.  It's possible that we are passing the objfile
representing the separate debug file of the main objfile or something like that.

I'd like if we could get a reproducer before we commit a fix we are not sure to
understand.  Can you give the backtrace, maybe we'll get some idea?

Also, could you provide as much info about your use case as possible to try to guide
us in finding the root issue?

Simon
  
Terekhov, Mikhail via Gdb-patches Nov. 20, 2019, 12:02 a.m. UTC | #4
Hi Simon,

We have some hooks into gdb's "new objfile" event, where we optionally load
related debug files from somewhere else. The failing test involves these
hooks and reducing it further is proving difficult. I put a breakpoint just
before entering the loop:

CORE_ADDR

get_msymbol_address (struct objfile *objf, const struct minimal_symbol

*minsym) {

  gdb_assert (minsym->maybe_copied);

  gdb_assert ((objf->flags & OBJF_MAINLINE) == 0);

  const char *linkage_name = MSYMBOL_LINKAGE_NAME (minsym);

  for (objfile *objfile : current_program_space->objfiles ()) {

    if ((objfile->flags & OBJF_MAINLINE) != 0) {

      bound_minimal_symbol found = lookup_minimal_symbol_linkage

(linkage_name, objfile);

<br>  if (found.minsym != nullptr) return BMSYMBOL_VALUE_ADDRESS (found);

    }

  }

  return (minsym->value.address + ANOFFSET (objf->section_offsets,

minsym->section));

}

(top-gdb) c

(top-gdb) p objf

$12 = (objfile *) 0x1a5dae0

(top-gdb) p minsym

$13 = (const minimal_symbol *) 0x7ffff70fff88

(top-gdb) p objf->flags

$14 = {m_enum_value = (OBJF_REORDERED | OBJF_USERLOADED |

OBJF_PSYMTABS_READ)}

(top-gdb) p objfile

$15 = (objfile *) 0x1928000

(top-gdb) p objfile->flags

$16 = {m_enum_value = (OBJF_REORDERED | OBJF_USERLOADED |

OBJF_PSYMTABS_READ | OBJF_MAINLINE)}

(top-gdb) p found

$17 = {minsym = 0x7ffff70fff88, objfile = 0x1a5dae0}

So, objfile and objf are different objects. We call
lookup_minimal_symbol_linkage(linkage_name, objfile), it iterates through
objfile->separate_debug_objfiles() and returns the original objf and minsym.

The stack trace is like this:

#0  get_msymbol_address (objf=0x1a5dae0, minsym=0x7ffff70fff88) at

gdb/symtab.c:6306

#1  0x0000000000c98b1e in lookup_minimal_symbol_by_pc_name (pc=0x10d120,

name=0x1c86990 "main(int, char**)", objf=0x1a5dae0) at gdb/minsyms.c:613

#2  0x0000000000e090d8 in fixup_section (ginfo=0x1d42af0, addr=0x10d120,

objfile=0x1a5dae0) at gdb/symtab.c:1649

#3  0x0000000000e09577 in fixup_symbol_section (sym=0x1d42af0,

objfile=0x1a5dae0) at gdb/symtab.c:1759

#4  0x0000000000e16401 in lookup_symbol_via_quick_fns (objfile=0x1a5dae0,

block_index=GLOBAL_BLOCK,

    name=0x1f995b0 "main", domain=VAR_DOMAIN) at gdb/symtab.c:2408

#5  0x0000000000e0ab30 in lookup_symbol_in_objfile (objfile=0x1a5dae0,

block_index=GLOBAL_BLOCK,

    name=0x1f995b0 "main", domain=VAR_DOMAIN) at gdb/symtab.c:2530

I don't know whether what we are doing breaks too many assumptions:

* already have a debug file and load an equivalent one again?

* have two debug files that define the same symbol?

* something else?

but my patch with a simple guard against infinite recursion seems to work
well enough.

Thanks,

Ali


On Mon, Nov 18, 2019 at 9:15 PM Simon Marchi <simark@simark.ca> wrote:

> On 2019-11-06 11:05 p.m., Ali Tamur via gdb-patches wrote:
> > The patch 4b610737f0 seems to have introduced the possibility of infinite
> > recursion. I have encountered the problem while debugging a failing
> in-house
> > test. I am sorry, it is fairly difficult to reduce the test case (and I
> don't
> > understand most of what is going on) but the stack trace shows a call to
> > objfpy_add_separate_debug_file, which eventually causes
> > lookup_minimal_symbol_by_pc_name to be invoked, which calls
> get_msymbol_address.
> > Somehow lookup_minimal_symbol_linkage finds the same symbol and the
> function
> > calls itself with the same parameters. I don't know whether this should
> be
> > classified as 'it should never happen', but this simple patch makes the
> test
> > pass and should be harmless, I think.
> >
> > gdb/ChangeLog
> >
> >       * symtab.c (get_msymbol_address): Guard against infinite recursion.
> > ---
> >  gdb/symtab.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/gdb/symtab.c b/gdb/symtab.c
> > index 2c934b9c22..b231cc6e84 100644
> > --- a/gdb/symtab.c
> > +++ b/gdb/symtab.c
> > @@ -6328,7 +6328,7 @@ get_msymbol_address (struct objfile *objf, const
> struct minimal_symbol *minsym)
> >       {
> >         bound_minimal_symbol found
> >           = lookup_minimal_symbol_linkage (linkage_name, objfile);
> > -       if (found.minsym != nullptr)
> > +       if (found.minsym != nullptr && found.minsym != minsym)
> >           return BMSYMBOL_VALUE_ADDRESS (found);
> >       }
> >      }
>
> Hi Ali,
>
> At the top of the function, we assert that the objfile associated to the
> minsym
> is not the main one:
>
>   gdb_assert ((objf->flags & OBJF_MAINLINE) == 0);
>
> We then iterate on all objfiles, looking only for main objfiles (presumably
> there's only one).  We look up a minsym with the same name as our original
> minsym in
> the main objfile.
>
> How could our original minsym (which is not supposed to come from the main
> objfile,
> I believe) could be the same one as the minsym found when looking up in
> the main
> objfile?
>
> If this is indeed not supposed to happen, I think a fix like this would
> just paper over the real problem.  Or maybe you have a legitimate use case
> that
> wasn't expected.  You've talked about objfpy_add_separate_debug_file, so
> clearly your
> use case involves separate debug files.  It's possible that we are passing
> the objfile
> representing the separate debug file of the main objfile or something like
> that.
>
> I'd like if we could get a reproducer before we commit a fix we are not
> sure to
> understand.  Can you give the backtrace, maybe we'll get some idea?
>
> Also, could you provide as much info about your use case as possible to
> try to guide
> us in finding the root issue?
>
> Simon
>
  
Simon Marchi Nov. 20, 2019, 12:35 a.m. UTC | #5
On 2019-11-19 7:02 p.m., Ali Tamur via gdb-patches wrote:
> Hi Simon,
> 
> We have some hooks into gdb's "new objfile" event, where we optionally load
> related debug files from somewhere else.

Thanks, and does it happen for the "new objfile" event for the main objfile (the
main executable), or for the "new objfile" event for another objfile (a shared library)?

>The failing test involves these
> hooks and reducing it further is proving difficult. I put a breakpoint just
> before entering the loop:
> 
> CORE_ADDR
> 
> get_msymbol_address (struct objfile *objf, const struct minimal_symbol
> 
> *minsym) {
> 
>   gdb_assert (minsym->maybe_copied);
> 
>   gdb_assert ((objf->flags & OBJF_MAINLINE) == 0);
> 
>   const char *linkage_name = MSYMBOL_LINKAGE_NAME (minsym);
> 
>   for (objfile *objfile : current_program_space->objfiles ()) {
> 
>     if ((objfile->flags & OBJF_MAINLINE) != 0) {
> 
>       bound_minimal_symbol found = lookup_minimal_symbol_linkage
> 
> (linkage_name, objfile);
> 
> <br>  if (found.minsym != nullptr) return BMSYMBOL_VALUE_ADDRESS (found);
> 
>     }
> 
>   }
> 
>   return (minsym->value.address + ANOFFSET (objf->section_offsets,
> 
> minsym->section));
> 
> }
> 
> (top-gdb) c
> 
> (top-gdb) p objf
> 
> $12 = (objfile *) 0x1a5dae0
> 
> (top-gdb) p minsym
> 
> $13 = (const minimal_symbol *) 0x7ffff70fff88
> 
> (top-gdb) p objf->flags
> 
> $14 = {m_enum_value = (OBJF_REORDERED | OBJF_USERLOADED |
> 
> OBJF_PSYMTABS_READ)}

For this objfile, it would be interesting to print the objfile::separate_debug_objfile_backlink
to know if it represents a separate debug information objfile or not.  In fact, if you could
provide "print objf" as well as "print *objf", it would help.

> (top-gdb) p objfile
> 
> $15 = (objfile *) 0x1928000
> 
> (top-gdb) p objfile->flags
> 
> $16 = {m_enum_value = (OBJF_REORDERED | OBJF_USERLOADED |
> 
> OBJF_PSYMTABS_READ | OBJF_MAINLINE)}

Same thing for this objfile, both "print objfile" and "print *objfile".

> (top-gdb) p found
> 
> $17 = {minsym = 0x7ffff70fff88, objfile = 0x1a5dae0}



> 
> So, objfile and objf are different objects. We call
> lookup_minimal_symbol_linkage(linkage_name, objfile), it iterates through
> objfile->separate_debug_objfiles() and returns the original objf and minsym.
> 
> The stack trace is like this:
> 
> #0  get_msymbol_address (objf=0x1a5dae0, minsym=0x7ffff70fff88) at
> 
> gdb/symtab.c:6306
> 
> #1  0x0000000000c98b1e in lookup_minimal_symbol_by_pc_name (pc=0x10d120,
> 
> name=0x1c86990 "main(int, char**)", objf=0x1a5dae0) at gdb/minsyms.c:613
> 
> #2  0x0000000000e090d8 in fixup_section (ginfo=0x1d42af0, addr=0x10d120,
> 
> objfile=0x1a5dae0) at gdb/symtab.c:1649
> 
> #3  0x0000000000e09577 in fixup_symbol_section (sym=0x1d42af0,
> 
> objfile=0x1a5dae0) at gdb/symtab.c:1759
> 
> #4  0x0000000000e16401 in lookup_symbol_via_quick_fns (objfile=0x1a5dae0,
> 
> block_index=GLOBAL_BLOCK,
> 
>     name=0x1f995b0 "main", domain=VAR_DOMAIN) at gdb/symtab.c:2408
> 
> #5  0x0000000000e0ab30 in lookup_symbol_in_objfile (objfile=0x1a5dae0,
> 
> block_index=GLOBAL_BLOCK,
> 
>     name=0x1f995b0 "main", domain=VAR_DOMAIN) at gdb/symtab.c:2530

Could you provide the full stack trace please?  Ideally as an attached file, as your
email client seems to wrap the lines in a way that makes them hard to read.

> I don't know whether what we are doing breaks too many assumptions:
> 
> * already have a debug file and load an equivalent one again?
> 
> * have two debug files that define the same symbol?
> 
> * something else?

My intuition is that the two objfiles above are related, one would be the separate debug
file for the other.  If you provide the info I requested above we'll be able to tell.

> but my patch with a simple guard against infinite recursion seems to work
> well enough.

It gets rid of the symptoms of the problem, but we don't know if it gets rid of the
problem.  Maybe it does, but we need to understand what happens first.

Simon
  

Patch

diff --git a/gdb/symtab.c b/gdb/symtab.c
index 2c934b9c22..b231cc6e84 100644
--- a/gdb/symtab.c
+++ b/gdb/symtab.c
@@ -6328,7 +6328,7 @@  get_msymbol_address (struct objfile *objf, const struct minimal_symbol *minsym)
 	{
 	  bound_minimal_symbol found
 	    = lookup_minimal_symbol_linkage (linkage_name, objfile);
-	  if (found.minsym != nullptr)
+	  if (found.minsym != nullptr && found.minsym != minsym)
 	    return BMSYMBOL_VALUE_ADDRESS (found);
 	}
     }