[review] Avoid infinite recursion in find_pc_sect_line

Message ID gerrit.1574019611000.I595470be6ab5f61ca7e4e9e70c61a252c0deaeaa@gnutoolchain-gerrit.osci.io
State New, archived
Headers

Commit Message

Simon Marchi (Code Review) Nov. 17, 2019, 7:40 p.m. UTC
  Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/682
......................................................................

Avoid infinite recursion in find_pc_sect_line

This is a patch which has been in Fedora GDB for well over a decade.  It
was written by Jan Kratochvil, though I've made some minor changes to
the comment and warning message.

It was motivated by a customer reported bug (back in 2006!) which
showed infinite mutual recursion between find_pc_sect_line and
find_pc_line.  Here is a portion of the backtrace from the bug report:

    (gdb) bt
    #0  0x00000000004450a4 in lookup_minimal_symbol_by_pc_section (
	pc=251700325328, section=0x570f500) at gdb/minsyms.c:484
    #1  0x00000000004bbfb2 in find_pc_sect_line (pc=251700325328,
	section=0x570f500, notcurrent=0) at gdb/symtab.c:2057
    #2  0x00000000004bc480 in find_pc_line (pc=251700325328, notcurrent=0)
	at gdb/symtab.c:2232
    #3  0x00000000004bc1ff in find_pc_sect_line (pc=251700325328,
	section=0x570f500, notcurrent=0) at gdb/symtab.c:2081

    ...   (lots and lots of the same two functions with the same parameters)

    #1070 0x00000000004bc480 in find_pc_line (pc=251700325328, notcurrent=0)
	at gdb/symtab.c:2232
    #1071 0x00000000004bc1ff in find_pc_sect_line (pc=251700325328,
	section=0x570f500, notcurrent=0) at gdb/symtab.c:2081
    #1072 0x00000000004bc480 in find_pc_line (pc=251700325328, notcurrent=0)
	at gdb/symtab.c:2232
    #1073 0x00000000004bc1ff in find_pc_sect_line (pc=251700325328,
	section=0x570f500, notcurrent=0) at gdb/symtab.c:2081
    #1074 0x00000000004bc480 in find_pc_line (pc=251700325328, notcurrent=0)
	at gdb/symtab.c:2232
    #1075 0x00000000004bc1ff in find_pc_sect_line (pc=251696794399,
	section=0x59b0df8, notcurrent=0) at gdb/symtab.c:2081
    #1076 0x00000000004bc480 in find_pc_line (pc=251696794399, notcurrent=0)
	at gdb/symtab.c:2232
    #1077 0x000000000055550e in find_frame_sal (frame=0xb3f3e0, sal=0x7fff1d1a8200)
	at gdb/frame.c:1392
    #1078 0x00000000004d86fd in set_current_sal_from_frame (frame=0x1648, center=1)
	at gdb/stack.c:379
    #1079 0x00000000004cf137 in normal_stop () at gdb/infrun.c:3147
    ...

The test case was a large application.  Attempts were made to make a
small(er) test case, but those attempts were not successful.
Therefore, I cannot provide a new test for this patch.

That said, we ought to guard against recursively calling
find_pc_sect_line (via find_pc_line) with the identical PC value that
it had been called with.  Should this happen, infinite recursion (as
shown in the above backtrace) is the result.  This patch prevents
that from happening.

Also, it seems likely that if this case should happen, there is a bug
somewhere, possibly in the C library or in some other part of the
toolchain, so we print a warning message asking the user to file a bug
report.  We don't error out here because falling through may produce
a satisfactory result.

I spent some time looking at the surrounding code and commentary which
handle the case of PC being in a stub/trampoline.  It first appeared
in the public GDB repository in April, 1999.  The ChangeLog entry for
this commit is from 1998-12-31.  The relevant portion is:

	(find_pc_sect_line): Return correct information if pc is in import
	or export stub (trampoline).

What's remarkable about the overall ChangeLog entry is that it's over
2500+ lines long!  I believe that this was part of the infamous "HP
merge" (in which insufficient due diligence was given in accepting
a large batch of changes from an outside source).  In the years that
followed, much of this code was either significantly revised or
outright removed.

For this particular case, I'm grateful that extensive comments were
provided by "RT".  (I haven't been able to figure out who RT is/was.)
I've decided against attempting to revise this stub/trampoline handling
code any further than adding Jan's test which prevents an obvious case
of infinite recursion.

I've tested on Fedora 31, x86-64.  I see no regressions.  I've also
searched the logfile for the new message, but as expected, no message
was found (which is good).

gdb/ChangeLog:

	* symtab.c (find_pc_sect_line): Add check which prevents infinite
	recursion.

Change-Id: I595470be6ab5f61ca7e4e9e70c61a252c0deaeaa
---
M gdb/symtab.c
1 file changed, 10 insertions(+), 0 deletions(-)
  

Comments

Simon Marchi (Code Review) Nov. 27, 2019, 2:19 a.m. UTC | #1
Luis Machado has posted comments on this change.

Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/682
......................................................................


Patch Set 1:

Out of curiosity, is the example backtrace from a code using the patched version of GDB? How did we break out of the recursion and into lookup_minimal_symbol_by_pc_section?
  
Simon Marchi (Code Review) Nov. 27, 2019, 8:13 p.m. UTC | #2
Kevin Buettner has posted comments on this change.

Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/682
......................................................................


Patch Set 1:

> Patch Set 1:
> 
> Out of curiosity, is the example backtrace from a code using the patched version of GDB? How did we break out of the recursion and into lookup_minimal_symbol_by_pc_section?

I don't know the answers to either of these questions. The example backtrace was pasted from a very old customer bug report.
  
Simon Marchi (Code Review) Nov. 27, 2019, 11 p.m. UTC | #3
Simon Marchi has posted comments on this change.

Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/682
......................................................................


Patch Set 1:

I'm not 100% comfortable in merging fixes like this, in an area that's already very tricky, without reproducer.  I'm not strongly opposing either, but just sharing my opinion.

It just doesn't sound logical.  To reach this "else", we need to:

1. lookup a minsym for PC, that minsym is a trampoline.  It means that this trampoline minsym is the closest (smaller or equal) minsym from PC, all minsym types considered.
2. lookup a _text_ minsym by name, we find one that is exactly equal to pc

If that text minsym we have found in #2 is equal to PC, why did we not find it in #1?  The only way I see is for the two minsyms (the trampoline one and the text one) to have the same value.  The lookup function would have returned the trampoline one, because it can only return one symbol, so it has to choose, and the trampoline one happens to come before.  Then, we search specifically text symbols by name, we find the text one.  But that case would already handled with the existing

	else if (BMSYMBOL_VALUE_ADDRESS (mfunsym)
		 == BMSYMBOL_VALUE_ADDRESS (msymbol))

So the only explanation I can imagine is that there was another GDB bug, causing it to mess up the minsym creation or lookup.  And either:

- that bug was fixed, and this fix is no longer needed
- that bug still isn't fixed, in which case this fix just covers up the problem.

I think you are already conscious of this, since you added a warning to tell the user that this is not expected.  But if this is a condition we expect to never legitimately happen, I would suggest using an assertion instead:

	    gdb_assert (BMSYMBOL_VALUE_ADDRESS (mfunsym) != pc);
	    return find_pc_line (BMSYMBOL_VALUE_ADDRESS (mfunsym), 0);

If this ever happens again in the field, we have more chances to hear about it than if it's a warning that users can easily overlook.
  
Simon Marchi (Code Review) Dec. 4, 2019, 10:04 p.m. UTC | #4
Kevin Buettner has posted comments on this change.

Change URL: https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/682
......................................................................


Patch Set 1:

> Patch Set 1:
> 
> I'm not 100% comfortable in merging fixes like this, in an area that's already very tricky, without reproducer.  I'm not strongly opposing either, but just sharing my opinion.
> 
> It just doesn't sound logical.  To reach this "else", we need to:
> 
> 1. lookup a minsym for PC, that minsym is a trampoline.  It means that this trampoline minsym is the closest (smaller or equal) minsym from PC, all minsym types considered.
> 2. lookup a _text_ minsym by name, we find one that is exactly equal to pc
> 
> If that text minsym we have found in #2 is equal to PC, why did we not find it in #1?  The only way I see is for the two minsyms (the trampoline one and the text one) to have the same value.  The lookup function would have returned the trampoline one, because it can only return one symbol, so it has to choose, and the trampoline one happens to come before.  Then, we search specifically text symbols by name, we find the text one.  But that case would already handled with the existing
> 
> 	else if (BMSYMBOL_VALUE_ADDRESS (mfunsym)
> 		 == BMSYMBOL_VALUE_ADDRESS (msymbol))
> 
> So the only explanation I can imagine is that there was another GDB bug, causing it to mess up the minsym creation or lookup.  And either:
> 
> - that bug was fixed, and this fix is no longer needed
> - that bug still isn't fixed, in which case this fix just covers up the problem.
> 
> I think you are already conscious of this, since you added a warning to tell the user that this is not expected.  But if this is a condition we expect to never legitimately happen, I would suggest using an assertion instead:
> 
> 	    gdb_assert (BMSYMBOL_VALUE_ADDRESS (mfunsym) != pc);
> 	    return find_pc_line (BMSYMBOL_VALUE_ADDRESS (mfunsym), 0);
> 
> If this ever happens again in the field, we have more chances to hear about it than if it's a warning that users can easily overlook.

As a GDB user, I'd prefer that GDB do its best to provide a reasonable result. However, as I developer, I too, really want a bug report. I don't have a strong preference either way, so I went (mostly) with your suggestion, but the new patch calls internal_error instead of using an assert. This way, I could include a plea to file a bug report. (Though that may be obvious if you get an assertion failure.)

Anyway, the new patch can be found here:

https://www.sourceware.org/ml/gdb-patches/2019-12/msg00168.html

(I didn't use gerrit to post it.)
  

Patch

diff --git a/gdb/symtab.c b/gdb/symtab.c
index 0064800..8a01317 100644
--- a/gdb/symtab.c
+++ b/gdb/symtab.c
@@ -3142,6 +3142,16 @@ 
 	     SYMBOL_LINKAGE_NAME (msymbol)); */
 	  ;
 	/* fall through */
+	else if (BMSYMBOL_VALUE_ADDRESS (mfunsym) == pc)
+	  {
+	    /* Avoid infinite (mutual) recursion between
+	       find_pc_sect_line and find_pc_line.  */
+	    warning ("Could not dereference stub/trampoline %s (0x%s).  "
+	             "Please file a bug report.",
+	             MSYMBOL_LINKAGE_NAME (msymbol.minsym),
+		     paddress(target_gdbarch (), pc));
+	    /* Fall through.  */
+	  }
 	else
 	  return find_pc_line (BMSYMBOL_VALUE_ADDRESS (mfunsym), 0);
       }