[v2,2/8] Search global block from basic_lookup_symbol_nonlocal

Message ID 20190920192017.15293-3-tromey@adacore.com
State New, archived
Headers

Commit Message

Tom Tromey Sept. 20, 2019, 7:20 p.m. UTC
  From: Andrew Burgess <andrew.burgess@embecosm.com>

This changes lookup_global_symbol to look in the global block
of the passed-in block.  If no block was passed in, it reverts to the
previous behavior.

This change is needed to ensure that 'FILENAME'::NAME lookups work
properly.  As debugging Pedro's test case showed, this was not working
properly in the case where multiple identical names could be found
(the one situation where this feature is truly needed :-).

This also removes some old comments from basic_lookup_symbol_nonlocal
that no longer apply.

Note that the new test cases for this change will appear in a later
patch.  They are in gdb.base/print-file-var.exp.

gdb/ChangeLog
2019-08-27  Andrew Burgess  <andrew.burgess@embecosm.com>

	* symtab.c (lookup_global_symbol): Search global block.
---
 gdb/ChangeLog |  4 ++++
 gdb/symtab.c  | 41 +++++++++++++----------------------------
 2 files changed, 17 insertions(+), 28 deletions(-)
  

Comments

Terekhov, Mikhail via Gdb-patches Sept. 21, 2019, 4:31 a.m. UTC | #1
On Sat, Sep 21, 2019 at 4:20 AM Tom Tromey <tromey@adacore.com> wrote:
> From: Andrew Burgess <andrew.burgess@embecosm.com>
>
> This changes lookup_global_symbol to look in the global block
> of the passed-in block.  If no block was passed in, it reverts to the
> previous behavior.
>
> This change is needed to ensure that 'FILENAME'::NAME lookups work
> properly.  As debugging Pedro's test case showed, this was not working
> properly in the case where multiple identical names could be found
> (the one situation where this feature is truly needed :-).

This further extends the situations where lookup_global_symbols checks
a local scope first (currently only objfile) but lookup_static_symbol
doesn't. Is that really correct? I would think that
lookup_static_symbol is even more likely to need that check, since
static symbols are probably (?) more likely to share the same name. Is
my interpretation wrong?

Christian
  
Andrew Burgess Sept. 23, 2019, 2:16 p.m. UTC | #2
* Christian Biesinger <cbiesinger@google.com> [2019-09-21 13:31:30 +0900]:

> On Sat, Sep 21, 2019 at 4:20 AM Tom Tromey <tromey@adacore.com> wrote:
> > From: Andrew Burgess <andrew.burgess@embecosm.com>
> >
> > This changes lookup_global_symbol to look in the global block
> > of the passed-in block.  If no block was passed in, it reverts to the
> > previous behavior.
> >
> > This change is needed to ensure that 'FILENAME'::NAME lookups work
> > properly.  As debugging Pedro's test case showed, this was not working
> > properly in the case where multiple identical names could be found
> > (the one situation where this feature is truly needed :-).
> 
> This further extends the situations where lookup_global_symbols checks
> a local scope first (currently only objfile) but lookup_static_symbol
> doesn't.

[ Note, in the below I talk about "my patch", I really mean mine and
  Tom's as my suggested rework was based on his original patch. ]

I didn't fully understand this for two reasons, first, you say
"further extends", however, both lookup_global_symbol and
lookup_static_symbol pre-patch, both just call
lookup_global_or_static_symbol.  And I couldn't see any code in
lookup_global_or_static_symbol that was conditional on global or
static lookup.  So my question is where is the existing situation in
which lookup_global_symbol searches a local scope first?  I ask only
so I can try to understand your question and my change in relation to
the existing code.

Second, I wanted to better understand what you mean by "local scope".
My understanding of lookup_global_symbol is that it searches every
global block from every object file in some particular order.  The
change here is simply that it makes more sense to search the global
scope relating to the current compilation unit before searching other
global scopes.  I can fully understand that this can be referred to as
a "local scope", I just wanted to make sure we're all agreed on what's
happening here.

> Is that really correct?

I think it makes sense.  If we have multiple global symbols with the
same name, we should find the one corresponding to the current scope I
think.  That feels like what I'd expect to happen.  And especially as
when we do a FILE::VAR lookup we end up in lookup_global_symbol with
the block corresponding to FILE.  If we then ignore that block and
search all global scopes in the predefined order then we will always
find the same global symbol, which is certainly wrong.

> I would think that
> lookup_static_symbol is even more likely to need that check, since
> static symbols are probably (?) more likely to share the same name. Is
> my interpretation wrong?

No, I think you are very right....

But...

There are currently 5 uses of lookup_static_symbol that I could find,
2 of these are in:

  d-namespace.c:find_symbol_in_baseclass
  cp-namespace.c:cp_lookup_nested_symbol_1

These are interesting because almost immediately before calling
lookup_static_symbol we call lookup_symbol_in_static_block, which I
think is solving the same problem as this proposed patch.

Then we have a call to lookup_static_symbol in:

  symtab.c:lookup_symbol_aux

This calls lookup_static_symbol as the last fallback action after
calling the language specific hook la_lookup_symbol_nonlocal.  I've
only audited some languages, but for those I've looked at they all
call lookup_symbol_in_static_block as one of their early actions:

  symtab.c:basic_lookup_symbol_nonlocal		(for C, Pascal)
  cp-namespace.c:cp_lookup_bare_symbol		(for C++, Fortran)
  cp-namespace.c:cp_lookup_bare_symbol		(for Fortran)
  d-namespace.c:d_lookup_symbol			(for D)
  rust-lang.c:rust_lookup_symbol_nonlocal	(for Rust)

That leaves two uses of lookup_static_symbol, these are:

  python/py-symbol.c:gdbpy_lookup_static_symbol
  d-namespace.c:d_lookup_nested_symbol

In these cases there is no call to lookup_symbol_in_static_block. I
would need to investigate more to see if these are working as expected
or not.  I suspect the python use case might not always do what a user
expects, as it searches static symbols in a predefined order, if we
have multiple with the same name we would always find the same one
first, but we'd probably expect to find one from the current object
file before one from a different object file.  As for the D use case,
I don't know D, so don't feel qualified to judge.

I think my conclusion is that you're right, but, refactoring this code
to have lookup_static_symbol call lookup_symbol_in_static_block (or
equivalent in all cases) seems a pretty scary change.  I'd ideally
like to see that refactoring separated from this patch series.

My vote would be to merge this, and then, possibly we can look at
reworking symbol lookup inline with your suggestion.  What do you
think?

Thanks,
Andrew
  
Tom Tromey Oct. 2, 2019, 3:51 p.m. UTC | #3
>>>>> "Andrew" == Andrew Burgess <andrew.burgess@embecosm.com> writes:

Andrew> I think my conclusion is that you're right, but, refactoring this code
Andrew> to have lookup_static_symbol call lookup_symbol_in_static_block (or
Andrew> equivalent in all cases) seems a pretty scary change.  I'd ideally
Andrew> like to see that refactoring separated from this patch series.

I think it would be better on the whole to have much simpler symbol
lookup functions, and then put the complicated parts (like "this"
searching or iteration over static blocks) into the per-language parser
or evaluation code.  Maybe that's too big a job to contemplate though.

Andrew> My vote would be to merge this, and then, possibly we can look at
Andrew> reworking symbol lookup inline with your suggestion.  What do you
Andrew> think?

I am going to go ahead and do this now.

Tom
  
Terekhov, Mikhail via Gdb-patches Nov. 9, 2019, 6:53 a.m. UTC | #4
I just realized I never responded to this. I just wanted to mention a
couple of things, below.

On Mon, Sep 23, 2019 at 9:16 AM Andrew Burgess
<andrew.burgess@embecosm.com> wrote:
>
> * Christian Biesinger <cbiesinger@google.com> [2019-09-21 13:31:30 +0900]:
>
> > On Sat, Sep 21, 2019 at 4:20 AM Tom Tromey <tromey@adacore.com> wrote:
> > > From: Andrew Burgess <andrew.burgess@embecosm.com>
> > >
> > > This changes lookup_global_symbol to look in the global block
> > > of the passed-in block.  If no block was passed in, it reverts to the
> > > previous behavior.
> > >
> > > This change is needed to ensure that 'FILENAME'::NAME lookups work
> > > properly.  As debugging Pedro's test case showed, this was not working
> > > properly in the case where multiple identical names could be found
> > > (the one situation where this feature is truly needed :-).
> >
> > This further extends the situations where lookup_global_symbols checks
> > a local scope first (currently only objfile) but lookup_static_symbol
> > doesn't.
>
> [ Note, in the below I talk about "my patch", I really mean mine and
>   Tom's as my suggested rework was based on his original patch. ]
>
> I didn't fully understand this for two reasons, first, you say
> "further extends", however, both lookup_global_symbol and
> lookup_static_symbol pre-patch, both just call
> lookup_global_or_static_symbol.  And I couldn't see any code in
> lookup_global_or_static_symbol that was conditional on global or
> static lookup.  So my question is where is the existing situation in
> which lookup_global_symbol searches a local scope first?  I ask only
> so I can try to understand your question and my change in relation to
> the existing code.

lookup_static_symbol always passes nullptr as the objfile to
lookup_global_or_static_symbol, unlike lookup_global_symbol, which is
the difference I meant.

> Second, I wanted to better understand what you mean by "local scope".
> My understanding of lookup_global_symbol is that it searches every
> global block from every object file in some particular order.  The
> change here is simply that it makes more sense to search the global
> scope relating to the current compilation unit before searching other
> global scopes.  I can fully understand that this can be referred to as
> a "local scope", I just wanted to make sure we're all agreed on what's
> happening here.

lookup_global_symbol takes an optional block argument; it starts the
search in the objfile associated with that block (or did, before your
patch).

The argument I was making is that it probably makes sense for both
global and static lookups to take a block (or global block) argument
for starting the search, and that this is more important for static
symbols because there are more likely to be name collisions there.

My apologies for expressing myself poorly.

> I think it makes sense.  If we have multiple global symbols with the
> same name, we should find the one corresponding to the current scope I
> think.  That feels like what I'd expect to happen.  And especially as
> when we do a FILE::VAR lookup we end up in lookup_global_symbol with
> the block corresponding to FILE.  If we then ignore that block and
> search all global scopes in the predefined order then we will always
> find the same global symbol, which is certainly wrong.
>
> > I would think that
> > lookup_static_symbol is even more likely to need that check, since
> > static symbols are probably (?) more likely to share the same name. Is
> > my interpretation wrong?
>
> No, I think you are very right....
>
> But...
>
> There are currently 5 uses of lookup_static_symbol that I could find,
> 2 of these are in:
>
>   d-namespace.c:find_symbol_in_baseclass
>   cp-namespace.c:cp_lookup_nested_symbol_1
>
> These are interesting because almost immediately before calling
> lookup_static_symbol we call lookup_symbol_in_static_block, which I
> think is solving the same problem as this proposed patch.
>
> Then we have a call to lookup_static_symbol in:
>
>   symtab.c:lookup_symbol_aux
>
> This calls lookup_static_symbol as the last fallback action after
> calling the language specific hook la_lookup_symbol_nonlocal.  I've
> only audited some languages, but for those I've looked at they all
> call lookup_symbol_in_static_block as one of their early actions:
>
>   symtab.c:basic_lookup_symbol_nonlocal         (for C, Pascal)
>   cp-namespace.c:cp_lookup_bare_symbol          (for C++, Fortran)
>   cp-namespace.c:cp_lookup_bare_symbol          (for Fortran)
>   d-namespace.c:d_lookup_symbol                 (for D)
>   rust-lang.c:rust_lookup_symbol_nonlocal       (for Rust)
>
> That leaves two uses of lookup_static_symbol, these are:
>
>   python/py-symbol.c:gdbpy_lookup_static_symbol
>   d-namespace.c:d_lookup_nested_symbol
>
> In these cases there is no call to lookup_symbol_in_static_block. I
> would need to investigate more to see if these are working as expected
> or not.  I suspect the python use case might not always do what a user
> expects, as it searches static symbols in a predefined order, if we
> have multiple with the same name we would always find the same one
> first, but we'd probably expect to find one from the current object
> file before one from a different object file.  As for the D use case,
> I don't know D, so don't feel qualified to judge.

The symbol lookup code is so confusing :(

For someone like me, new to GDB, it is difficult to understand this
code -- why does global lookup take a block but local doesn't.
Apparently the answer is that the callers do the static block lookup.
But.. why? Anyway that's the context for my questions.

Thanks for your work in this area!

Christian
  

Patch

diff --git a/gdb/symtab.c b/gdb/symtab.c
index 35eab08cb37..0d29e243b9e 100644
--- a/gdb/symtab.c
+++ b/gdb/symtab.c
@@ -2414,34 +2414,6 @@  basic_lookup_symbol_nonlocal (const struct language_defn *langdef,
 {
   struct block_symbol result;
 
-  /* NOTE: carlton/2003-05-19: The comments below were written when
-     this (or what turned into this) was part of lookup_symbol_aux;
-     I'm much less worried about these questions now, since these
-     decisions have turned out well, but I leave these comments here
-     for posterity.  */
-
-  /* NOTE: carlton/2002-12-05: There is a question as to whether or
-     not it would be appropriate to search the current global block
-     here as well.  (That's what this code used to do before the
-     is_a_field_of_this check was moved up.)  On the one hand, it's
-     redundant with the lookup in all objfiles search that happens
-     next.  On the other hand, if decode_line_1 is passed an argument
-     like filename:var, then the user presumably wants 'var' to be
-     searched for in filename.  On the third hand, there shouldn't be
-     multiple global variables all of which are named 'var', and it's
-     not like decode_line_1 has ever restricted its search to only
-     global variables in a single filename.  All in all, only
-     searching the static block here seems best: it's correct and it's
-     cleanest.  */
-
-  /* NOTE: carlton/2002-12-05: There's also a possible performance
-     issue here: if you usually search for global symbols in the
-     current file, then it would be slightly better to search the
-     current global block before searching all the symtabs.  But there
-     are other factors that have a much greater effect on performance
-     than that one, so I don't think we should worry about that for
-     now.  */
-
   /* NOTE: dje/2014-10-26: The lookup in all objfiles search could skip
      the current objfile.  Searching the current objfile first is useful
      for both matching user expectations as well as performance.  */
@@ -2674,6 +2646,19 @@  lookup_global_symbol (const char *name,
 		      const struct block *block,
 		      const domain_enum domain)
 {
+  /* If a block was passed in, we want to search the corresponding
+     global block first.  This yields "more expected" behavior, and is
+     needed to support 'FILENAME'::VARIABLE lookups.  */
+  const struct block *global_block = block_global_block (block);
+  if (global_block != nullptr)
+    {
+      symbol *sym = lookup_symbol_in_block (name,
+					    symbol_name_match_type::FULL,
+					    global_block, domain);
+      if (sym != nullptr)
+	return { sym, global_block };
+    }
+
   struct objfile *objfile = lookup_objfile_from_block (block);
   return lookup_global_or_static_symbol (name, GLOBAL_BLOCK, objfile, domain);
 }