[00/29] Restructure symbol domains

Message ID 20231120-submit-domain-hacks-2-v1-0-29650d01b198@tromey.com

Message

Tom Tromey Nov. 21, 2023, 3:53 a.m. UTC
gdb's symbol domains have long needed some restructuring.

The current symbol domains are C-centric, with the "struct" domain
being separate from types (which is not the case in non-C languages)
and functions and types being lumped in with variables.  This latter
decision makes it impossible to search the symbol table for a
function, resulting in bugs like PR 30158, where "main" was found as a
namespace, causing a crash.

This series adds new symbol domains for types and functions, and
changes the various symbol-lookup functions to allow multiple domains
to be searched at once.
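
To make the multi-domain idea concrete, here is a minimal standalone
sketch.  It is not the code from the series: every name in it
(sketch_domain, sketch_search_flags, sketch_lookup, sketch_table) is
hypothetical, and the real declarations in symtab.h may well differ.
It only illustrates the technique of giving each domain one bit so
that a single lookup can name several domains at once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified stand-in for a symbol-domain enumeration.
enum class sketch_domain : unsigned
{
  variable,
  structure,
  type,
  function,
  module_
};

// One bit per domain, so a search can combine domains with '|'.
using sketch_search_flags = std::uint32_t;

constexpr sketch_search_flags
to_flags (sketch_domain d)
{
  return sketch_search_flags {1} << static_cast<unsigned> (d);
}

struct sketch_symbol
{
  const char *name;
  sketch_domain domain;
};

// Return the first symbol named NAME whose domain is included in
// FLAGS, or nullptr if there is none.
static const sketch_symbol *
sketch_lookup (const sketch_symbol *table, std::size_t count,
               const char *name, sketch_search_flags flags)
{
  for (std::size_t i = 0; i < count; ++i)
    if (std::strcmp (table[i].name, name) == 0
        && (to_flags (table[i].domain) & flags) != 0)
      return &table[i];
  return nullptr;
}

// Example table (made up): "main" exists both as a module-like scope
// and as a function.
static const sketch_symbol sketch_table[] =
{
  { "main", sketch_domain::module_ },
  { "main", sketch_domain::function },
  { "counter", sketch_domain::variable },
};
```

With a table like this, a caller that only wants the function "main"
passes to_flags (sketch_domain::function) and never sees the
module-like entry, while a caller that accepts either kind ORs the
two flags together.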

Then, the symbol readers are changed to use the new domains.

Finally, selected bits of code are changed to be more precise in which
domains they search.

symbol_matches_domain currently has a C++-specific hack.  This hack
handles the C++ language rule where a tag is also entered as a
typedef.  While working on this series, I discovered that the
non-DWARF symbol readers will actually emit a second typedef symbol.
DWARF could do this as well, at some memory expense; and while I
consider this to be cleaner in an abstract way, for the time being
I've left the hack in place.
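
For readers less familiar with the language rule in question, the
following small example shows it from the user's side.  The names
(point, sum_plain, sum_elaborated) are made up for illustration and
have nothing to do with gdb's sources.  In C++, declaring a struct
makes its tag usable as a plain type name, whereas C would require
either "struct point" or an explicit typedef.

```cpp
#include <cassert>
#include <type_traits>

struct point
{
  int x;
  int y;
};

// The tag alone names the type in C++, with no typedef needed.
static int
sum_plain (point p)
{
  return p.x + p.y;
}

// The elaborated form refers to the same type.
static int
sum_elaborated (struct point p)
{
  return p.x + p.y;
}

static_assert (std::is_same<point, struct point>::value,
               "tag and plain name denote one type in C++");
```

A debugger evaluating "point" in a C++ program therefore has to find
the struct even though no separate typedef was written, which is the
situation the hack (or a second typedef symbol) covers.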

I regression tested this on x86-64 Fedora 38.  I also regression
tested using the debug-names and gdb-index target boards.

---
Tom Tromey (29):
      Fix bug in cooked index scanner
      Small cleanup in DWARF reader
      Make nsalias.exp more reliable
      Fix latent bug in mdebugread.c
      Give names to unspecified types
      Remove NR_DOMAINS
      Simplify symbol_to_info_string
      Split up a big 'if' in symtab.c
      Use a .def file for domain_enum
      Add two new symbol domains
      Add domain_search_flags
      Replace search_domain with domain_search_flags
      Remove a check of VAR_DOMAIN
      Introduce "scripting" domains
      Use domain_search_flags in lookup_global_symbol_language
      Use domain_search_flags in lookup_symbol et al
      Remove some obsolete Python constants
      Remove old symbol_matches_domain
      Use the new symbol domains
      Simplify some symbol searches in Ada code
      Simplify some symbol searches in linespec.c
      Only search for "main" as a function
      Only look for functions in expand_symtabs_for_function
      Use a function-domain search in inside_main_func
      Only search types in cp_lookup_rtti_type
      Only search types in lookup_typename
      Only search for functions in rust_structop::evaluate_funcall
      Refine search in cp_search_static_and_baseclasses
      Document new Python and Guile constants

 gdb/NEWS                                 |  12 +
 gdb/ada-exp.y                            |  13 +-
 gdb/ada-lang.c                           |  77 +++--
 gdb/ada-lang.h                           |   6 +-
 gdb/ada-tasks.c                          |  17 +-
 gdb/alpha-mdebug-tdep.c                  |   2 +-
 gdb/ax-gdb.c                             |   5 +-
 gdb/block.c                              |  24 +-
 gdb/block.h                              |  15 +-
 gdb/c-exp.y                              |  19 +-
 gdb/c-lang.c                             |   2 +-
 gdb/c-valprint.c                         |   2 +-
 gdb/coffread.c                           |   5 +-
 gdb/compile/compile-c-symbols.c          |  17 +-
 gdb/compile/compile-cplus-symbols.c      |  15 +-
 gdb/compile/compile-cplus-types.c        |   8 +-
 gdb/compile/compile-object-load.c        |   6 +-
 gdb/cp-namespace.c                       |  66 +++--
 gdb/cp-support.c                         |  10 +-
 gdb/cp-support.h                         |   8 +-
 gdb/ctfread.c                            |   2 +-
 gdb/d-exp.y                              |  12 +-
 gdb/d-lang.c                             |   2 +-
 gdb/d-lang.h                             |   9 +-
 gdb/d-namespace.c                        |  24 +-
 gdb/doc/guile.texi                       |  13 +
 gdb/doc/python.texi                      |  30 +-
 gdb/dwarf2/ada-imported.c                |   2 +-
 gdb/dwarf2/cooked-index.c                |  12 +
 gdb/dwarf2/cooked-index.h                |  45 +--
 gdb/dwarf2/index-write.c                 |   7 +-
 gdb/dwarf2/loc.c                         |   2 +-
 gdb/dwarf2/read-debug-names.c            | 132 ++-------
 gdb/dwarf2/read-gdb-index.c              |  75 +++--
 gdb/dwarf2/read.c                        |  65 +++--
 gdb/dwarf2/read.h                        |   2 +-
 gdb/dwarf2/tag.h                         |  78 +++++
 gdb/eval.c                               |   5 +-
 gdb/f-exp.y                              |   8 +-
 gdb/f-lang.c                             |   2 +-
 gdb/f-lang.h                             |   2 +-
 gdb/f-valprint.c                         |   2 +-
 gdb/fbsd-tdep.c                          |   5 +-
 gdb/frame.c                              |   9 +-
 gdb/ft32-tdep.c                          |   3 +-
 gdb/gdbtypes.c                           |  35 ++-
 gdb/gnu-v3-abi.c                         |   2 +-
 gdb/go-exp.y                             |   9 +-
 gdb/guile/scm-frame.c                    |   2 +-
 gdb/guile/scm-symbol.c                   |  25 +-
 gdb/infrun.c                             |   2 +-
 gdb/jit.c                                |   2 +-
 gdb/language.c                           |   5 +-
 gdb/language.h                           |   4 +-
 gdb/linespec.c                           |  67 +++--
 gdb/m2-exp.y                             |  10 +-
 gdb/mdebugread.c                         |   6 +-
 gdb/mi/mi-cmd-stack.c                    |   5 +-
 gdb/mi/mi-symbol-cmds.c                  |  35 +--
 gdb/moxie-tdep.c                         |   3 +-
 gdb/objc-lang.c                          |   4 +-
 gdb/objfiles.h                           |   9 +-
 gdb/p-exp.y                              |  19 +-
 gdb/p-valprint.c                         |   2 +-
 gdb/parse.c                              |   3 +-
 gdb/printcmd.c                           |   2 +-
 gdb/psymtab.c                            |  52 ++--
 gdb/psymtab.h                            |   7 +-
 gdb/python/py-frame.c                    |   3 +-
 gdb/python/py-objfile.c                  |   6 +-
 gdb/python/py-symbol.c                   |  53 ++--
 gdb/python/python.c                      |   2 +-
 gdb/quick-symbol.h                       |  12 +-
 gdb/rust-lang.c                          |   7 +-
 gdb/rust-lang.h                          |   2 +-
 gdb/rust-parse.c                         |   8 +-
 gdb/source.c                             |   5 +-
 gdb/stabsread.c                          |   8 +-
 gdb/stack.c                              |   4 +-
 gdb/sym-domains.def                      |  58 ++++
 gdb/symfile-debug.c                      |  30 +-
 gdb/symfile.c                            |   9 +-
 gdb/symfile.h                            |   2 +-
 gdb/symmisc.c                            |   3 +-
 gdb/symtab.c                             | 480 +++++++++++++++++--------------
 gdb/symtab.h                             | 177 ++++++------
 gdb/testsuite/gdb.ada/info_auto_lang.exp |   4 +-
 gdb/testsuite/gdb.ada/ptype-o.exp        |   2 +-
 gdb/testsuite/gdb.cp/nsalias.exp         |   2 +-
 gdb/testsuite/gdb.fortran/info-types.exp |   2 +-
 gdb/valops.c                             |  14 +-
 gdb/value.c                              |   6 +-
 gdb/xcoffread.c                          |   9 +-
 gdb/xstormy16-tdep.c                     |   3 +-
 94 files changed, 1117 insertions(+), 981 deletions(-)
---
base-commit: 8116169676604839ecfa39c1fe609249efb481d8
change-id: 20231120-submit-domain-hacks-2-1c1e66b4d560

Best regards,
  

Comments

John Baldwin Nov. 21, 2023, 6:37 p.m. UTC | #1
On 11/20/23 7:53 PM, Tom Tromey wrote:
> gdb's symbol domains have long needed some restructuring.
> 
> The current symbol domains are C-centric, with the "struct" domain
> being separate from types (which is not the case in non-C languages)
> and functions and types being lumped in with variables.  This latter
> decision makes it impossible to search the symbol table for a
> function, resulting in bugs like PR 30158, where "main" was found as a
> namespace, causing a crash.
> 
> This series adds new symbol domains for types and functions, and
> changes the various symbol-lookup functions to allow multiple domains
> to be searched at once.
> 
> Then, the symbol readers are changed to use the new domains.
> 
> Finally, selected bits of code are changed to be more precise in which
> domains they search.
> 
> symbol_matches_domain currently has a C++-specific hack.  This hack
> handles the C++ language rule where a tag is also entered as a
> typedef.  While working on this series, I discovered that the
> non-DWARF symbol readers will actually emit a second typedef symbol.
> DWARF could do this as well, at some memory expense; and while I
> consider this to be cleaner in an abstract way, for the time being
> I've left the hack in place.
> 
> I regression tested this on x86-64 Fedora 38.  I also regression
> tested using the debug-names and gdb-index target boards.

I did not look at all of this in detail, but I did review patches 1-7
closely enough to add 'Reviewed-by'.  I think the idea in general is
sensible, though, and I think it can certainly be helpful even when
working with C to restrict searches by type (e.g. "only functions").

As far as I understand it, I do agree that changing the DWARF reader to
emit duplicate typedef symbols for C++ probably is the cleaner long-term
solution compared to the current hacks.