[RFC,v2,16/21] gdb/python: allow instantiation of gdb.Symbol from Python

Message ID 20241121124714.419946-17-jan.vrany@labware.com
State New
Series Add Python "JIT" API

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-arm warning Skipped upon request
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 warning Skipped upon request

Commit Message

Jan Vraný Nov. 21, 2024, 12:47 p.m. UTC
This commit adds code to allow user extensions to instantiate
gdb.Symbol.

As of now, only "function" symbols can be created (that is, symbols
in FUNCTION_DOMAIN with address class LOC_BLOCK). This is enough
to implement a "JIT reader" equivalent in Python. Future
commits may extend this API to allow creation of other kinds of symbols
(static variables, arguments, locals and so on).
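
A minimal sketch of how an extension might call the new constructor, modeled on the call in this patch's py-symbol.exp test. The helper name `make_function_symbol` and its `gdb_module` parameter are hypothetical (the parameter exists only so the sketch can be exercised outside GDB); the argument order follows the `Symbol.__init__ (name, symtab, type, domain, addr_class, value)` signature documented in the patch.

```python
def make_function_symbol(name, symtab, func_type, block, gdb_module=None):
    """Create a "function" symbol, the only kind this patch supports.

    NAME is the symbol name, SYMTAB a gdb.Symtab, FUNC_TYPE a gdb.Type
    with a function type code, and BLOCK a gdb.Block belonging to the
    same compunit as SYMTAB.  GDB_MODULE is a hypothetical hook for
    testing; inside GDB it can be left as None.
    """
    if gdb_module is None:
        # The gdb module is only importable from within a GDB session.
        import gdb as gdb_module

    return gdb_module.Symbol(
        name,
        symtab,
        func_type,
        gdb_module.SYMBOL_FUNCTION_DOMAIN,  # domain
        gdb_module.SYMBOL_LOC_BLOCK,        # addr_class
        block,                              # value: must be a gdb.Block
    )
```

Inside GDB with this series applied, the call in the test would presumably correspond to `make_function_symbol("ns1", t[0].symtab, t[0].type.function(), t[0].symtab.static_block())`, where `t[0]` is an existing symbol from the test program.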

Like previous similar commits, this is a step towards Python support
for dynamically generated code (JIT) in GDB.

Reviewed-By: Eli Zaretskii <eliz@gnu.org>
---
 gdb/doc/python.texi                    |  26 +++++
 gdb/objfiles.c                         |  20 ++++
 gdb/objfiles.h                         |   4 +
 gdb/python/py-symbol.c                 | 139 ++++++++++++++++++++++++-
 gdb/testsuite/gdb.python/py-symbol.exp |  13 +++
 5 files changed, 201 insertions(+), 1 deletion(-)
  

Comments

Eli Zaretskii Nov. 21, 2024, 1:37 p.m. UTC | #1
> From: Jan Vrany <jan.vrany@labware.com>
> CC: Jan Vrany <jan.vrany@labware.com>,
> 	Eli Zaretskii <eliz@gnu.org>
> Date: Thu, 21 Nov 2024 12:47:09 +0000
> 
> This commit adds code to allow user extensions to instantiate
> gdb.Symbol.
> 
> As of now, only "function" symbols can be created (that is, symbols
> in FUNCTION_DOMAIN with address class LOC_BLOCK). This is enough
> to implement a "JIT reader" equivalent in Python. Future
> commits may extend this API to allow creation of other kinds of symbols
> (static variables, arguments, locals and so on).
> 
> Like previous similar commits, this is a step towards Python support
> for dynamically generated code (JIT) in GDB.
> 
> Reviewed-By: Eli Zaretskii <eliz@gnu.org>
> ---
>  gdb/doc/python.texi                    |  26 +++++
>  gdb/objfiles.c                         |  20 ++++
>  gdb/objfiles.h                         |   4 +
>  gdb/python/py-symbol.c                 | 139 ++++++++++++++++++++++++-
>  gdb/testsuite/gdb.python/py-symbol.exp |  13 +++
>  5 files changed, 201 insertions(+), 1 deletion(-)

OK for the documentation part, thanks.

Reviewed-By: Eli Zaretskii <eliz@gnu.org>
  

Patch

diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index 8b3c95cbf1d..55ca91920cb 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -6278,6 +6278,32 @@  arguments.
 
 A @code{gdb.Symbol} object has the following methods:
 
+@defun Symbol.__init__ (name, symtab, type, domain, addr_class, value)
+Creates a new symbol named @var{name} and adds it to the symbol table
+@var{symtab}.
+
+The @var{type} argument specifies the type of the symbol as a
+@code{gdb.Type} object (@pxref{Types In Python}).
+
+The @var{domain} argument specifies the domain of the symbol.  Each domain
+is a constant defined in the @code{gdb} module and described later in this
+chapter.
+
+The @var{addr_class} argument, together with the @var{value} argument,
+specifies how to find the value of this symbol.  Each address class is a
+constant defined in the @code{gdb} module and described later in this chapter.
+As of now, only the @code{gdb.SYMBOL_LOC_BLOCK} address class is supported,
+but future versions of @value{GDBN} may support more address classes.
+
+The meaning of the @var{value} argument depends on @var{addr_class}:
+@vtable @code
+@item gdb.SYMBOL_LOC_BLOCK
+The @var{value} argument must be a block (a @code{gdb.Block} object).  The
+block must belong to the same compunit as the @var{symtab} parameter
+(@pxref{Compunits In Python}).
+@end vtable
+@end defun
+
 @defun Symbol.is_valid ()
 Returns @code{True} if the @code{gdb.Symbol} object is valid,
 @code{False} if not.  A @code{gdb.Symbol} object can become invalid if
diff --git a/gdb/objfiles.c b/gdb/objfiles.c
index 0bb578fa6a8..cdb6dba2f7c 100644
--- a/gdb/objfiles.c
+++ b/gdb/objfiles.c
@@ -1312,3 +1312,23 @@  objfile_int_type (struct objfile *of, int size_in_bytes, bool unsigned_p)
 
   gdb_assert_not_reached ("unable to find suitable integer type");
 }
+
+/* See objfiles.h.  */
+
+int
+objfile::find_section_index (CORE_ADDR start, CORE_ADDR end)
+{
+  obj_section *sect;
+  int sect_index;
+  for (sect = this->sections_start, sect_index = 0;
+       sect < this->sections_end;
+       sect++, sect_index++)
+    {
+      if (sect->the_bfd_section == nullptr)
+	continue;
+
+      if (sect->addr () <= start && end <= sect->endaddr ())
+	return sect_index;
+    }
+  return -1;
+}
diff --git a/gdb/objfiles.h b/gdb/objfiles.h
index bd65e2bd030..94533797563 100644
--- a/gdb/objfiles.h
+++ b/gdb/objfiles.h
@@ -644,6 +644,10 @@  struct objfile : intrusive_list_node<objfile>
     this->section_offsets[idx] = offset;
   }
 
+  /* Return the index of the first section whose address range contains
+     [START, END].  If there is no such section, return -1.  */
+  int find_section_index (CORE_ADDR start, CORE_ADDR end);
+
   class section_iterator
   {
   public:
diff --git a/gdb/python/py-symbol.c b/gdb/python/py-symbol.c
index 44bed85481b..78db88333c5 100644
--- a/gdb/python/py-symbol.c
+++ b/gdb/python/py-symbol.c
@@ -400,6 +400,135 @@  sympy_repr (PyObject *self)
 			       symbol->print_name ());
 }
 
+/* Object initializer; creates new symbol.
+
+   Use: __init__(NAME, SYMTAB, TYPE, DOMAIN, ADDR_CLASS, VALUE).  */
+
+static int
+sympy_init (PyObject *zelf, PyObject *args, PyObject *kw)
+{
+  struct symbol_object *self = (struct symbol_object *) zelf;
+
+  if (self->symbol)
+    {
+      PyErr_Format (PyExc_RuntimeError,
+		    _("Symbol object already initialized."));
+      return -1;
+    }
+
+  static const char *keywords[] = { "name", "symtab", "type",
+				    "domain", "addr_class", "value",
+				    nullptr };
+  const char *name;
+  PyObject *symtab_obj = nullptr;
+  PyObject *type_obj = nullptr;
+  domain_enum domain;
+  unsigned int addr_class;
+  PyObject *value_obj = nullptr;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "sOOIIO", keywords,
+					&name, &symtab_obj, &type_obj,
+					&domain, &addr_class, &value_obj))
+    return -1;
+
+
+  struct symtab *symtab = symtab_object_to_symtab (symtab_obj);
+  if (symtab == nullptr)
+    {
+      PyErr_Format (PyExc_TypeError,
+		    _("The symtab argument is not a valid gdb.Symtab object"));
+      return -1;
+    }
+
+  struct type *type = type_object_to_type (type_obj);
+  if (type == nullptr)
+    {
+      PyErr_Format (PyExc_TypeError,
+		    _("The type argument is not a valid gdb.Type object"));
+      return -1;
+    }
+  if (type->objfile_owner () != nullptr
+      && type->objfile_owner () != symtab->compunit ()->objfile ())
+    {
+      PyErr_Format (PyExc_ValueError,
+		    _("The type argument's owning objfile differs from "
+		      "symtab's objfile."));
+      return -1;
+    }
+
+  union _value {
+    const struct block *block;
+  } value;
+
+  switch (addr_class)
+    {
+      default:
+	PyErr_Format (PyExc_ValueError,
+		      _("The value of addr_class argument is not supported"));
+	return -1;
+
+      case LOC_BLOCK:
+	if ((value.block = block_object_to_block (value_obj)) == nullptr)
+	  {
+	    PyErr_Format (PyExc_TypeError,
+			 _("The addr_class argument is SYMBOL_LOC_BLOCK but "
+			   "the value argument is not a valid gdb.Block."));
+	    return -1;
+	  }
+	if (type->code () != TYPE_CODE_FUNC)
+	  {
+	    PyErr_Format (PyExc_ValueError,
+			 _("The addr_class argument is SYMBOL_LOC_BLOCK but "
+			   "the type argument is not a function type."));
+	    return -1;
+	  }
+	break;
+    }
+
+  struct objfile *objfile = symtab->compunit ()->objfile ();
+  auto_obstack *obstack = &(objfile->objfile_obstack);
+  struct symbol *sym = new (obstack) symbol ();
+
+  sym->m_name = obstack_strdup (obstack, name);
+  sym->set_symtab (symtab);
+  sym->set_type (type);
+  sym->set_domain (domain);
+  sym->set_aclass_index (addr_class);
+
+  switch (addr_class)
+    {
+      case LOC_BLOCK:
+	{
+	  sym->set_value_block (value.block);
+
+	  if (domain == FUNCTION_DOMAIN)
+	    const_cast<struct block *> (value.block)->set_function (sym);
+
+	  /* Set symbol's section index.  This is needed in the somewhat
+	     unusual use case where dynamic code is generated into a special
+	     section (defined in a custom linker script or otherwise).
+	     Otherwise, find_pc_sect_compunit_symtab () would not find the
+	     compunit symtab and commands like "disassemble function_name"
+	     would resort to disassembling the complete section.
+
+	     Note that in the usual case where a new objfile is created for
+	     dynamic code, the objfile has no sections at all and
+	     objfile::find_section_index () returns -1.
+	     */
+	  CORE_ADDR start = value.block->start ();
+	  CORE_ADDR end = value.block->end ();
+	  sym->set_section_index (objfile->find_section_index (start, end));
+	}
+	break;
+      default:
+	gdb_assert_not_reached ("unreachable");
+	break;
+    }
+
+  set_symbol (self, sym);
+  return 0;
+}
+
 /* Implementation of
    gdb.lookup_symbol (name [, block] [, domain]) -> (symbol, is_field_of_this)
    A tuple with 2 elements is always returned.  The first is the symbol
@@ -774,5 +903,13 @@  PyTypeObject symbol_object_type = {
   0,				  /*tp_iternext */
   symbol_object_methods,	  /*tp_methods */
   0,				  /*tp_members */
-  symbol_object_getset		  /*tp_getset */
+  symbol_object_getset,		  /*tp_getset */
+  0,				  /*tp_base */
+  0,				  /*tp_dict */
+  0,				  /*tp_descr_get */
+  0,				  /*tp_descr_set */
+  0,				  /*tp_dictoffset */
+  sympy_init,			  /*tp_init */
+  0,				  /*tp_alloc */
+  PyType_GenericNew,		  /*tp_new */
 };
diff --git a/gdb/testsuite/gdb.python/py-symbol.exp b/gdb/testsuite/gdb.python/py-symbol.exp
index 1bfa17b4e91..d9c0e255146 100644
--- a/gdb/testsuite/gdb.python/py-symbol.exp
+++ b/gdb/testsuite/gdb.python/py-symbol.exp
@@ -222,6 +222,19 @@  gdb_test "python print (t\[0\] != 123 )"\
 	 "True" \
 	 "test symbol non-equality with non-symbol"
 
+# Test creation of new symbols
+gdb_py_test_silent_cmd "python s = gdb.Symbol(\"ns1\", t\[0\].symtab, t\[0\].type.function(), gdb.SYMBOL_FUNCTION_DOMAIN, gdb.SYMBOL_LOC_BLOCK, t\[0\].symtab.static_block() )" \
+	"create symbol" 0
+gdb_test "python print (s)" \
+	 "ns1" \
+	 "test new symbol's __str__"
+gdb_test "python print (s.symtab == t\[0\].symtab)" \
+	 "True" \
+	 "test new symbol's symtab"
+gdb_test "python print (s.type == t\[0\].type.function())" \
+	 "True" \
+	 "test new symbol's type"
+
 # C++ tests
 # Recompile binary.
 lappend opts c++