[RFC,v2,21/21] gdb/python: add section in documentation on implementing JIT interface

Message ID 20241121124714.419946-22-jan.vrany@labware.com
State New
Series Add Python "JIT" API

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-arm warning Skipped upon request
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 warning Skipped upon request

Commit Message

Jan Vraný Nov. 21, 2024, 12:47 p.m. UTC
This commit adds a new section, JIT Interface in Python, outlining how
to use the Python APIs introduced in previous commits to implement a simple
JIT interface.  It also adds a new test to make sure the example code is
up to date.

Reviewed-By: Eli Zaretskii <eliz@gnu.org>
---
 gdb/NEWS                            |   4 +
 gdb/doc/gdb.texinfo                 |   3 +-
 gdb/doc/python.texi                 | 122 ++++++++++++++++++++++++++++
 gdb/testsuite/gdb.python/py-jit.c   |  61 ++++++++++++++
 gdb/testsuite/gdb.python/py-jit.exp |  57 +++++++++++++
 gdb/testsuite/gdb.python/py-jit.py  | 110 +++++++++++++++++++++++++
 6 files changed, 356 insertions(+), 1 deletion(-)
 create mode 100644 gdb/testsuite/gdb.python/py-jit.c
 create mode 100644 gdb/testsuite/gdb.python/py-jit.exp
 create mode 100644 gdb/testsuite/gdb.python/py-jit.py
  

Comments

Eli Zaretskii Nov. 21, 2024, 1:31 p.m. UTC | #1
> From: Jan Vrany <jan.vrany@labware.com>
> CC: Jan Vrany <jan.vrany@labware.com>,
> 	Eli Zaretskii <eliz@gnu.org>
> Date: Thu, 21 Nov 2024 12:47:14 +0000
> 
> This commit adds new section - JIT Interface in Python - outlining how
> to use Python APIs introduced in previous commits to implement simple
> JIT interface. It also adds new test to make sure the example code is
> up-to-date.
> 
> Reviewed-By: Eli Zaretskii <eliz@gnu.org>
> ---
>  gdb/NEWS                            |   4 +
>  gdb/doc/gdb.texinfo                 |   3 +-
>  gdb/doc/python.texi                 | 122 ++++++++++++++++++++++++++++
>  gdb/testsuite/gdb.python/py-jit.c   |  61 ++++++++++++++
>  gdb/testsuite/gdb.python/py-jit.exp |  57 +++++++++++++
>  gdb/testsuite/gdb.python/py-jit.py  | 110 +++++++++++++++++++++++++
>  6 files changed, 356 insertions(+), 1 deletion(-)
>  create mode 100644 gdb/testsuite/gdb.python/py-jit.c
>  create mode 100644 gdb/testsuite/gdb.python/py-jit.exp
>  create mode 100644 gdb/testsuite/gdb.python/py-jit.py

OK for the documentation part, thanks.

Reviewed-By: Eli Zaretskii <eliz@gnu.org>
  

Patch

diff --git a/gdb/NEWS b/gdb/NEWS
index 7273b23f989..ab3463d9893 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -96,6 +96,10 @@ 
 
   ** Added class gdb.Compunit.
 
+  ** Extended the Python API to allow interfacing with JIT compilers using
+     Python (as an alternative to the JIT reader API).  For details, see the
+     section "JIT Interface in Python" in the GDB documentation.
+
 * Debugger Adapter Protocol changes
 
   ** The "scopes" request will now return a scope holding global
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 99720f1206e..b292f6c8a89 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -39886,7 +39886,8 @@  If you are using @value{GDBN} to debug a program that uses this interface, then
 it should work transparently so long as you have not stripped the binary.  If
 you are developing a JIT compiler, then the interface is documented in the rest
 of this chapter.  At this time, the only known client of this interface is the
-LLVM JIT.
+LLVM JIT.  An alternative to the interface described below is to implement the
+JIT interface in Python (@pxref{JIT Interface in Python}).
 
 Broadly speaking, the JIT interface mirrors the dynamic loader interface.  The
 JIT compiler communicates with @value{GDBN} by writing data into a global
diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index 2ab880f8d73..a67f5836b2f 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -233,6 +233,7 @@  optional arguments while skipping others.  Example:
 * Disassembly In Python::       Instruction Disassembly In Python
 * Missing Debug Info In Python:: Handle missing debug info from Python.
 * Missing Objfiles In Python::  Handle objfiles from Python.
+* JIT Interface in Python::     Writing a JIT compilation interface in Python
 @end menu
 
 @node Basic Python
@@ -8554,6 +8555,127 @@  handlers, all of the matching handlers are enabled.  The
 @code{enabled} field of each matching handler is set to @code{True}.
 @end table
 
+@node JIT Interface in Python
+@subsubsection Writing a JIT compilation interface in Python
+@cindex python, just-in-time compilation, JIT compilation interface
+
+This section provides a high-level overview of how to implement a JIT compiler
+interface entirely in Python.  For an alternative way of interfacing with a
+JIT compiler, see @ref{JIT Interface}.
+
+A JIT compiler interface usually needs to implement three elements:
+
+@enumerate
+@item
+A way to get notified when the JIT compiler compiles (and installs) new
+code and when existing code is discarded.  A typical solution is to put a
+Python breakpoint (@pxref{Breakpoints In Python}) on some function known to
+be called by the JIT compiler once code is installed or discarded.
+
+@item
+When new code is installed, the JIT interface needs to extract (debug)
+information for the newly installed code from the JIT compiler
+(@pxref{Values From Inferior}) and build @value{GDBN}'s internal structures
+(@pxref{Objfiles In Python}, @ref{Compunits In Python},
+@ref{Blocks In Python}, @ref{Symbol Tables In Python},
+@ref{Symbols In Python}, @ref{Line Tables In Python}).
+
+@item
+Finally, when (previously installed) code is discarded, the JIT interface
+needs to discard @value{GDBN}'s internal structures built in the previous
+step.  This is done by calling @code{unlink} on the objfile for that code
+(which was created in the previous step).
+@end enumerate
+
+Here's an example showing how to write a simple JIT interface in Python:
+
+@c A copy of the code below is also in testsuite/gdb.python/py-jit.py
+@c and is used by py-jit.exp to make sure it is up to date.  If it is changed,
+@c the test and py-jit.py should be checked and updated accordingly.
+@smallexample
+import gdb
+
+class JITRegisterCode(gdb.Breakpoint):
+    def stop(self):
+	# Extract the new code's address, size, name, linetable (and possibly
+	# other useful information).  How exactly to do so depends on the JIT
+	# compiler in question.
+	#
+	# In this example the address, size and name are passed as parameters
+	# to the registration function.
+
+	frame = gdb.newest_frame()
+	addr = int(frame.read_var('code'))
+	size = int(frame.read_var('size'))
+	name = frame.read_var('name').string()
+	linetable_entries = get_linetable_entries(addr)
+
+	# Create objfile and compunit for allocated "jitted" code
+	objfile = gdb.Objfile(name)
+	compunit = gdb.Compunit(name, objfile, addr, addr + size)
+
+	# Mark the objfile as "jitted" code.  This will be used later when
+	# unregistering discarded code to check the objfile was indeed
+	# created for "jitted" code.
+	setattr(objfile, "is_jit_code", True)
+
+	# Create block for jitted function
+	block = gdb.Block(compunit.static_block(), addr, addr + size)
+
+	# Create symbol table holding info about jitted function, ...
+	symtab = gdb.Symtab("py-jit.c", compunit)
+	linetable = gdb.LineTable(symtab, linetable_entries)
+
+	# ...type of the jitted function...
+	void_t = gdb.selected_inferior().architecture().void_type()
+	func_t = void_t.function(None)
+
+	# ...and symbol representing jitted function.
+	symbol = gdb.Symbol(name, symtab, func_t,
+			    gdb.SYMBOL_FUNCTION_DOMAIN, gdb.SYMBOL_LOC_BLOCK,
+			    block)
+
+	# Finally, register the symbol in the static block...
+	compunit.static_block().add_symbol(symbol)
+
+	# ...and continue execution
+	return False
+
+# Create breakpoint to register new code
+JITRegisterCode("jit_register_code", internal=True)
+
+
+class JITUnregisterCode(gdb.Breakpoint):
+    def stop(self):
+	# Find out which code has been discarded.  Again, how exactly to
+	# do so depends on the JIT compiler in question.
+	#
+	# In this example the address of the discarded code is passed as a
+	# parameter.
+
+	frame = gdb.newest_frame()
+	addr = int(frame.read_var('code'))
+
+	# Find the objfile which was created in JITRegisterCode.stop() for
+	# the given jitted code.
+	objfile = gdb.current_progspace().objfile_for_address(addr)
+	if objfile is None:
+	    # No objfile for the given addr (this should not normally happen)
+	    return False # Continue execution
+	if not getattr(objfile, "is_jit_code", False):
+	    # Not jitted code (this should not happen either)
+	    return False # Continue execution
+
+	# Remove the objfile and all debug info associated with it...
+	objfile.unlink()
+
+	# ...and continue execution
+	return False # Continue execution
+
+# Create breakpoint to discard old code
+JITUnregisterCode("jit_unregister_code", internal=True)
+@end smallexample
+
 @node Python Auto-loading
 @subsection Python Auto-loading
 @cindex Python auto-loading
diff --git a/gdb/testsuite/gdb.python/py-jit.c b/gdb/testsuite/gdb.python/py-jit.c
new file mode 100644
index 00000000000..2d1a621e9c9
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-jit.c
@@ -0,0 +1,61 @@ 
+/* Copyright (C) 2024-2024 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <assert.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <string.h>
+#include <unistd.h>
+
+#include <sys/mman.h>
+
+/* "JIT-ed" function, with the prototype `long (long, long)`.  */
+static const unsigned char jit_function_add_code[] = {
+  0x48, 0x01, 0xfe,		/* add %rdi,%rsi */
+  0x48, 0x89, 0xf0,		/* mov %rsi,%rax */
+  0xc3,				/* retq */
+};
+
+/* Dummy function to inform the debugger that new code has been installed.  */
+void jit_register_code (char * name, uintptr_t code, size_t size)
+{}
+
+/* Dummy function to inform the debugger that code has been discarded.  */
+void jit_unregister_code (uintptr_t code)
+{}
+
+int
+main (int argc, char **argv)
+{
+  void *code = mmap (NULL, getpagesize (), PROT_WRITE | PROT_EXEC,
+		     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+  assert (code != MAP_FAILED);
+
+  /* "Compile" jit_function_add.  */
+  memcpy (code, jit_function_add_code,
+	  sizeof (jit_function_add_code));
+  jit_register_code ("jit_function_add", (uintptr_t)code, sizeof (jit_function_add_code));
+
+  ((long (*)(long, long))code)(1,5); /* breakpoint 1 line */
+
+  /* "Discard" jit_function_add.  */
+  memset (code, 0, sizeof (jit_function_add_code));
+  jit_unregister_code ((uintptr_t)code);
+
+  return 0; /* breakpoint 2 line */
+}
diff --git a/gdb/testsuite/gdb.python/py-jit.exp b/gdb/testsuite/gdb.python/py-jit.exp
new file mode 100644
index 00000000000..d2aa87a16ed
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-jit.exp
@@ -0,0 +1,57 @@ 
+# Copyright (C) 2024-2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This file is part of the GDB testsuite.  It tests the Python API to
+# create symbol tables for dynamic (JIT) code and follows the example
+# code given in the documentation (see section JIT Interface in Python).
+
+load_lib gdb-python.exp
+
+require allow_python_tests
+require is_x86_64_m64_target
+
+standard_testfile
+
+if { [prepare_for_testing "failed to prepare" ${testfile} ${srcfile}] } {
+    return -1
+}
+
+if ![runto_main] {
+    return 0
+}
+
+set remote_python_file [gdb_remote_download host \
+                                ${srcdir}/${subdir}/${testfile}.py]
+gdb_test_no_output "source ${remote_python_file}" "load python file"
+
+gdb_breakpoint [gdb_get_line_number "breakpoint 1 line" ${testfile}.c]
+gdb_continue_to_breakpoint "continue to breakpoint 1 line"
+gdb_test "disassemble /s jit_function_add" \
+	"Dump of assembler code for function jit_function_add:.*End of assembler dump." \
+	"disassemble jit_function_add"
+
+gdb_breakpoint "jit_function_add"
+gdb_continue_to_breakpoint "continue to jit_function_add"
+
+gdb_test "bt 1" \
+	"#0  jit_function_add \\(\\) at py-jit.c:.*" \
+	"bt 1"
+
+gdb_breakpoint [gdb_get_line_number "breakpoint 2 line" ${testfile}.c]
+gdb_continue_to_breakpoint "continue to breakpoint 2 line"
+
+gdb_test "disassemble jit_function_add" \
+	"No symbol \"jit_function_add\" in current context." \
+	"disassemble jit_function_add after code has been unregistered"
diff --git a/gdb/testsuite/gdb.python/py-jit.py b/gdb/testsuite/gdb.python/py-jit.py
new file mode 100644
index 00000000000..b46b4ec9fe4
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-jit.py
@@ -0,0 +1,110 @@ 
+# Copyright (C) 2024-2024 Free Software Foundation, Inc.
+#
+#   This file is part of GDB.
+#
+#   This program is free software; you can redistribute it and/or modify
+#   it under the terms of the GNU General Public License as published by
+#   the Free Software Foundation; either version 3 of the License, or
+#   (at your option) any later version.
+#
+#   This program is distributed in the hope that it will be useful,
+#   but WITHOUT ANY WARRANTY; without even the implied warranty of
+#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+#   GNU General Public License for more details.
+#
+#   You should have received a copy of the GNU General Public License
+#   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This code is the same (modulo small tweaks) as the code in the
+# documentation, section "JIT Interface in Python".  If it is changed, the
+# documentation should be checked and updated accordingly if necessary.
+import gdb
+
+objfile = None
+compunit = None
+block = None
+symtab = None
+symbol = None
+
+def get_linetable_entries(addr):
+    # Entries are not in increasing order to test that
+    # gdb.LineTable.__init__() sorts them properly.
+    return [
+	gdb.LineTableEntry(31, addr+6, True),
+	gdb.LineTableEntry(29, addr, True),
+	gdb.LineTableEntry(30, addr+3, True)
+    ]
+
+
+class JITRegisterCode(gdb.Breakpoint):
+    def stop(self):
+
+        global objfile
+        global compunit
+        global block
+        global symtab
+        global symbol
+
+        frame = gdb.newest_frame()
+        name = frame.read_var('name').string()
+        addr = int(frame.read_var('code'))
+        size = int(frame.read_var('size'))
+        linetable_entries = get_linetable_entries(addr)
+
+        # Create objfile and compunit for allocated "jit" code
+        objfile = gdb.Objfile(name)
+        compunit = gdb.Compunit(name, objfile, addr, addr + size)
+
+        # Mark the objfile as "jitted code". This will be used later when
+        # unregistering discarded code to check the objfile was indeed
+        # created for jitted code.
+        setattr(objfile, "is_jit_code", True)
+
+        # Create block for jitted function
+        block = gdb.Block(compunit.static_block(), addr, addr + size)
+
+        # Create symbol table holding info about jitted function, ...
+        symtab = gdb.Symtab("py-jit.c", compunit)
+        linetable = gdb.LineTable(symtab, linetable_entries)
+
+        # ...type of the jitted function...
+        int64_t = gdb.selected_inferior().architecture().integer_type(64)
+        func_t = int64_t.function(int64_t, int64_t)
+
+        # ...and symbol representing jitted function.
+        symbol = gdb.Symbol(name, symtab, func_t,
+                            gdb.SYMBOL_FUNCTION_DOMAIN, gdb.SYMBOL_LOC_BLOCK,
+                            block)
+
+        # Finally, register the symbol in static block
+        compunit.static_block().add_symbol(symbol)
+
+        return False # Continue execution
+
+# Create breakpoint to register new code
+JITRegisterCode("jit_register_code", internal=True)
+
+
+class JITUnregisterCode(gdb.Breakpoint):
+    def stop(self):
+        frame = gdb.newest_frame()
+        addr = int(frame.read_var('code'))
+
+        objfile = gdb.current_progspace().objfile_for_address(addr)
+        if objfile is None:
+            # No objfile for given addr - bail out (this should not happen)
+            assert False
+            return False # Continue execution
+
+        if not getattr(objfile, "is_jit_code", False):
+            # Not a jitted addr - bail out (this should not happen either)
+            assert False
+            return False # Continue execution
+
+        # Remove the objfile and all debug info associated with it.
+        objfile.unlink()
+
+        return False # Continue execution
+
+# Create breakpoint to discard old code
+JITUnregisterCode("jit_unregister_code", internal=True)
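
Note on the line table entries in py-jit.py: they are listed out of order on purpose, and the comment there says gdb.LineTable.__init__() is expected to sort them. The stand-alone sketch below runs outside GDB and only illustrates the ordering by address that the test presumably relies on. `Entry` is a hypothetical stand-in for gdb.LineTableEntry, its field names are assumptions that follow the argument order used in py-jit.py, and `sorted_by_pc` is an illustrative helper, not part of the patch.

```python
from collections import namedtuple

# Hypothetical stand-in for gdb.LineTableEntry so the sketch runs without GDB.
# Field order (line, pc, is_stmt) mirrors the constructor calls in py-jit.py.
Entry = namedtuple("Entry", ["line", "pc", "is_stmt"])


def get_linetable_entries(addr):
    # Same deliberately unsorted entries as in py-jit.py.
    return [
        Entry(31, addr + 6, True),
        Entry(29, addr, True),
        Entry(30, addr + 3, True),
    ]


def sorted_by_pc(entries):
    # Assumed ordering: increasing code address.
    return sorted(entries, key=lambda entry: entry.pc)


entries = sorted_by_pc(get_linetable_entries(0x1000))
lines = [entry.line for entry in entries]
pcs = [entry.pc for entry in entries]
```

The offsets 0, 3 and 6 match the instruction boundaries of jit_function_add_code in py-jit.c (two 3-byte instructions followed by the return), so after sorting, lines 29, 30 and 31 map to consecutive instructions.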