diff mbox

[RFC] Provide the ability to write the frame unwinder in Python

Message ID 54E71694.1080304@redhat.com
State New
Headers show

Commit Message

Phil Muldoon Feb. 20, 2015, 11:12 a.m. UTC
On 12/02/15 17:58, Alexander Smundak wrote:
> On Wed, Feb 4, 2015 at 2:35 PM, Doug Evans <dje@google.com> wrote:
>> High level comments:
>> Is it possible to see the code, and example usage, of a real-life use-case
>> of this? That will help folks not familiar with this project to understand
>> the problem we are trying to solve.

I agree.

> The full implementation of the combined sniffer/frame filter for OpenJDK
> is about 2500 lines and will eventually become part of it. I am waiting for
> this GDB patch to be reviewed before I can present it to be reviewed by
> the JDK community :-)

You could expand the testcase a little?

> The revised patch is attached. The important differences are as follows:
> * A sniffer is now an object, using the pattern similar to xmethods
> and pretty printers.
> * Register values are properly types (that is, based on a register type)

Comments inline. I looked at in detail. It seems a good idea. Some
things need a little work.

+            (SP, PC)              frame_id_build (Value(SP), Value(PC))
+            (SP, PC, SPECIAL)     frame_id_build_special (Value(SP),
+                                   Value(PC), Value(SPECIAL)
+            The registers in the FRAME_ID_REGNUMS tuple should be among those
+            returned by REG_DATA.
+            The chapter "Stack Frames" in the GDB Internals Guide describes
+            frame ID.
+        """
+        raise NotImplementedError("Sniffer __call__")

There might be a super-class provided somewhere the implements the
comments above. So comments  might apply there too.

+typedef struct
+  PyObject_HEAD
+  struct frame_info *frame_info;
+} sniffer_info_object;

+/* The data we keep for a frame we can unwind: frame_id and an array of
+   (register_number, register_value) pairs.  */
+typedef struct
+  struct frame_id frame_id;
+  struct gdbarch *gdbarch;
+  int reg_count;
+  struct reg_info
+  {
+    int number;
+    gdb_byte *data;
+  } reg[];
+} cached_frame_info;

Each field in a struct needs to be documented.

+sniffer_infopy_read_register (PyObject *self, PyObject *args)
+  volatile struct gdb_exception except;
+  int regnum;
+  struct value *val = NULL;
+  if (!PyArg_ParseTuple (args, "i", &regnum))
+    return NULL;
+    {
+      /* Calling `value_of_register' will result in infinite recursion.
+         Call `deprecated_frame_register_read' instead.  */

Can you explain this comment in more detail?

+      if (deprecated_frame_register_read (frame, regnum, buffer))
+        val = value_from_contents (register_type (gdbarch, regnum), buffer);
+      if (val == NULL)
+        {
+          char *error_text
+              = xstrprintf (_("Cannot read register %d from frame"), regnum);
+          PyErr_SetString (PyExc_ValueError, error_text);
+          xfree (error_text);

I believe in this case you are overwriting the GDB generated exception
with a more generic error message. Don't do it.

+static PyObject *
+frame_info_to_sniffer_info_object (struct frame_info *frame)
+  sniffer_info_object *sniffer_info
+      = PyObject_New (sniffer_info_object, &sniffer_info_object_type);
+  if (sniffer_info != NULL)

This is a new object so you should unconditionally write the

+/* Parse Python Int, saving it at the given address. Returns 1 on success,
+   0 otherwise.  */
+static int
+pyuw_parse_int (PyObject *pyo_int, int *valuep)

There's an issue with Python 2.x and Python 3.x compatibility with
longs. Please see py-value.c:1447 comment and resulting logic to
ensure this works in both cases.

+  long long_value;
+  if (pyo_int == NULL || !PyInt_Check (pyo_int))
+    return 0;
+  long_value = PyInt_AsLong (pyo_int);
+  if (long_value != (int) long_value)
+    return 0;
+  *valuep = (int) long_value;
+  return 1;

+static int
+pyuw_sniffer (const struct frame_unwind *self, struct frame_info *this_frame,
+              void **cache_ptr)
+  struct gdbarch *gdbarch;
+  struct cleanup *cleanups;
+  struct cleanup *cached_frame_cleanups;
+  PyObject *pyo_module;
+  PyObject *pyo_execute;
+  PyObject *pyo_sniffer_info;
+  PyObject *pyo_unwind_info;
+  cached_frame_info *cached_frame = NULL;
+  gdb_assert (*cache_ptr == NULL);

Can you please document the assert, and the reason for it.

+  gdbarch = (void *)(self->unwind_data);

This cast seems odd to me.

+          if (data_size != TYPE_LENGTH (value_enclosing_type (value)))
+            {
+              error (_("The value of the register #%d returned by the "
+                       "Python sniffer has unexpected size: %u instead "
+                       "of %u"), reg->number,
+                     (unsigned)(TYPE_LENGTH (value_enclosing_type (value))),
+                     (unsigned)data_size);
+            }

Can you please document this assert reason.

+          gdb_assert ((gdb_data_free + data_size) <= gdb_data_end);
+          memcpy (gdb_data_free, value_contents (value), data_size);
+          reg->data = gdb_data_free;
+          gdb_data_free += data_size;
+        }
+      }
+  }


diff mbox


diff --git a/gdb/python/lib/gdb/command/sniffers.py b/gdb/python/lib/gdb/command/sniffers.py
new file mode 100644
index 0000000..b340434
--- /dev/null
+++ b/gdb/python/lib/gdb/command/sniffers.py
@@ -0,0 +1,198 @@ 
+# Sniffer commands.
+# Copyright 2013-2015 Free Software Foundation, Inc.

2015 only, I think, as it is new file.

+def validate_regexp(exp, idstring):
+    try:
+        return re.compile(exp)
+    except SyntaxError:
+        raise SyntaxError("invalid %s regexp: %s" % (idstring, exp))

Please make errors a complete sentence with capitalization and

+    argv = gdb.string_to_argv(arg)
+    argc = len(argv)
+    if argc > 2:
+        raise SyntaxError("too many arguments")

See above note on error messages. This and for all others that follow.

+    def list_sniffers(self, title, sniffers, name_re):
+        """Lists the sniffers whose name matches regexp.
+        Arguments:
+            title: The line to print before the list.
+            sniffers: The list of the sniffers.
+            name_re: sniffer name filter.
+        """
+        if not sniffers:
+            return
+        print title
+        for sniffer in sniffers:
+            if name_re.match(sniffer.name):
+                print("  %s%s" % (sniffer.name,
+                                  "" if sniffer.enabled else "[disabled]"))
+    def invoke(self, arg, from_tty):
+        locus_re, name_re = parse_sniffer_command_args(arg)

I think parse_sniffer_command_args should error if the loci is not
global, progspace or objectfile. Either that or the default to object
file should be documented here, and in the parse_sniffer_command_args

+        if locus_re.match("global"):
+            self.list_sniffers("global sniffers:", gdb.frame_sniffers,
+                               name_re)
+        if locus_re.match("progspace"):
+            cp = gdb.current_progspace()
+            self.list_sniffers("progspace %s sniffers:" % cp.filename,
+                               cp.frame_sniffers, name_re)
+        for objfile in gdb.objfiles():

There seems no way *not* to parse object files for sniffers? Say I set
the loci to "global", that's what I want. Given there could be 1000s
of object files, I think this default is a bit odd? I might be
misunderstanding though.

+            if locus_re.match(objfile.filename):
+                self.list_sniffers("objfile %s sniffers:" % objfile.filename,
+                                   objfile.frame_sniffers, name_re)
+def do_enable_sniffer(arg, flag):
+    """Enable/disable sniffer(s)."""
+    (locus_re, name_re) = parse_sniffer_command_args(arg)
+    total = 0
+    if locus_re.match("global"):
+        total += do_enable_sniffer1(gdb.frame_sniffers, name_re, flag)
+    if locus_re.match("progspace"):
+        total += do_enable_sniffer1(gdb.current_progspace().frame_sniffers,
+                                    name_re, flag)
+    for objfile in gdb.objfiles():

Again, in an environment where there may be  many object files this seems
a rather wasteful search if I set the loci to "global", and then we
search all object files pointlessly?

+        if locus_re.match(objfile.filename):
+            total += do_enable_sniffer1(objfile.frame_sniffers, name_re,
+                                        flag)
+    print("%d sniffer%s %s" % (total, "" if total == 1 else "s",
+                               "enabled" if flag else "disabled"))

+def execute_sniffers(sniffer_info):
+    """Internal function called from GDB to execute all sniffers.
+    Runs each currently enabled sniffer until it finds the one that can
+    unwind given frame.
+    Arguments:
+        sniffer_info: an instance of gdb.SnifferInfo.
+    Returns:
+        Unwind info or None. See the description of the value returned
+        by Sniffer.__call__ below.
+    """
+    for objfile in gdb.objfiles():
+        for sniffer in objfile.frame_sniffers:
+            if sniffer.enabled:
+                unwind_info = sniffer.__call__(sniffer_info)
+                if unwind_info is not None:
+                    return unwind_info

I think we may need a priority system. We ran into this with frame
filters. Allow me to explain a scenario, and how a user might solve

Say in the future I am debugging openjdk, and the RPM on my Fedora
system installs a bunch of frame sniffers in the GDB auto-loading
path. So they get auto-loaded when the openjdk inferior is
loaded. Good, that's what I want. But what happens if I want to
register my own frame sniffer that does additional stuff?

I could disable the auto-loaded one, then add my own I guess. But is a
frame sniffing a mutually exclusive operation? Is only one sniffer
only allowed to ever run on one frame in one discrete operation? As it
is right now, without a priority system the auto-loaded frame sniffer
GDB installed will always be run unless I figure out I have to disable
it, and my own sniffer will never be run as the auto-loaded one got
there first.

If it is mutually exclusive operation, that one frame sniffer runs,
and all else are discarded we should either:

Use a priority attribute to decide which sniffer gets run for that

Or print a warning that relevant sniffers have been discarded in favor
of another.

diff --git a/gdb/python/lib/gdb/sniffer.py b/gdb/python/lib/gdb/sniffer.py
new file mode 100644
index 0000000..6c9f327
--- /dev/null
+++ b/gdb/python/lib/gdb/sniffer.py
@@ -0,0 +1,113 @@ 
+# Copyright (C) 2013-2015

This and others, 2015 only.

+class Sniffer(object):
+    """Base class (or a template) for frame sniffers written in Python.
+    A sniffer has a single method __call__ and the attributes described below.
+    Attributes:
+        name: The name of the sniffer.
+        enabled: A boolean indicating whether the sniffer is enabled.

     priority: The priority order of the sniffers
(If you go with the idea of a priority based system)

+        Returns:
+            (REG_DATA, FRAME_ID_REGNUMS) tuple, or None if this sniffer cannot
+            unwind the frame.
+            REG_DATA is a tuple containing the registers in the unwound frame.
+            Each element in thr REG_DATA is a (REG_NUM, REG_VALUE) tuple,
+            where REG_NUM is platform-specific register number, and REG_VALUE
+            is register value (a gdb.Value instance).

See below

+            FRAME_ID_REGNUMS is a tuple specifying the registers used
+            to construct the frame ID. For instance, on x86_64,
+            FRAME_ID_REGNUMS is (7, 16), as 7 is the stack pointer
+            register (RSP), and 16 is the instruction pointer (RIP).
+            FRAME_ID_REGNUMS is a (SP), (SP, PC), or (SP, PC,

This seems like an internal detail exposed a little too much? Why not
return an object with the attributes SP, PC and SPECIAL? These could
be mapped from the Python API register methods talked of earlier in
the previous emails.  These three registers are either returned (SP
looks to be the mandatory one), or not. No additional registers other
than the three listed will be included in FRAME_ID_REGNUMS?

Does the actual number of the register matter here to the
user/implementer?  You tell the user the register number for SP, PC
and SPECIAL as you populate them in tuple order. Why make the user
jump through hops to map values to those three register numbers from
REG_DATA when you could do it for them? You could still return an
attribute containing the tuple REG_DATA if they wanted to poke around
the other registers. It seem like, though, we don't tell them what the
other registers are, so only the three named are important.

There could be a very good reason for this, but I cannot see it
myself. What do you think?

+            tuple, where SP, PC, and SPECIAL are register numbers. Depending
+            on the tuple, Python sniffer support code uses internal GDB
+            functions to construct frame ID as follows:
+            (SP)                  frame_id_build_wild (Value(SP))