Patchwork [v2] Release the GIL while running a gdb command or expression

login
register
mail settings
Submitter Tom Tromey
Date Oct. 10, 2018, 8:22 p.m.
Message ID <20181010202233.17985-1-tom@tromey.com>
Download mbox | patch
Permalink /patch/29700/
State New
Headers show

Comments

Tom Tromey - Oct. 10, 2018, 8:22 p.m.
PR python/23615 points out that gdb.execute_gdb_command does not
release the Python GIL.  This means that, while the gdb command is
running, other Python threads do not run.

This patch solves the problem by introducing a new RAII class that can
be used to temporarily release and then re-acquire the GIL, then puts
this into the appropriate places in execute_gdb_command and
gdbpy_parse_and_eval.

This version of the patch includes a test case from Phil.

gdb/ChangeLog
2018-09-07  Tom Tromey  <tom@tromey.com>

	PR python/23615:
	* python/python.c (execute_gdb_command): Use gdbpy_allow_threads.
	(gdbpy_parse_and_eval): Likewise.
	* python/python-internal.h (gdbpy_allow_threads): New class.

gdb/testsuite/ChangeLog
2018-10-10  Phil Muldoon  <pmuldoon@redhat.com>

	PR python/23615:
	* gdb.python/py-gil-mthread.c: New file.
	* gdb.python/py-gil-mthread.py: New file.
	* gdb.python/py-gil-mthread.exp: New file.
---
 gdb/ChangeLog                               |  7 +++
 gdb/python/python-internal.h                | 25 ++++++++
 gdb/python/python.c                         |  3 +
 gdb/testsuite/ChangeLog                     |  7 +++
 gdb/testsuite/gdb.python/py-gil-mthread.c   | 30 +++++++++
 gdb/testsuite/gdb.python/py-gil-mthread.exp | 68 +++++++++++++++++++++
 gdb/testsuite/gdb.python/py-gil-mthread.py  | 43 +++++++++++++
 7 files changed, 183 insertions(+)
 create mode 100644 gdb/testsuite/gdb.python/py-gil-mthread.c
 create mode 100644 gdb/testsuite/gdb.python/py-gil-mthread.exp
 create mode 100644 gdb/testsuite/gdb.python/py-gil-mthread.py
Phil Muldoon - Oct. 12, 2018, 2:51 p.m.
On 10/10/2018 21:22, Tom Tromey wrote:
> PR python/23615 points out that gdb.execute_gdb_command does not
> release the Python GIL.  This means that, while the gdb command is
> running, other Python threads do not run.
> 
> This patch solves the problem by introducing a new RAII class that can
> be used to temporarily release and then re-acquire the GIL, then puts
> this into the appropriate places in execute_gdb_command and
> gdbpy_parse_and_eval.
> 
> This version of the patch includes a test case from Phil.

I know I said this on the other email, but, Version 2 LGTM also.

Cheers

Phil
Pedro Alves - Oct. 12, 2018, 4:49 p.m.
On 10/10/2018 09:22 PM, Tom Tromey wrote:

> index 0000000000..92d103a963
> --- /dev/null
> +++ b/gdb/testsuite/gdb.python/py-gil-mthread.c
> @@ -0,0 +1,30 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2018 Free Software Foundation, Inc.

Fedora's local version has copyright 2014, so this should
be 2014-2018.

> +#include <stdio.h>
> +#include <unistd.h>
> +
> +int
> +main (void)
> +{
> +  int i;
> +  for (i = 0; i < 10; i++)
> +    {
> +      sleep (1); /* break-here */
> +      printf ("Sleeping %d\n", i);
> +    }
> +}
> diff --git a/gdb/testsuite/gdb.python/py-gil-mthread.exp b/gdb/testsuite/gdb.python/py-gil-mthread.exp
> new file mode 100644
> index 0000000000..6a65346a8d

Please add a describing comment mentioning what the
testcase is about.  The testcase isn't 

> +
> +standard_testfile .c .py
> +set executable $testfile
> +
> +if { [prepare_for_testing $testfile.exp $executable $srcfile] } {
> +    return -1
> +}
> +
> +# Skip all tests if Python scripting is not enabled.
> +if { [skip_python_tests] } { continue }
> +
> +if ![runto_main] {
> +    return -1
> +}
> +

The test relies on stdio, so there should be a gdb_skip_stdio_test
or gdb,noinferiorio check here somewhere.

> +gdb_breakpoint $srcfile:[gdb_get_line_number "break-here"] temporary
> +gdb_continue_to_breakpoint "break-here" ".* break-here .*"
> +
> +set test "response"
> +set timeout 60

Do we need to change the timeout?  

Should this use with_timeout_factor?

> +set sleeping_last -1
> +set hello_last 0
> +set minimal 5
> +gdb_test_multiple "python exec (open ('$srcdir/$subdir/$srcfile2').read ())" $test {

I don't think this would work with remote host testing.
AFAIC, python tests usually "gdb_remote_download host"
the python script.

> +    -re "Error: unable to start thread\r\n" {
> +	fail $test
> +    }
> +    -re "Sleeping (\[0-9\]+)\r\n" {

This is inferior output, so should be expected on
$inferior_spawn_id.   Maybe you can tweak the
test to not need this, like run to a breakpoint.
No idea whether that makes sense.

Please double check the testcase works against
--target_board=native-gdbserver/native-extended-gdbserver.

Thanks,
Pedro Alves

> +	set n $expect_out(1, string)
> +	if { $sleeping_last + 1 != $n } {
> +	    fail $test
> +	} else {
> +	    set sleeping_last $n
> +	    if { $sleeping_last >= $minimal && $hello_last >= $minimal } {
> +		pass $test
> +	    } else {
> +		exp_continue
> +	    }
> +	}
> +    }
> +    -re "Hello \\( (\[0-9\]+) \\)\r\n" {
> +	set n $expect_out(1,string)
> +	if { $hello_last + 1 != $n } {
> +	    fail $test
> +	} else {
> +	    set hello_last $n
> +	    if { $sleeping_last >= $minimal && $hello_last >= $minimal } {
> +		pass $test
> +	    } else {
> +		exp_continue
> +	    }
> +	}
> +    }
> +}
Pedro Alves - Oct. 12, 2018, 4:51 p.m.
On 10/12/2018 05:49 PM, Pedro Alves wrote:

>> diff --git a/gdb/testsuite/gdb.python/py-gil-mthread.exp b/gdb/testsuite/gdb.python/py-gil-mthread.exp
>> new file mode 100644
>> index 0000000000..6a65346a8d
> Please add a describing comment mentioning what the
> testcase is about.  The testcase isn't 

I meant to say that it isn't obvious what the testcase is about.

Thanks,
Pedro Alves
Pedro Alves - Oct. 12, 2018, 4:56 p.m.
On 10/10/2018 09:22 PM, Tom Tromey wrote:
> +
> +# Define a function for the thread

While at it, add missing periods?

> +def print_thread_hello():
> +   count = 0
> +   while count < 10:
> +      time.sleep(1)
> +      count += 1
> +      print ("Hello ( %d )" % count)
> +
> +# Create a thread and continue

Ditto.

> +try:
> +   thread.start_new_thread (print_thread_hello, ())
> +   gdb.execute ("continue", release_gil=True)
> +except:
> +   try:
> +      _thread.start_new_thread (print_thread_hello, ())
> +      gdb.execute ("continue", release_gil=True)

But what I came here for, is:

AFAICT, "release_gil" was not included in this version of
the patch, so that shouldn't be here either.  

The test should be failing because of that, I believe,
because GDB should be complaining that release_gil
is an invalid keyword.  Does the test somehow
happen to still pass with that?

> +   except:
> +      print ("Error: unable to start thread")

Thanks,
Pedro Alves
Tom Tromey - Oct. 16, 2018, 12:48 p.m.
>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:

>> @@ -0,0 +1,30 @@
>> +/* This testcase is part of GDB, the GNU debugger.
>> +
>> +   Copyright 2018 Free Software Foundation, Inc.

Pedro> Fedora's local version has copyright 2014, so this should
Pedro> be 2014-2018.

I changed this because I was under the impression that the years should
start at the date of submission.

Also the other files didn't have copyright headers.  I added them, and
now I've put 2014 in there as well.

Tom
Tom Tromey - Oct. 16, 2018, 1:04 p.m.
>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:

Pedro> AFAICT, "release_gil" was not included in this version of
Pedro> the patch, so that shouldn't be here either.  

Oops, forgot to remove that.

Pedro> The test should be failing because of that, I believe,
Pedro> because GDB should be complaining that release_gil
Pedro> is an invalid keyword.  Does the test somehow
Pedro> happen to still pass with that?

Yeah, it did somehow.

Tom
Pedro Alves - Oct. 16, 2018, 2:50 p.m.
On 10/16/2018 01:48 PM, Tom Tromey wrote:
>>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:
> 
>>> @@ -0,0 +1,30 @@
>>> +/* This testcase is part of GDB, the GNU debugger.
>>> +
>>> +   Copyright 2018 Free Software Foundation, Inc.
> 
> Pedro> Fedora's local version has copyright 2014, so this should
> Pedro> be 2014-2018.
> 
> I changed this because I was under the impression that the years should
> start at the date of submission.

The guidance we have is that the copyright year range starts the day
the work was committed to a medium (file system):

 https://sourceware.org/ml/gdb-patches/2014-02/msg00859.html

> 
> Also the other files didn't have copyright headers.  I added them, and
> now I've put 2014 in there as well.

Thanks,
Pedro Alves
Pedro Alves - Oct. 16, 2018, 2:51 p.m.
On 10/16/2018 02:04 PM, Tom Tromey wrote:
>>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:
> 
> Pedro> AFAICT, "release_gil" was not included in this version of
> Pedro> the patch, so that shouldn't be here either.  
> 
> Oops, forgot to remove that.
> 
> Pedro> The test should be failing because of that, I believe,
> Pedro> because GDB should be complaining that release_gil
> Pedro> is an invalid keyword.  Does the test somehow
> Pedro> happen to still pass with that?
> 
> Yeah, it did somehow.
Hopefully it'll be possible to tighten the test a bit to avoid that.

Thanks,
Pedro Alves
Tom Tromey - Oct. 16, 2018, 9:39 p.m.
Pedro> Hopefully it'll be possible to tighten the test a bit to avoid that.

I should have read the test more closely.  It doesn't do what it claims
to do at all.

The only way the test can fail is if one "Hello" or "Sleeping" line
comes with the wrong number:

	set n $expect_out(1,string)
	if { $hello_last + 1 != $n } {
	    fail $test

But this just isn't possible.

Also the Python code is calling print on a gdb stream in a background
thread, which is a no-no.

I could change the test to do some kind of counting in a background
thread.  But that would introduce a race, which is what I was concerned
about up-thread... I couldn't think of a non-racy way to test this.

Maybe that's better than nothing.  Anyway, we're back to square one and
I don't know what to do really.

Tom
Tom Tromey - Oct. 16, 2018, 9:45 p.m.
>>>>> "Tom" == Tom Tromey <tom@tromey.com> writes:

Tom> Also the Python code is calling print on a gdb stream in a background
Tom> thread, which is a no-no.

If you ignore this then I have a version that works.  Racy in theory but
maybe not in practice.

Tom
Phil Muldoon - Oct. 18, 2018, 9:31 a.m.
On 16/10/2018 22:39, Tom Tromey wrote:
> Pedro> Hopefully it'll be possible to tighten the test a bit to avoid that.
> 
> I should have read the test more closely.  It doesn't do what it claims
> to do at all.

I'm not sure what it or I claimed? The initial snippet and test was included
with this explanation:

"As for a test, we also have a test included. It does not appear to be
racy for our purposes. I also include it for consideration. This
snippet is just for initial consideration and will make any changes
needed to include it in the patch."

This was from the original (version 1) email.

> 
> The only way the test can fail is if one "Hello" or "Sleeping" line
> comes with the wrong number:
> 
> 	set n $expect_out(1,string)
> 	if { $hello_last + 1 != $n } {
> 	    fail $test
> 
> But this just isn't possible.

I can't remember why I wrote it this way, or if there was a
modification to the patch over the years, but, the original intention
to the patch was to test interleaved output from the Python thread and
GDB (thereby proving the GIL was released).

> 
> Also the Python code is calling print on a gdb stream in a background
> thread, which is a no-no.

Yeah, this came up in the original review thread. You pointed out that
GDB wasn't thread safe when I was worried over any existing scripts
out there that would rely on the expectation that the gdb.command API
would hold the GIL. For some reason, I did not know this beforehand
(that GDB wasn't thread safe), so that is why it snuck in there (back
in 2014).

Cheers

Phil
Pedro Alves - Oct. 24, 2018, 6:08 p.m.
On 10/16/2018 10:45 PM, Tom Tromey wrote:
>>>>>> "Tom" == Tom Tromey <tom@tromey.com> writes:
> 
> Tom> Also the Python code is calling print on a gdb stream in a background
> Tom> thread, which is a no-no.
> 
> If you ignore this then I have a version that works.  Racy in theory but
> maybe not in practice.
We have plenty of tests that shouldn't be racy in theory but are
in practice.  So I'd instead assume that in practice someone will
definitely run into the race.

Do we really need to rely on printing to check this?  If the
the gdb.execute command can run some more python code, then
we could try using a couple python mutexes for proving the
non-main thread runs.  

So the non-main thread would wait on mutex1 which starts owned
by the main thread.  The main thread unlocks mutex1 and blocks
on mutex2, waiting for the non-main thread to release it.
The non-main thread should now run, and is now the mutex1 owner.
It now releases mutex2.  The main thread now unblocks, and the
test succeeds.  If we don't release the GIL properly, then
the non-main thread won't run, and the testcase times out.

Or something along those lines.

Thanks,
Pedro Alves
Phil Muldoon - Oct. 25, 2018, 12:45 p.m.
On 24/10/2018 19:08, Pedro Alves wrote:

> Do we really need to rely on printing to check this?  If the
> the gdb.execute command can run some more python code, then
> we could try using a couple python mutexes for proving the
> non-main thread runs.  

I agree.

 
> So the non-main thread would wait on mutex1 which starts owned
> by the main thread.  The main thread unlocks mutex1 and blocks
> on mutex2, waiting for the non-main thread to release it.
> The non-main thread should now run, and is now the mutex1 owner.
> It now releases mutex2.  The main thread now unblocks, and the
> test succeeds.  If we don't release the GIL properly, then
> the non-main thread won't run, and the testcase times out.
> 
> Or something along those lines.

If Tom doesn't have the time to work on this, I can find some time,
but it is entirely up to Tom.

Cheers

Phil
Tom Tromey - Nov. 4, 2018, 3:54 p.m.
>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:

Tom> If you ignore this then I have a version that works.  Racy in theory but
Tom> maybe not in practice.

Pedro> We have plenty of tests that shouldn't be racy in theory but are
Pedro> in practice.  So I'd instead assume that in practice someone will
Pedro> definitely run into the race.

Pedro> Do we really need to rely on printing to check this?  If the
Pedro> the gdb.execute command can run some more python code, then
Pedro> we could try using a couple python mutexes for proving the
Pedro> non-main thread runs.  

Pedro> So the non-main thread would wait on mutex1 which starts owned
Pedro> by the main thread.  The main thread unlocks mutex1 and blocks
Pedro> on mutex2, waiting for the non-main thread to release it.
Pedro> The non-main thread should now run, and is now the mutex1 owner.
Pedro> It now releases mutex2.  The main thread now unblocks, and the
Pedro> test succeeds.  If we don't release the GIL properly, then
Pedro> the non-main thread won't run, and the testcase times out.

It took me a little experimenting to understand why this can't work.

The main thread can't run any Python code at all.  If it does run Python
code, then Python might release the GIL.  And, when dealing with thread
primitives like mutexes or whatnot, it certainly will.  However, if the
main thread does release the GIL, then that invalidates the test --
since we are trying to test that the GIL is released under ordinary
circumstances.

I'm back to not seeing a way to test this.

Tom

Patch

diff --git a/gdb/python/python-internal.h b/gdb/python/python-internal.h
index 1812abb5b7..de84bf7461 100644
--- a/gdb/python/python-internal.h
+++ b/gdb/python/python-internal.h
@@ -627,6 +627,31 @@  class gdbpy_enter_varobj : public gdbpy_enter
 
 };
 
+/* The opposite of gdb_enter: this releases the GIL around a region,
+   allowing other Python threads to run.  No Python APIs may be used
+   while this is active.  */
+class gdbpy_allow_threads
+{
+public:
+
+  gdbpy_allow_threads ()
+    : m_save (PyEval_SaveThread ())
+  {
+    gdb_assert (m_save != nullptr);
+  }
+
+  ~gdbpy_allow_threads ()
+  {
+    PyEval_RestoreThread (m_save);
+  }
+
+  DISABLE_COPY_AND_ASSIGN (gdbpy_allow_threads);
+
+private:
+
+  PyThreadState *m_save;
+};
+
 extern struct gdbarch *python_gdbarch;
 extern const struct language_defn *python_language;
 
diff --git a/gdb/python/python.c b/gdb/python/python.c
index 8fbce78469..5c5dc07849 100644
--- a/gdb/python/python.c
+++ b/gdb/python/python.c
@@ -585,6 +585,8 @@  execute_gdb_command (PyObject *self, PyObject *args, PyObject *kw)
 
   TRY
     {
+      gdbpy_allow_threads allow_threads;
+
       struct interp *interp;
 
       std::string arg_copy = arg;
@@ -908,6 +910,7 @@  gdbpy_parse_and_eval (PyObject *self, PyObject *args)
 
   TRY
     {
+      gdbpy_allow_threads allow_threads;
       result = parse_and_eval (expr_str);
     }
   CATCH (except, RETURN_MASK_ALL)
diff --git a/gdb/testsuite/gdb.python/py-gil-mthread.c b/gdb/testsuite/gdb.python/py-gil-mthread.c
new file mode 100644
index 0000000000..92d103a963
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-gil-mthread.c
@@ -0,0 +1,30 @@ 
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2018 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see  <http://www.gnu.org/licenses/>.  */
+
+#include <stdio.h>
+#include <unistd.h>
+
+int
+main (void)
+{
+  int i;
+  for (i = 0; i < 10; i++)
+    {
+      sleep (1); /* break-here */
+      printf ("Sleeping %d\n", i);
+    }
+}
diff --git a/gdb/testsuite/gdb.python/py-gil-mthread.exp b/gdb/testsuite/gdb.python/py-gil-mthread.exp
new file mode 100644
index 0000000000..6a65346a8d
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-gil-mthread.exp
@@ -0,0 +1,68 @@ 
+# Copyright (C) 2018 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+standard_testfile .c .py
+set executable $testfile
+
+if { [prepare_for_testing $testfile.exp $executable $srcfile] } {
+    return -1
+}
+
+# Skip all tests if Python scripting is not enabled.
+if { [skip_python_tests] } { continue }
+
+if ![runto_main] {
+    return -1
+}
+
+gdb_breakpoint $srcfile:[gdb_get_line_number "break-here"] temporary
+gdb_continue_to_breakpoint "break-here" ".* break-here .*"
+
+set test "response"
+set timeout 60
+set sleeping_last -1
+set hello_last 0
+set minimal 5
+gdb_test_multiple "python exec (open ('$srcdir/$subdir/$srcfile2').read ())" $test {
+    -re "Error: unable to start thread\r\n" {
+	fail $test
+    }
+    -re "Sleeping (\[0-9\]+)\r\n" {
+	set n $expect_out(1, string)
+	if { $sleeping_last + 1 != $n } {
+	    fail $test
+	} else {
+	    set sleeping_last $n
+	    if { $sleeping_last >= $minimal && $hello_last >= $minimal } {
+		pass $test
+	    } else {
+		exp_continue
+	    }
+	}
+    }
+    -re "Hello \\( (\[0-9\]+) \\)\r\n" {
+	set n $expect_out(1,string)
+	if { $hello_last + 1 != $n } {
+	    fail $test
+	} else {
+	    set hello_last $n
+	    if { $sleeping_last >= $minimal && $hello_last >= $minimal } {
+		pass $test
+	    } else {
+		exp_continue
+	    }
+	}
+    }
+}
diff --git a/gdb/testsuite/gdb.python/py-gil-mthread.py b/gdb/testsuite/gdb.python/py-gil-mthread.py
new file mode 100644
index 0000000000..beb4186b55
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-gil-mthread.py
@@ -0,0 +1,43 @@ 
+# Copyright (C) 2018 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+try:
+   import thread
+except:
+   import _thread
+import time
+import gdb
+
+# Define a function for the thread
+def print_thread_hello():
+   count = 0
+   while count < 10:
+      time.sleep(1)
+      count += 1
+      print ("Hello ( %d )" % count)
+
+# Create a thread and continue
+try:
+   thread.start_new_thread (print_thread_hello, ())
+   gdb.execute ("continue", release_gil=True)
+except:
+   try:
+      _thread.start_new_thread (print_thread_hello, ())
+      gdb.execute ("continue", release_gil=True)
+   except:
+      print ("Error: unable to start thread")
+
+while 1:
+   pass