Merge handle_inferior_event and handle_inferior_event_1

Message ID 20190320140846.13031-1-tromey@adacore.com
State New, archived

Commit Message

Tom Tromey March 20, 2019, 2:08 p.m. UTC
  I noticed that handle_inferior_event is just a small wrapper that
frees the value chain.  This patch replaces it with a
scoped_value_mark, reducing the number of lines of code here.

Regression tested on x86-64 Fedora 29.

gdb/ChangeLog
2019-03-20  Tom Tromey  <tromey@adacore.com>

	* infrun.c (handle_inferior_event): Rename from
	handle_inferior_event_1.  Create a scoped_value_mark.
	(handle_inferior_event): Remove.
---
 gdb/ChangeLog |  6 ++++++
 gdb/infrun.c  | 23 ++++++-----------------
 2 files changed, 12 insertions(+), 17 deletions(-)
  

Comments

Pedro Alves March 20, 2019, 4:20 p.m. UTC | #1
On 03/20/2019 02:08 PM, Tom Tromey wrote:
> I noticed that handle_inferior_event is just a small wrapper that
> frees the value chain.  This patch replaces it with a
> scoped_value_mark, reducing the number of lines of code here.
> 
> Regression tested on x86-64 Fedora 29.
> 
> gdb/ChangeLog
> 2019-03-20  Tom Tromey  <tromey@adacore.com>
> 
> 	* infrun.c (handle_inferior_event): Rename from
> 	handle_inferior_event_1.  Create a scoped_value_mark.
> 	(handle_inferior_event): Remove.

OK.

Thanks,
Pedro Alves
  
Sergio Durigan Junior March 21, 2019, 10:05 p.m. UTC | #2
On Wednesday, March 20 2019, Tom Tromey wrote:

> I noticed that handle_inferior_event is just a small wrapper that
> frees the value chain.  This patch replaces it with a
> scoped_value_mark, reducing the number of lines of code here.
>
> Regression tested on x86-64 Fedora 29.

A few days ago (March 18th) a user reported a bug against Fedora GDB:

https://bugzilla.redhat.com/show_bug.cgi?id=1690120

And, after hours and hours investigating this, I found that your commit
actually fixes it.  Not sure if it was intended or not, but what a
coincidence...

Thanks,
  
Sergio Durigan Junior March 22, 2019, 12:01 a.m. UTC | #3
On Thursday, March 21 2019, I wrote:

> On Wednesday, March 20 2019, Tom Tromey wrote:
>
>> I noticed that handle_inferior_event is just a small wrapper that
>> frees the value chain.  This patch replaces it with a
>> scoped_value_mark, reducing the number of lines of code here.
>>
>> Regression tested on x86-64 Fedora 29.
>
> A few days ago (March 18th) a user reported a bug against Fedora GDB:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1690120
>
> And, after hours and hours investigating this, I found that your commit
> actually fixes it.  Not sure if it was intended or not, but what a
> coincidence...

Of course, something I'd also like to ask is: do you think this change
is trivial enough to be included in the 8.3 release?  I'd like to see it
happen.  I can create a bug if needed.

Thanks,
  
Tom Tromey March 25, 2019, 3:35 p.m. UTC | #4
>>>>> ">" == Sergio Durigan Junior <sergiodj@redhat.com> writes:

>> A few days ago (March 18th) a user reported a bug against Fedora GDB:
>> 
>> https://bugzilla.redhat.com/show_bug.cgi?id=1690120
>> 
>> And, after hours and hours investigating this, I found that your commit
>> actually fixes it.  Not sure if it was intended or not, but what a
>> coincidence...

I am curious to know how this change could fix this bug.
That seems mysterious to me.

>> Of course, something I'd also like to ask is: do you think this change
>> is trivial enough to be included in the 8.3 release?  I'd like to see it
>> happen.  I can create a bug if needed.

I would have thought so but the fact that it causes this behavior makes
me worry a bit.  Maybe that's backward!

Tom
  
Pedro Alves March 25, 2019, 4:34 p.m. UTC | #5
On 03/25/2019 03:35 PM, Tom Tromey wrote:
>>>>>> ">" == Sergio Durigan Junior <sergiodj@redhat.com> writes:
> 
>>> A few days ago (March 18th) a user reported a bug against Fedora GDB:
>>>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1690120
>>>
>>> And, after hours and hours investigating this, I found that your commit
>>> actually fixes it.  Not sure if it was intended or not, but what a
>>> coincidence...
> 
> I am curious to know how this change could fix this bug.
> That seems mysterious to me.

There's one behavior change the patch makes, which is that before the patch,
if handle_inferior_event_1 threw an error, value_free_to_mark would not
be called.  After the patch, it's always called, from scoped_value_mark's dtor.
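
The difference can be sketched outside GDB.  The following is only a toy
model, not GDB's code: temp_values, toy_scoped_mark, handle_event and the
two wrappers are hypothetical stand-ins for the value chain,
scoped_value_mark, handle_inferior_event_1 and the old/new wrapper shapes.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Toy stand-in for the chain of temporary values (hypothetical).
static std::vector<int> temp_values;

// Toy RAII mark: remembers the chain length on construction and trims
// the chain back to it on destruction, including during unwinding.
class toy_scoped_mark
{
public:
  toy_scoped_mark () : m_size (temp_values.size ()) {}
  ~toy_scoped_mark () { temp_values.resize (m_size); }
  toy_scoped_mark (const toy_scoped_mark &) = delete;
  toy_scoped_mark &operator= (const toy_scoped_mark &) = delete;

private:
  std::size_t m_size;
};

// Simulated event handler: creates one temporary, then optionally throws.
static void
handle_event (bool fail)
{
  temp_values.push_back (42);
  if (fail)
    throw std::runtime_error ("simulated error");
}

// Old shape: manual mark and free.  The free is skipped on a throw.
static void
manual_wrapper (bool fail)
{
  std::size_t mark = temp_values.size ();
  handle_event (fail);
  temp_values.resize (mark);
}

// New shape: the destructor frees on normal and exceptional exit alike.
static void
scoped_wrapper (bool fail)
{
  toy_scoped_mark free_values;
  handle_event (fail);
}

// Run WRAPPER on an empty chain and report how many temporaries remain.
static std::size_t
leaked_after (void (*wrapper) (bool), bool fail)
{
  temp_values.clear ();
  try
    {
      wrapper (fail);
    }
  catch (const std::runtime_error &)
    {
      // Swallow the simulated error; only the leftover count matters.
    }
  return temp_values.size ();
}
```

In this model, leaked_after (manual_wrapper, true) leaves one temporary
behind while leaked_after (scoped_wrapper, true) leaves none; without an
exception both wrappers leave none.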

From the backtrace in the bug report, we see that the crash happens late
during teardown, destroying an xmethod value in the global "all_values" vector:

 ...
 #13 std::vector<gdb::ref_ptr<value, value_ref_policy>, std::allocator<gdb::ref_ptr<value, value_ref_policy> > >::~vector (this=0x559d15b24a10 <all_values>, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/stl_vector.h:677
 No locals.
 #14 0x00007f92a8fe0700 in __run_exit_handlers () from /lib64/libc.so.6
 No symbol table info available.
 #15 0x00007f92a8fe0840 in exit () from /lib64/libc.so.6
 No symbol table info available.
 ...

So it looks like your patch made it so that the value in question
avoids the crash because the value in question is no longer in the
all_values vector by the time gdb is tearing down?

Open questions then would be:

 - what was the exception in question that was thrown from
   within handle_inferior_event_1?  (And if there was
   no exception, then the theory above is incorrect.)

 - why don't non-temporary xmethod values put in the value history
   cause a similar crash at tear down time?

>>> Of course, something I'd also like to ask is: do you think this change
>>> is trivial enough to be included in the 8.3 release?  I'd like to see it
>>> happen.  I can create a bug if needed.
> 
> I would have thought so but the fact that it causes this behavior makes
> me worry a bit.  Maybe that's backward!

Thanks,
Pedro Alves
  
Tom Tromey March 25, 2019, 4:46 p.m. UTC | #6
>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:

Pedro> So it looks like your patch made it so that the value in question
Pedro> avoids the crash because the value in question is no longer in the
Pedro> all_values vector by the time gdb is tearing down?

Makes sense.

Pedro> Open questions then would be:

Pedro>  - what was the exception in question that was thrown from
Pedro>    within handle_inferior_event_1?  (And if there was
Pedro>    no exception, then the theory above is incorrect.)

Pedro>  - why don't non-temporary xmethod values put in the value history
Pedro>    cause a similar crash at tear down time?

Also, if a crash can occur during value destruction like this, then I
think it must indicate a bug somewhere else, like invalid reference
counting somewhere.

Tom
  
Sergio Durigan Junior March 25, 2019, 10:20 p.m. UTC | #7
On Monday, March 25 2019, Tom Tromey wrote:

>>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:
>
> Pedro> So it looks like your patch made it so that the value in question
> Pedro> avoids the crash because the value in question is no longer in the
> Pedro> all_values vector by the time gdb is tearing down?
>
> Makes sense.

Thanks Pedro for the explanation.  Yeah, after investigating it became
clear to me that the patch fixes (or "fixes") the bug by destroying the
value instances before we exit GDB.

What's interesting is the fact that I cannot seem to reproduce it easily.
For example, if you print anything other than an xmethod, and then you
print an xmethod, and then exit GDB, everything works correctly.  For
that reason, I'm still not able to write a testcase for it, because our
testing infrastructure prints some stuff as a preparation for the test.

> Pedro> Open questions then would be:
>
> Pedro>  - what was the exception in question that was thrown from
> Pedro>    within handle_inferior_event_1?  (And if there was
> Pedro>    no exception, then the theory above is incorrect.)
>
> Pedro>  - why don't non-temporary xmethod values put in the value history
> Pedro>    cause a similar crash at tear down time?
>
> Also, if a crash can occur during value destruction like this, then I
> think it must indicate a bug somewhere else, like invalid reference
> counting somewhere.

That's a good point, yeah.  I can try investigating more.

Having said that, I think the patch is simple enough to be safely
backportable to 8.3, and it doesn't really break anything (in fact, as
Pedro mentioned, it makes things more correct, because now we're sure
that we will destroy the objects if an exception is thrown), so IMHO
there's no downside to it.  WDYT?

Thanks,
  
Tom Tromey March 26, 2019, 1:13 p.m. UTC | #8
>>>>> "Sergio" == Sergio Durigan Junior <sergiodj@redhat.com> writes:

>> Also, if a crash can occur during value destruction like this, then I
>> think it must indicate a bug somewhere else, like invalid reference
>> counting somewhere.

Sergio> That's a good point, yeah.  I can try investigating more.

Maybe one idea would be to back out my patch and then run the failing
test under valgrind.

Sergio> Having said that, I think the patch is simple enough to be safely
Sergio> backportable to 8.3, and it doesn't really break anything (in fact, as
Sergio> Pedro mentioned, it makes things more correct, because now we're sure
Sergio> that we will destroy the objects if an exception is thrown), so IMHO
Sergio> there's no downside to it.  WDYT?

I agree.

Tom
  
Sergio Durigan Junior March 26, 2019, 4:07 p.m. UTC | #9
On Tuesday, March 26 2019, Tom Tromey wrote:

>>>>>> "Sergio" == Sergio Durigan Junior <sergiodj@redhat.com> writes:
>
>>> Also, if a crash can occur during value destruction like this, then I
>>> think it must indicate a bug somewhere else, like invalid reference
>>> counting somewhere.
>
> Sergio> That's a good point, yeah.  I can try investigating more.
>
> Maybe one idea would be to back out my patch and then run the failing
> test under valgrind.

Thanks.

> Sergio> Having said that, I think the patch is simple enough to be safely
> Sergio> backportable to 8.3, and it doesn't really break anything (in fact, as
> Sergio> Pedro mentioned, it makes things more correct, because now we're sure
> Sergio> that we will destroy the objects if an exception is thrown), so IMHO
> Sergio> there's no downside to it.  WDYT?
>
> I agree.

Thanks, Tom.  I created
<https://sourceware.org/bugzilla/show_bug.cgi?id=24391>.  Would you like
me to go ahead and backport the patch to the 8.3 branch, or do you want
to do that?

Thanks,
  
Tom Tromey March 26, 2019, 6:45 p.m. UTC | #10
>>>>> "Sergio" == Sergio Durigan Junior <sergiodj@redhat.com> writes:

Sergio> Thanks, Tom.  I created
Sergio> <https://sourceware.org/bugzilla/show_bug.cgi?id=24391>.  Would you like
Sergio> me to go ahead and backport the patch to the 8.3 branch, or do you want
Sergio> to do that?

Please do it!  Thanks.

Tom
  
Sergio Durigan Junior March 26, 2019, 9:50 p.m. UTC | #11
On Tuesday, March 26 2019, I wrote:

> On Tuesday, March 26 2019, Tom Tromey wrote:
>
>>>>>>> "Sergio" == Sergio Durigan Junior <sergiodj@redhat.com> writes:
>>
>>>> Also, if a crash can occur during value destruction like this, then I
>>>> think it must indicate a bug somewhere else, like invalid reference
>>>> counting somewhere.
>>
>> Sergio> That's a good point, yeah.  I can try investigating more.
>>
>> Maybe one idea would be to back out my patch and then run the failing
>> test under valgrind.
>
> Thanks.

I did that, and we (Pedro, Mark, Frank and I) did a session of
collective investigation.  I summarized what we found here:

  https://bugzilla.redhat.com/show_bug.cgi?id=1690120#c25

Thanks,
  
Tom Tromey March 27, 2019, 12:57 p.m. UTC | #12
Sergio> I did that, and we (Pedro, Mark, Frank and I) did a session of
Sergio> collective investigation.  I summarized what we found here:
Sergio>   https://bugzilla.redhat.com/show_bug.cgi?id=1690120#c25

Thanks, that's very interesting.
I suppose either better control over the order of destruction is needed,
or maybe finalize_python should clear gdb_python_initialized and then
this should be checked in the xmethod destructor.
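
As a rough illustration of the second option, here is a toy model of a
destructor that consults an "interpreter is still initialized" flag before
calling back into the interpreter.  Every name in it (toy_interp_*,
toy_worker) is hypothetical; none of this is GDB's or Python's real API,
and it assumes the problem is a destructor running after finalization.

```cpp
#include <cassert>

// Hypothetical stand-ins for an embedded interpreter's lifetime state.
static bool toy_interp_initialized = false;
static int toy_releases_into_interp = 0;

static void toy_interp_init () { toy_interp_initialized = true; }
static void toy_interp_finalize () { toy_interp_initialized = false; }

// Owns an object that lives inside the toy interpreter.
struct toy_worker
{
  ~toy_worker ()
  {
    // Only call back into the interpreter while it is still alive;
    // after finalization the release is skipped instead of crashing.
    if (toy_interp_initialized)
      ++toy_releases_into_interp;
  }
};

// Destroy one worker either before or after the interpreter has been
// finalized, and report how many releases reached the interpreter.
static int
releases_when_destroyed (bool finalize_first)
{
  toy_releases_into_interp = 0;
  toy_interp_init ();
  {
    toy_worker w;
    if (finalize_first)
      toy_interp_finalize ();
  }
  toy_interp_finalize ();
  return toy_releases_into_interp;
}
```

The trade-off in this sketch is that a worker destroyed after finalization
silently skips its release, which avoids the late call but leaves the
ordering problem itself in place.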

Tom
  
Pedro Alves March 28, 2019, 2:12 p.m. UTC | #13
On 03/27/2019 12:57 PM, Tom Tromey wrote:
> Sergio> I did that, and we (Pedro, Mark, Frank and I) did a session of
> Sergio> collective investigation.  I summarized what we found here:
> Sergio>   https://bugzilla.redhat.com/show_bug.cgi?id=1690120#c25
> 
> Thanks, that's very interesting.
> I suppose either better control over the order of destruction is needed,
> or maybe finalize_python should clear gdb_python_initialized and then
> this should be checked in the xmethod destructor.

I think the former is better.  I think we should put an
  all_values.clear ();
call in quit_force, before the do_final_cleanups call.  Even
better, add a new finalize_values function next to
_initialize_values, and call that.
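
A minimal sketch of that idea, using a toy model rather than GDB's real
code: toy_all_values, toy_finalize_values, toy_final_cleanups and toy_quit
are hypothetical stand-ins for all_values, the proposed finalize_values,
do_final_cleanups and quit_force.  The point is only that the values are
destroyed explicitly, at a known point, instead of by the static
destructor that runs from exit ().

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Records the order in which teardown steps happen.
static std::vector<std::string> toy_log;

struct toy_value
{
  ~toy_value () { toy_log.push_back ("value destroyed"); }
};

// Toy global value list (declared after toy_log, so it is destroyed
// before toy_log at exit).
static std::vector<std::unique_ptr<toy_value>> toy_all_values;

// Proposed explicit teardown step: drop every value now.
static void
toy_finalize_values ()
{
  toy_all_values.clear ();
}

// Stand-in for the remaining final cleanups.
static void
toy_final_cleanups ()
{
  toy_log.push_back ("final cleanups");
}

// Stand-in for the quit path: create one value, then tear down with
// the values finalized before the final cleanups run.
static std::vector<std::string>
toy_quit ()
{
  toy_all_values.clear ();
  toy_log.clear ();
  toy_all_values.emplace_back (new toy_value);
  toy_finalize_values ();
  toy_final_cleanups ();
  return toy_log;
}
```

In this model toy_quit () logs "value destroyed" before "final cleanups",
and nothing is left in toy_all_values for the exit-time destructor to
touch.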

Thanks,
Pedro Alves
  

Patch

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 550fbe7f5b9..ad7892105a4 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -4591,8 +4591,13 @@  handle_no_resumed (struct execution_control_state *ecs)
    once).  */
 
 static void
-handle_inferior_event_1 (struct execution_control_state *ecs)
+handle_inferior_event (struct execution_control_state *ecs)
 {
+  /* Make sure that all temporary struct value objects that were
+     created during the handling of the event get deleted at the
+     end.  */
+  scoped_value_mark free_values;
+
   enum stop_kind stop_soon;
 
   if (ecs->ws.kind == TARGET_WAITKIND_IGNORE)
@@ -5189,22 +5194,6 @@  Cannot fill $_exitsignal with the correct signal number.\n"));
     }
 }
 
-/* A wrapper around handle_inferior_event_1, which also makes sure
-   that all temporary struct value objects that were created during
-   the handling of the event get deleted at the end.  */
-
-static void
-handle_inferior_event (struct execution_control_state *ecs)
-{
-  struct value *mark = value_mark ();
-
-  handle_inferior_event_1 (ecs);
-  /* Purge all temporary values created during the event handling,
-     as it could be a long time before we return to the command level
-     where such values would otherwise be purged.  */
-  value_free_to_mark (mark);
-}
-
 /* Restart threads back to what they were trying to do back when we
    paused them for an in-line step-over.  The EVENT_THREAD thread is
    ignored.  */