[Regression] Segfault on native-extended-gdbserver + fork (was: Re: [PATCH v2 3/3] Make linux_nat_detach/thread_db_detach use the inferior parameter)

Message ID 87efmaebo3.fsf_-_@redhat.com
State New, archived
Headers

Commit Message

Sergio Durigan Junior Jan. 28, 2018, 6:32 a.m. UTC
  On Friday, January 19 2018, Simon Marchi wrote:

> From: Simon Marchi <simon.marchi@ericsson.com>
>
> No changes in v2.
>
> This patch makes these two functions actually use the inferior parameter
> added by the previous patch, instead of reading inferior_ptid.  I chose
> these two, because they are the one actually used when I detach on my
> GNU/Linux system, so they were easy to test.
>
> I took the opportunity to pass the inferior being detached to
> inf_ptrace_detach_success, so it could use it too.  From there, it made
> sense to add an overload of detach_inferior that takes the inferior
> directly rather than the pid, to avoid having to pass inf->pid only for
> the callee to look up the inferior structure by pid.

Hey Simon,

While working on something else, I noticed a regression introduced by
this patch.  Consider the following example program:

  #include <unistd.h>

  int
  main (int argc, char *argv[])
  {
    fork ();

    return 0;
  }

When running it under gdbserver:

  # ./gdb/gdbserver/gdbserver --multi --once :2345

And debugging it under GDB:

  # ./gdb/gdb -q -batch -ex 'set remote exec-file ./a.out' -ex 'tar extended-remote :2345' -ex r ./a.out
  Starting program:
  ...
  [Detaching after fork from child process 16102.]
  Segmentation fault (core dumped)

The problem happens on inferior.c:detach_inferior:

  void
  detach_inferior (inferior *inf)
  {
    /* Save the pid, since exit_inferior_1 will reset it.  */
    int pid = inf->pid;
              ^^^^^^^^^

    exit_inferior_1 (inf, 0);

    if (print_inferior_events)
      printf_unfiltered (_("[Inferior %d detached]\n"), pid);
  }

When this code is called from remote.c:remote_follow_fork, the PID is
valid but there is not 'inferior' associated with it, which means that
'inf == NULL'.

I've been thinking about the proper fix to this, and arrived at the
patch attached (without a ChangeLog entry; will add that if the patch
seems OK for you).  Since we will still want to print inferior events
even if 'inf == NULL', I've duplicated the print on the "detach_inferior
(int pid)" version.  Other than that, the patch is basically restoring
the old behaviour of just skipping the detach procedure if there's no
inferior object.

I'm running a regression test on BuildBot to make sure no regressions
are introduced.  I was going to write a testcase to exercise this
scenario, but we already have one, gdb.base/foll-vfork.exp.  The
failures were marked as ERROR's by dejagnu, which may explain why they
were missed...?  Not sure.  Oh, and this regression is not present in
the 8.1 branch.

WDYT?
  

Comments

Simon Marchi Jan. 28, 2018, 4:50 p.m. UTC | #1
On 2018-01-28 01:32, Sergio Durigan Junior wrote:
> Hey Simon,
> 
> While working on something else, I noticed a regression introduced by
> this patch.

Hi Sergio,

Thanks for reporting and fixing this!

> Consider the following example program:
> 
>   #include <unistd.h>
> 
>   int
>   main (int argc, char *argv[])
>   {
>     fork ();
> 
>     return 0;
>   }
> 
> When running it under gdbserver:
> 
>   # ./gdb/gdbserver/gdbserver --multi --once :2345
> 
> And debugging it under GDB:
> 
>   # ./gdb/gdb -q -batch -ex 'set remote exec-file ./a.out' -ex 'tar
> extended-remote :2345' -ex r ./a.out
>   Starting program:
>   ...
>   [Detaching after fork from child process 16102.]
>   Segmentation fault (core dumped)
> 
> The problem happens on inferior.c:detach_inferior:
> 
>   void
>   detach_inferior (inferior *inf)
>   {
>     /* Save the pid, since exit_inferior_1 will reset it.  */
>     int pid = inf->pid;
>               ^^^^^^^^^
> 
>     exit_inferior_1 (inf, 0);
> 
>     if (print_inferior_events)
>       printf_unfiltered (_("[Inferior %d detached]\n"), pid);
>   }
> 
> When this code is called from remote.c:remote_follow_fork, the PID is
> valid but there is not 'inferior' associated with it, which means that
> 'inf == NULL'.

Ah yeah, I hadn't thought about that.

> I've been thinking about the proper fix to this, and arrived at the
> patch attached (without a ChangeLog entry; will add that if the patch
> seems OK for you).  Since we will still want to print inferior events
> even if 'inf == NULL', I've duplicated the print on the 
> "detach_inferior
> (int pid)" version.  Other than that, the patch is basically restoring
> the old behaviour of just skipping the detach procedure if there's no
> inferior object.

I'm fine with this, but I was curious about what happens in Pedro's 
multi-target branch.  I remember he said that the detach_inferior(int) 
version disappears in that branch, though I can't find where he said 
that.  But looking at the branch I can see it's indeed the case:

   
https://github.com/palves/gdb/blob/palves/multi-target/gdb/inferior.c#L250

So I was wondering what remote_follow_fork calls in that case, since it 
can't call the detach_inferior(inferior *) version without an inferior.  
Apparently it calls a new remote_detach_pid function:

   
https://github.com/palves/gdb/blob/palves/multi-target/gdb/remote.c#L5859

This means (I just tried it) that it won't show the "[Inferior %d 
detached]\n" message in that case.  So what I would suggest is putting

   if (print_inferior_events)
     printf_unfiltered (_("[Inferior %d detached]\n"), pid);

in its own function, called by both versions of detach_inferior for now 
(bonus, it de-duplicates the printing of the message).  In the 
multi-target branch, remote_target::follow_fork (renamed from 
remote_follow_fork) can call this function in the case where we don't 
have an inferior object.

> I'm running a regression test on BuildBot to make sure no regressions
> are introduced.  I was going to write a testcase to exercise this
> scenario, but we already have one, gdb.base/foll-vfork.exp.  The
> failures were marked as ERROR's by dejagnu, which may explain why they
> were missed...?  Not sure.  Oh, and this regression is not present in
> the 8.1 branch.
> 
> WDYT?

That's fine with me with the printing of the message in its own 
function, as explained above.

> --
> Sergio
> GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
> Please send encrypted e-mail if possible
> http://sergiodj.net/
> 
> diff --git a/gdb/inferior.c b/gdb/inferior.c
> index 38b7369275..94432a92b1 100644
> --- a/gdb/inferior.c
> +++ b/gdb/inferior.c
> @@ -272,7 +272,15 @@ detach_inferior (inferior *inf)
>  void
>  detach_inferior (int pid)
>  {
> -  detach_inferior (find_inferior_pid (pid));
> +  inferior *inf = find_inferior_pid (pid);
> +
> +  if (inf != NULL)
> +    detach_inferior (inf);
> +  else
> +    {
> +      if (print_inferior_events)
> +	printf_unfiltered (_("[Inferior %d detached]\n"), pid);
> +    }
>  }
> 
>  void

Thanks,

Simon
  
Pedro Alves Jan. 29, 2018, 4 p.m. UTC | #2
On 01/28/2018 04:50 PM, Simon Marchi wrote:
> On 2018-01-28 01:32, Sergio Durigan Junior wrote:

> I'm fine with this, but I was curious about what happens in Pedro's multi-target branch.  I remember he said that the detach_inferior(int) version disappears in that branch, though I can't find where he said that.  But looking at the branch I can see it's indeed the case:
> 
>   https://github.com/palves/gdb/blob/palves/multi-target/gdb/inferior.c#L250
> 
> So I was wondering what remote_follow_fork calls in that case, since it can't call the detach_inferior(inferior *) version without an inferior.  Apparently it calls a new remote_detach_pid function:
> 
>   https://github.com/palves/gdb/blob/palves/multi-target/gdb/remote.c#L5859

remote_detach_pid is not new.  It exists in master.  What that url shows
is that I commented out the detach_inferior call in the branch.

Because in this case, we'd detaching a remote process that the core
of gdb never learned about.

> 
> This means (I just tried it) that it won't show the "[Inferior %d detached]\n" message in that case.  So what I would suggest is putting
> 
>   if (print_inferior_events)
>     printf_unfiltered (_("[Inferior %d detached]\n"), pid);
> 
> in its own function, called by both versions of detach_inferior for now (bonus, it de-duplicates the printing of the message).  In the multi-target branch, remote_target::follow_fork (renamed from remote_follow_fork) can call this function in the case where we don't have an inferior object.

But why would we want to print that?  We will have already printed

  "Detaching after fork from child process PID."

from the common code.  When native debugging, in this scenario,
we don't call detach_inferior either, right?  Can't see why
we'd want to call it for remote.

I think we should just remove that detach_inferior call, like in
the multi-target branch.

Thanks,
Pedro Alves
  
Simon Marchi Jan. 29, 2018, 4:25 p.m. UTC | #3
On 2018-01-29 11:00, Pedro Alves wrote:
> On 01/28/2018 04:50 PM, Simon Marchi wrote:
>> On 2018-01-28 01:32, Sergio Durigan Junior wrote:
> 
>> I'm fine with this, but I was curious about what happens in Pedro's 
>> multi-target branch.  I remember he said that the detach_inferior(int) 
>> version disappears in that branch, though I can't find where he said 
>> that.  But looking at the branch I can see it's indeed the case:
>> 
>>   
>> https://github.com/palves/gdb/blob/palves/multi-target/gdb/inferior.c#L250
>> 
>> So I was wondering what remote_follow_fork calls in that case, since 
>> it can't call the detach_inferior(inferior *) version without an 
>> inferior.  Apparently it calls a new remote_detach_pid function:
>> 
>>   
>> https://github.com/palves/gdb/blob/palves/multi-target/gdb/remote.c#L5859
> 
> remote_detach_pid is not new.  It exists in master.  What that url 
> shows
> is that I commented out the detach_inferior call in the branch.
> 
> Because in this case, we'd detaching a remote process that the core
> of gdb never learned about.

Oops I read that wrong.

>> 
>> This means (I just tried it) that it won't show the "[Inferior %d 
>> detached]\n" message in that case.  So what I would suggest is putting
>> 
>>   if (print_inferior_events)
>>     printf_unfiltered (_("[Inferior %d detached]\n"), pid);
>> 
>> in its own function, called by both versions of detach_inferior for 
>> now (bonus, it de-duplicates the printing of the message).  In the 
>> multi-target branch, remote_target::follow_fork (renamed from 
>> remote_follow_fork) can call this function in the case where we don't 
>> have an inferior object.
> 
> But why would we want to print that?  We will have already printed
> 
>   "Detaching after fork from child process PID."
> 
> from the common code.  When native debugging, in this scenario,
> we don't call detach_inferior either, right?  Can't see why
> we'd want to call it for remote.

It's true that it's a bit of a lie to say "[Inferior PID detached]" if 
there never actually was an inferior for that PID.  Since we never print 
"[Inferior PID detached]" on native in that case, I am fine with 
removing the call from remote.c.  Sergio, that would fix the crash you 
found I think?

Simon
  
Sergio Durigan Junior Jan. 29, 2018, 5:24 p.m. UTC | #4
On Monday, January 29 2018, Simon Marchi wrote:

> On 2018-01-29 11:00, Pedro Alves wrote:
>> On 01/28/2018 04:50 PM, Simon Marchi wrote:
>>> On 2018-01-28 01:32, Sergio Durigan Junior wrote:
>>> This means (I just tried it) that it won't show the "[Inferior %d
>>> detached]\n" message in that case.  So what I would suggest is
>>> putting
>>>
>>>   if (print_inferior_events)
>>>     printf_unfiltered (_("[Inferior %d detached]\n"), pid);
>>>
>>> in its own function, called by both versions of detach_inferior for
>>> now (bonus, it de-duplicates the printing of the message).  In the
>>> multi-target branch, remote_target::follow_fork (renamed from
>>> remote_follow_fork) can call this function in the case where we
>>> don't have an inferior object.
>>
>> But why would we want to print that?  We will have already printed
>>
>>   "Detaching after fork from child process PID."
>>
>> from the common code.  When native debugging, in this scenario,
>> we don't call detach_inferior either, right?  Can't see why
>> we'd want to call it for remote.
>
> It's true that it's a bit of a lie to say "[Inferior PID detached]" if
> there never actually was an inferior for that PID.  Since we never
> print "[Inferior PID detached]" on native in that case, I am fine with
> removing the call from remote.c.  Sergio, that would fix the crash you
> found I think?

I was also unsure about printing the message in this case, because
there's no real detach happening.  I'm fine with not printing it.  And
yes, removing the call to "detach_inferior" also fixes the problem.

I'll prepare a patch.
  

Patch

diff --git a/gdb/inferior.c b/gdb/inferior.c
index 38b7369275..94432a92b1 100644
--- a/gdb/inferior.c
+++ b/gdb/inferior.c
@@ -272,7 +272,15 @@  detach_inferior (inferior *inf)
 void
 detach_inferior (int pid)
 {
-  detach_inferior (find_inferior_pid (pid));
+  inferior *inf = find_inferior_pid (pid);
+
+  if (inf != NULL)
+    detach_inferior (inf);
+  else
+    {
+      if (print_inferior_events)
+	printf_unfiltered (_("[Inferior %d detached]\n"), pid);
+    }
 }
 
 void