MIPS/GDB/linux-nat.c: Fix a child detach condition for uClibc-ng

Message ID 20180703193753.GC2675@embecosm.com
State New, archived
Headers

Commit Message

Andrew Burgess July 3, 2018, 7:37 p.m. UTC
  * Sergey Korolev <s.korolev@ndmsystems.com> [2018-06-15 01:54:12 +0300]:

> Current implementation expects that WIFSTOPPED (W_STOPCODE (0)) is true,
> but in the uClibc-ng it is false on MIPS. The patch adds a "detach"
> helper variable to avoid this corner case: WIFSTOPPED applied
> only to a status filled by a waitpid call that should never return
> the status with zero stop signal.

I took a quick look through this patch, and have a little feedback.

You might want to expand your commit message to explain _why_ MIPS is
different (having a signal 127), I didn't know this, and it puzzled me
for a while as to why your above text didn't just indicate a bug in
uClibc-ng :)

> ---
>  gdb/linux-nat.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> index 445b59fa4a..916de2d335 100644
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -468,6 +468,7 @@ linux_nat_target::follow_fork (int follow_child, int
> detach_fork)
>        /* Detach new forked process?  */
>        if (detach_fork)
>   {
> +   int detach = 1;

There's already a detach_fork variable in this function, so maybe a
more descriptive name would make thing clearer.

>     struct cleanup *old_chain = make_cleanup (delete_lwp_cleanup,
>       child_lp);
> 
> @@ -492,9 +493,11 @@ linux_nat_target::follow_fork (int follow_child, int
> detach_fork)
>   perror_with_name (_("Couldn't do single step"));
>         if (my_waitpid (child_pid, &status, 0) < 0)
>   perror_with_name (_("Couldn't wait vfork process"));
> +       else
> + detach = WIFSTOPPED (status);
>       }
> 
> -   if (WIFSTOPPED (status))
> +   if (detach)
>       {
>         int signo;

I wonder if we should move the status into the scope of the call to
WIFSTOPPED, and avoid making any calls on it unless we know it has
been filled in.

Such a thing might look like this:


[PATCH] gdb: Avoid using W_STOPCODE(0) as this is ambiguous on MIPS

The MIPS target supports 127 signals, and this can create an ambiguity
in process wait statuses.  A status value of 0x007f could potentially
indicate a process that has exited with signal 127, or a process that
has stopped with signal 0.

In uClibc-ng the interpretation of 0x007f is that the process has
exited with signal 127 rather than stopped with signal 0, and so,
WIFSTOPPED (W_STOPCODE (0)) will be false rather than true as it would
be on most other platforms.

Given that it's pretty easy to avoid using W_STOPCODE (0), lets do that.

gdb/ChangeLog:

	* linux-nat.c (linux_nat_target::follow_fork): Avoid using
	'W_STOPCODE (0)' as this could be ambiguous.
---
 gdb/ChangeLog   |  6 ++++++
 gdb/linux-nat.c | 15 +++++++++++----
 2 files changed, 17 insertions(+), 4 deletions(-)
  

Comments

Sergey Korolev July 4, 2018, 1:35 p.m. UTC | #1
Andrew,

thanks for your feedback. I agree that your explanation and variable names
are more descriptive.
I tested your patch in my environment and did not see any problems.

On Tue, Jul 3, 2018 at 10:37 PM, Andrew Burgess <andrew.burgess@embecosm.com
> wrote:

> * Sergey Korolev <s.korolev@ndmsystems.com> [2018-06-15 01:54:12 +0300]:
>
> > Current implementation expects that WIFSTOPPED (W_STOPCODE (0)) is true,
> > but in the uClibc-ng it is false on MIPS. The patch adds a "detach"
> > helper variable to avoid this corner case: WIFSTOPPED applied
> > only to a status filled by a waitpid call that should never return
> > the status with zero stop signal.
>
> I took a quick look through this patch, and have a little feedback.
>
> You might want to expand your commit message to explain _why_ MIPS is
> different (having a signal 127), I didn't know this, and it puzzled me
> for a while as to why your above text didn't just indicate a bug in
> uClibc-ng :)
>
> > ---
> >  gdb/linux-nat.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> > index 445b59fa4a..916de2d335 100644
> > --- a/gdb/linux-nat.c
> > +++ b/gdb/linux-nat.c
> > @@ -468,6 +468,7 @@ linux_nat_target::follow_fork (int follow_child, int
> > detach_fork)
> >        /* Detach new forked process?  */
> >        if (detach_fork)
> >   {
> > +   int detach = 1;
>
> There's already a detach_fork variable in this function, so maybe a
> more descriptive name would make thing clearer.
>
> >     struct cleanup *old_chain = make_cleanup (delete_lwp_cleanup,
> >       child_lp);
> >
> > @@ -492,9 +493,11 @@ linux_nat_target::follow_fork (int follow_child, int
> > detach_fork)
> >   perror_with_name (_("Couldn't do single step"));
> >         if (my_waitpid (child_pid, &status, 0) < 0)
> >   perror_with_name (_("Couldn't wait vfork process"));
> > +       else
> > + detach = WIFSTOPPED (status);
> >       }
> >
> > -   if (WIFSTOPPED (status))
> > +   if (detach)
> >       {
> >         int signo;
>
> I wonder if we should move the status into the scope of the call to
> WIFSTOPPED, and avoid making any calls on it unless we know it has
> been filled in.
>
> Such a thing might look like this:
>
>
> [PATCH] gdb: Avoid using W_STOPCODE(0) as this is ambiguous on MIPS
>
> The MIPS target supports 127 signals, and this can create an ambiguity
> in process wait statuses.  A status value of 0x007f could potentially
> indicate a process that has exited with signal 127, or a process that
> has stopped with signal 0.
>
> In uClibc-ng the interpretation of 0x007f is that the process has
> exited with signal 127 rather than stopped with signal 0, and so,
> WIFSTOPPED (W_STOPCODE (0)) will be false rather than true as it would
> be on most other platforms.
>
> Given that it's pretty easy to avoid using W_STOPCODE (0), lets do that.
>
> gdb/ChangeLog:
>
>         * linux-nat.c (linux_nat_target::follow_fork): Avoid using
>         'W_STOPCODE (0)' as this could be ambiguous.
> ---
>  gdb/ChangeLog   |  6 ++++++
>  gdb/linux-nat.c | 15 +++++++++++----
>  2 files changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> index af38a2a1a5b..4de19770f42 100644
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -448,7 +448,6 @@ linux_nat_target::follow_fork (int follow_child, int
> detach_fork)
>    if (!follow_child)
>      {
>        struct lwp_info *child_lp = NULL;
> -      int status = W_STOPCODE (0);
>        int has_vforked;
>        ptid_t parent_ptid, child_ptid;
>        int parent_pid, child_pid;
> @@ -468,6 +467,8 @@ linux_nat_target::follow_fork (int follow_child, int
> detach_fork)
>        /* Detach new forked process?  */
>        if (detach_fork)
>         {
> +         int child_stop_signal = 0;
> +         bool detach_child = true;
>           struct cleanup *old_chain = make_cleanup (delete_lwp_cleanup,
>                                                     child_lp);
>
> @@ -487,18 +488,24 @@ linux_nat_target::follow_fork (int follow_child, int
> detach_fork)
>           if (!gdbarch_software_single_step_p (target_thread_architecture
>                                                (parent_ptid)))
>             {
> +             int status;
> +
>               linux_disable_event_reporting (child_pid);
>               if (ptrace (PTRACE_SINGLESTEP, child_pid, 0, 0) < 0)
>                 perror_with_name (_("Couldn't do single step"));
>               if (my_waitpid (child_pid, &status, 0) < 0)
>                 perror_with_name (_("Couldn't wait vfork process"));
> +             else
> +               {
> +                 detach_child = WIFSTOPPED (status);
> +                 child_stop_signal = WSTOPSIG (status);
> +               }
>             }
>
> -         if (WIFSTOPPED (status))
> +         if (detach_child)
>             {
> -             int signo;
> +             int signo = child_stop_signal;
>
> -             signo = WSTOPSIG (status);
>               if (signo != 0
>                   && !signal_pass_state (gdb_signal_from_host (signo)))
>                 signo = 0;
> --
> 2.14.4
>
>
  
Maciej W. Rozycki July 5, 2018, 12:32 p.m. UTC | #2
On Tue, 3 Jul 2018, Andrew Burgess wrote:

> > Current implementation expects that WIFSTOPPED (W_STOPCODE (0)) is true,
> > but in the uClibc-ng it is false on MIPS. The patch adds a "detach"
> > helper variable to avoid this corner case: WIFSTOPPED applied
> > only to a status filled by a waitpid call that should never return
> > the status with zero stop signal.
> 
> I took a quick look through this patch, and have a little feedback.
> 
> You might want to expand your commit message to explain _why_ MIPS is
> different (having a signal 127), I didn't know this, and it puzzled me
> for a while as to why your above text didn't just indicate a bug in
> uClibc-ng :)

 I think it is a bug in uClibc-ng after all, because you can't receive 
signal #0 on Linux.  This is what the kernel has (in kernel/signal.c):

		/*
		 * The null signal is a permissions and process existence
		 * probe.  No signal is actually delivered.
		 */
		if (!error && sig) {
			error = do_send_sig_info(sig, info, p, false);
			/*
			 * If lock_task_sighand() failed we pretend the task
			 * dies after receiving the signal. The window is tiny,
			 * and the signal is private anyway.
			 */
			if (unlikely(error == -ESRCH))
				error = 0;
		}

and also:

	if (!ret && sig)
		ret = do_send_sig_info(sig, info, p, true);

and documentation for kill(2) agrees:

      "If  sig  is 0, then no signal is sent, but error checking is still per-
       formed; this can be used to check for the existence of a process ID  or
       process group ID."

> [PATCH] gdb: Avoid using W_STOPCODE(0) as this is ambiguous on MIPS
> 
> The MIPS target supports 127 signals, and this can create an ambiguity
> in process wait statuses.  A status value of 0x007f could potentially
> indicate a process that has exited with signal 127, or a process that
> has stopped with signal 0.
> 
> In uClibc-ng the interpretation of 0x007f is that the process has
> exited with signal 127 rather than stopped with signal 0, and so,
> WIFSTOPPED (W_STOPCODE (0)) will be false rather than true as it would
> be on most other platforms.

 So there is no ambiguity, 0x007f means that the process has exited with 
signal 127 and nothing else, and while I agree the change might be a good 
workaround for uClibc-ng's oddity, I think uClibc-ng's implementation of 
WIFSTOPPED has also to be fixed to reflect reality.  Which also means the 
commit description needs to be updated accordingly.

 FWIW,

  Maciej
  
Sergey Korolev July 5, 2018, 9:04 p.m. UTC | #3
On Thu, Jul 5, 2018 at 3:32 PM, Maciej W. Rozycki <macro@mips.com> wrote:

> On Tue, 3 Jul 2018, Andrew Burgess wrote:
>
> > > Current implementation expects that WIFSTOPPED (W_STOPCODE (0)) is
> true,
> > > but in the uClibc-ng it is false on MIPS. The patch adds a "detach"
> > > helper variable to avoid this corner case: WIFSTOPPED applied
> > > only to a status filled by a waitpid call that should never return
> > > the status with zero stop signal.
> >
> > I took a quick look through this patch, and have a little feedback.
> >
> > You might want to expand your commit message to explain _why_ MIPS is
> > different (having a signal 127), I didn't know this, and it puzzled me
> > for a while as to why your above text didn't just indicate a bug in
> > uClibc-ng :)
>
>  I think it is a bug in uClibc-ng after all, because you can't receive
> signal #0 on Linux.  This is what the kernel has (in kernel/signal.c):
>
>                 /*
>                  * The null signal is a permissions and process existence
>                  * probe.  No signal is actually delivered.
>                  */
>                 if (!error && sig) {
>                         error = do_send_sig_info(sig, info, p, false);
>                         /*
>                          * If lock_task_sighand() failed we pretend the
> task
>                          * dies after receiving the signal. The window is
> tiny,
>                          * and the signal is private anyway.
>                          */
>                         if (unlikely(error == -ESRCH))
>                                 error = 0;
>                 }
>
> and also:
>
>         if (!ret && sig)
>                 ret = do_send_sig_info(sig, info, p, true);
>
> and documentation for kill(2) agrees:
>
>       "If  sig  is 0, then no signal is sent, but error checking is still
> per-
>        formed; this can be used to check for the existence of a process
> ID  or
>        process group ID."
>
> > [PATCH] gdb: Avoid using W_STOPCODE(0) as this is ambiguous on MIPS
> >
> > The MIPS target supports 127 signals, and this can create an ambiguity
> > in process wait statuses.  A status value of 0x007f could potentially
> > indicate a process that has exited with signal 127, or a process that
> > has stopped with signal 0.
> >
> > In uClibc-ng the interpretation of 0x007f is that the process has
> > exited with signal 127 rather than stopped with signal 0, and so,
> > WIFSTOPPED (W_STOPCODE (0)) will be false rather than true as it would
> > be on most other platforms.
>
>  So there is no ambiguity, 0x007f means that the process has exited with
> signal 127 and nothing else, and while I agree the change might be a good
> workaround for uClibc-ng's oddity, I think uClibc-ng's implementation of
> WIFSTOPPED has also to be fixed to reflect reality.  Which also means the
> commit description needs to be updated accordingly.
>

Here is an initial discussion with an explanation from uClibc-ng patch
author

https://mailman.uclibc-ng.org/pipermail/devel/2018-June/001695.html

Also I do not why but musl does not have W_STOPCODE macro at all.


>  FWIW,
>
>   Maciej
>
  
Andrew Burgess July 6, 2018, 10:09 p.m. UTC | #4
* Maciej W. Rozycki <macro@mips.com> [2018-07-05 13:32:06 +0100]:

> On Tue, 3 Jul 2018, Andrew Burgess wrote:
> 
> > > Current implementation expects that WIFSTOPPED (W_STOPCODE (0)) is true,
> > > but in the uClibc-ng it is false on MIPS. The patch adds a "detach"
> > > helper variable to avoid this corner case: WIFSTOPPED applied
> > > only to a status filled by a waitpid call that should never return
> > > the status with zero stop signal.
> > 
> > I took a quick look through this patch, and have a little feedback.
> > 
> > You might want to expand your commit message to explain _why_ MIPS is
> > different (having a signal 127), I didn't know this, and it puzzled me
> > for a while as to why your above text didn't just indicate a bug in
> > uClibc-ng :)
> 
>  I think it is a bug in uClibc-ng after all, because you can't receive 
> signal #0 on Linux.  This is what the kernel has (in kernel/signal.c):
> 
> 		/*
> 		 * The null signal is a permissions and process existence
> 		 * probe.  No signal is actually delivered.
> 		 */
> 		if (!error && sig) {
> 			error = do_send_sig_info(sig, info, p, false);
> 			/*
> 			 * If lock_task_sighand() failed we pretend the task
> 			 * dies after receiving the signal. The window is tiny,
> 			 * and the signal is private anyway.
> 			 */
> 			if (unlikely(error == -ESRCH))
> 				error = 0;
> 		}
> 
> and also:
> 
> 	if (!ret && sig)
> 		ret = do_send_sig_info(sig, info, p, true);
> 
> and documentation for kill(2) agrees:
> 
>       "If  sig  is 0, then no signal is sent, but error checking is still per-
>        formed; this can be used to check for the existence of a process ID  or
>        process group ID."
> 
> > [PATCH] gdb: Avoid using W_STOPCODE(0) as this is ambiguous on MIPS
> > 
> > The MIPS target supports 127 signals, and this can create an ambiguity
> > in process wait statuses.  A status value of 0x007f could potentially
> > indicate a process that has exited with signal 127, or a process that
> > has stopped with signal 0.
> > 
> > In uClibc-ng the interpretation of 0x007f is that the process has
> > exited with signal 127 rather than stopped with signal 0, and so,
> > WIFSTOPPED (W_STOPCODE (0)) will be false rather than true as it would
> > be on most other platforms.
> 
>  So there is no ambiguity, 0x007f means that the process has exited with 
> signal 127 and nothing else, and while I agree the change might be a good 
> workaround for uClibc-ng's oddity, I think uClibc-ng's implementation of 
> WIFSTOPPED has also to be fixed to reflect reality.  Which also means the 
> commit description needs to be updated accordingly.

OK.  I think we're mostly agreeing with each other, it's just the
part about uClibc-ng which I'd like to push on a bit more.

Lets consider a couple of cases, first signal 0x33.

  Status |
   Value | Meaning
  ------------------
    0033 | Killed by signal 0x33
    337f | Stopped by signal 0x33

Now, signal 127, or 0xf7:

    007f | Killed by signal 0x7f
    7f7f | Stopped by signal 0x7f

And if we _pretend_ for a moment, that signal 0x00 can be sent:

    0000 | Killed by signal 0x00
    007f | Stopped by signal 0x00

So the ambiguity arises if we can send both signal 0x7f AND signal
0x00.  But, as you point out the kernel prevents sending signal 0x00,
and for all targets except MIPS, signal 0x7f is also prevented.  Most
targets seems to have 32 or 64 signals.  MIPS has 128.

But this shouldn't be a problem, the kernel (as you point out)
prevents us sending signal 0x00.

The problem is that signal 0x00 isn't coming from the kernel, it's
being synthesised in GDB with 'W_STOPCODE (0)', which builds the
status 007f.

I might even be bold enough to declare that 'W_STOPCODE (0)' should be
declare a bad thing .... maybe ...

Still, I don't see how uClibc-ng is doing anything wrong, a status is
only a stop status if (a) it ends in 0x7f, and (b) the signal number
is non-zero, this seems pretty reasonable really.

Thanks,
Andrew



> 
>  FWIW,
> 
>   Maciej
  
Maciej W. Rozycki July 9, 2018, 11:20 a.m. UTC | #5
On Fri, 6 Jul 2018, Andrew Burgess wrote:

> The problem is that signal 0x00 isn't coming from the kernel, it's
> being synthesised in GDB with 'W_STOPCODE (0)', which builds the
> status 007f.

 Right, somehow I didn't fully realise that it is GDB being responsible 
for creating this status code artificially.  Sorry for the confusion.

> I might even be bold enough to declare that 'W_STOPCODE (0)' should be
> declare a bad thing .... maybe ...

 I think it would be reasonable for an implementation to require the 
argument to be a valid signal number and therefore users of this API 
would have to avoid passing other numbers as that would cause undefined 
behaviour.

> Still, I don't see how uClibc-ng is doing anything wrong, a status is
> only a stop status if (a) it ends in 0x7f, and (b) the signal number
> is non-zero, this seems pretty reasonable really.

 Agreed.  Garbage in, garbage out.

  Maciej
  
Andrew Burgess Aug. 1, 2018, 9:31 p.m. UTC | #6
Ping!

I would propose my version of the patch, but one of the two versions
should probably be merged.

Original patch:
    https://sourceware.org/ml/gdb-patches/2018-06/msg00382.html

My revised proposal:
    https://sourceware.org/ml/gdb-patches/2018-07/msg00061.html

Thanks,
Andrew

* Andrew Burgess <andrew.burgess@embecosm.com> [2018-07-03 20:37:53 +0100]:

> * Sergey Korolev <s.korolev@ndmsystems.com> [2018-06-15 01:54:12 +0300]:
> 
> > Current implementation expects that WIFSTOPPED (W_STOPCODE (0)) is true,
> > but in the uClibc-ng it is false on MIPS. The patch adds a "detach"
> > helper variable to avoid this corner case: WIFSTOPPED applied
> > only to a status filled by a waitpid call that should never return
> > the status with zero stop signal.
> 
> I took a quick look through this patch, and have a little feedback.
> 
> You might want to expand your commit message to explain _why_ MIPS is
> different (having a signal 127), I didn't know this, and it puzzled me
> for a while as to why your above text didn't just indicate a bug in
> uClibc-ng :)
> 
> > ---
> >  gdb/linux-nat.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> > index 445b59fa4a..916de2d335 100644
> > --- a/gdb/linux-nat.c
> > +++ b/gdb/linux-nat.c
> > @@ -468,6 +468,7 @@ linux_nat_target::follow_fork (int follow_child, int
> > detach_fork)
> >        /* Detach new forked process?  */
> >        if (detach_fork)
> >   {
> > +   int detach = 1;
> 
> There's already a detach_fork variable in this function, so maybe a
> more descriptive name would make thing clearer.
> 
> >     struct cleanup *old_chain = make_cleanup (delete_lwp_cleanup,
> >       child_lp);
> > 
> > @@ -492,9 +493,11 @@ linux_nat_target::follow_fork (int follow_child, int
> > detach_fork)
> >   perror_with_name (_("Couldn't do single step"));
> >         if (my_waitpid (child_pid, &status, 0) < 0)
> >   perror_with_name (_("Couldn't wait vfork process"));
> > +       else
> > + detach = WIFSTOPPED (status);
> >       }
> > 
> > -   if (WIFSTOPPED (status))
> > +   if (detach)
> >       {
> >         int signo;
> 
> I wonder if we should move the status into the scope of the call to
> WIFSTOPPED, and avoid making any calls on it unless we know it has
> been filled in.
> 
> Such a thing might look like this:
> 
> 
> [PATCH] gdb: Avoid using W_STOPCODE(0) as this is ambiguous on MIPS
> 
> The MIPS target supports 127 signals, and this can create an ambiguity
> in process wait statuses.  A status value of 0x007f could potentially
> indicate a process that has exited with signal 127, or a process that
> has stopped with signal 0.
> 
> In uClibc-ng the interpretation of 0x007f is that the process has
> exited with signal 127 rather than stopped with signal 0, and so,
> WIFSTOPPED (W_STOPCODE (0)) will be false rather than true as it would
> be on most other platforms.
> 
> Given that it's pretty easy to avoid using W_STOPCODE (0), lets do that.
> 
> gdb/ChangeLog:
> 
> 	* linux-nat.c (linux_nat_target::follow_fork): Avoid using
> 	'W_STOPCODE (0)' as this could be ambiguous.
> ---
>  gdb/ChangeLog   |  6 ++++++
>  gdb/linux-nat.c | 15 +++++++++++----
>  2 files changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> index af38a2a1a5b..4de19770f42 100644
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -448,7 +448,6 @@ linux_nat_target::follow_fork (int follow_child, int detach_fork)
>    if (!follow_child)
>      {
>        struct lwp_info *child_lp = NULL;
> -      int status = W_STOPCODE (0);
>        int has_vforked;
>        ptid_t parent_ptid, child_ptid;
>        int parent_pid, child_pid;
> @@ -468,6 +467,8 @@ linux_nat_target::follow_fork (int follow_child, int detach_fork)
>        /* Detach new forked process?  */
>        if (detach_fork)
>  	{
> +	  int child_stop_signal = 0;
> +	  bool detach_child = true;
>  	  struct cleanup *old_chain = make_cleanup (delete_lwp_cleanup,
>  						    child_lp);
>  
> @@ -487,18 +488,24 @@ linux_nat_target::follow_fork (int follow_child, int detach_fork)
>  	  if (!gdbarch_software_single_step_p (target_thread_architecture
>  					       (parent_ptid)))
>  	    {
> +	      int status;
> +
>  	      linux_disable_event_reporting (child_pid);
>  	      if (ptrace (PTRACE_SINGLESTEP, child_pid, 0, 0) < 0)
>  		perror_with_name (_("Couldn't do single step"));
>  	      if (my_waitpid (child_pid, &status, 0) < 0)
>  		perror_with_name (_("Couldn't wait vfork process"));
> +	      else
> +		{
> +		  detach_child = WIFSTOPPED (status);
> +		  child_stop_signal = WSTOPSIG (status);
> +		}
>  	    }
>  
> -	  if (WIFSTOPPED (status))
> +	  if (detach_child)
>  	    {
> -	      int signo;
> +	      int signo = child_stop_signal;
>  
> -	      signo = WSTOPSIG (status);
>  	      if (signo != 0
>  		  && !signal_pass_state (gdb_signal_from_host (signo)))
>  		signo = 0;
> -- 
> 2.14.4
>
  
Sergey Korolev Aug. 1, 2018, 9:37 p.m. UTC | #7
Andrew,

I would personally prefer your revised version due to more detailed
description.

On Thu, Aug 2, 2018 at 12:31 AM, Andrew Burgess <andrew.burgess@embecosm.com
> wrote:

> Ping!
>
> I would propose my version of the patch, but one of the two versions
> should probably be merged.
>
> Original patch:
>     https://sourceware.org/ml/gdb-patches/2018-06/msg00382.html
>
> My revised proposal:
>     https://sourceware.org/ml/gdb-patches/2018-07/msg00061.html
>
> Thanks,
> Andrew
>
> * Andrew Burgess <andrew.burgess@embecosm.com> [2018-07-03 20:37:53
> +0100]:
>
> > * Sergey Korolev <s.korolev@ndmsystems.com> [2018-06-15 01:54:12 +0300]:
> >
> > > Current implementation expects that WIFSTOPPED (W_STOPCODE (0)) is
> true,
> > > but in the uClibc-ng it is false on MIPS. The patch adds a "detach"
> > > helper variable to avoid this corner case: WIFSTOPPED applied
> > > only to a status filled by a waitpid call that should never return
> > > the status with zero stop signal.
> >
> > I took a quick look through this patch, and have a little feedback.
> >
> > You might want to expand your commit message to explain _why_ MIPS is
> > different (having a signal 127), I didn't know this, and it puzzled me
> > for a while as to why your above text didn't just indicate a bug in
> > uClibc-ng :)
> >
> > > ---
> > >  gdb/linux-nat.c | 5 ++++-
> > >  1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> > > index 445b59fa4a..916de2d335 100644
> > > --- a/gdb/linux-nat.c
> > > +++ b/gdb/linux-nat.c
> > > @@ -468,6 +468,7 @@ linux_nat_target::follow_fork (int follow_child,
> int
> > > detach_fork)
> > >        /* Detach new forked process?  */
> > >        if (detach_fork)
> > >   {
> > > +   int detach = 1;
> >
> > There's already a detach_fork variable in this function, so maybe a
> > more descriptive name would make thing clearer.
> >
> > >     struct cleanup *old_chain = make_cleanup (delete_lwp_cleanup,
> > >       child_lp);
> > >
> > > @@ -492,9 +493,11 @@ linux_nat_target::follow_fork (int follow_child,
> int
> > > detach_fork)
> > >   perror_with_name (_("Couldn't do single step"));
> > >         if (my_waitpid (child_pid, &status, 0) < 0)
> > >   perror_with_name (_("Couldn't wait vfork process"));
> > > +       else
> > > + detach = WIFSTOPPED (status);
> > >       }
> > >
> > > -   if (WIFSTOPPED (status))
> > > +   if (detach)
> > >       {
> > >         int signo;
> >
> > I wonder if we should move the status into the scope of the call to
> > WIFSTOPPED, and avoid making any calls on it unless we know it has
> > been filled in.
> >
> > Such a thing might look like this:
> >
> >
> > [PATCH] gdb: Avoid using W_STOPCODE(0) as this is ambiguous on MIPS
> >
> > The MIPS target supports 127 signals, and this can create an ambiguity
> > in process wait statuses.  A status value of 0x007f could potentially
> > indicate a process that has exited with signal 127, or a process that
> > has stopped with signal 0.
> >
> > In uClibc-ng the interpretation of 0x007f is that the process has
> > exited with signal 127 rather than stopped with signal 0, and so,
> > WIFSTOPPED (W_STOPCODE (0)) will be false rather than true as it would
> > be on most other platforms.
> >
> > Given that it's pretty easy to avoid using W_STOPCODE (0), lets do that.
> >
> > gdb/ChangeLog:
> >
> >       * linux-nat.c (linux_nat_target::follow_fork): Avoid using
> >       'W_STOPCODE (0)' as this could be ambiguous.
> > ---
> >  gdb/ChangeLog   |  6 ++++++
> >  gdb/linux-nat.c | 15 +++++++++++----
> >  2 files changed, 17 insertions(+), 4 deletions(-)
> >
> > diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> > index af38a2a1a5b..4de19770f42 100644
> > --- a/gdb/linux-nat.c
> > +++ b/gdb/linux-nat.c
> > @@ -448,7 +448,6 @@ linux_nat_target::follow_fork (int follow_child, int
> detach_fork)
> >    if (!follow_child)
> >      {
> >        struct lwp_info *child_lp = NULL;
> > -      int status = W_STOPCODE (0);
> >        int has_vforked;
> >        ptid_t parent_ptid, child_ptid;
> >        int parent_pid, child_pid;
> > @@ -468,6 +467,8 @@ linux_nat_target::follow_fork (int follow_child, int
> detach_fork)
> >        /* Detach new forked process?  */
> >        if (detach_fork)
> >       {
> > +       int child_stop_signal = 0;
> > +       bool detach_child = true;
> >         struct cleanup *old_chain = make_cleanup (delete_lwp_cleanup,
> >                                                   child_lp);
> >
> > @@ -487,18 +488,24 @@ linux_nat_target::follow_fork (int follow_child,
> int detach_fork)
> >         if (!gdbarch_software_single_step_p (target_thread_architecture
> >                                              (parent_ptid)))
> >           {
> > +           int status;
> > +
> >             linux_disable_event_reporting (child_pid);
> >             if (ptrace (PTRACE_SINGLESTEP, child_pid, 0, 0) < 0)
> >               perror_with_name (_("Couldn't do single step"));
> >             if (my_waitpid (child_pid, &status, 0) < 0)
> >               perror_with_name (_("Couldn't wait vfork process"));
> > +           else
> > +             {
> > +               detach_child = WIFSTOPPED (status);
> > +               child_stop_signal = WSTOPSIG (status);
> > +             }
> >           }
> >
> > -       if (WIFSTOPPED (status))
> > +       if (detach_child)
> >           {
> > -           int signo;
> > +           int signo = child_stop_signal;
> >
> > -           signo = WSTOPSIG (status);
> >             if (signo != 0
> >                 && !signal_pass_state (gdb_signal_from_host (signo)))
> >               signo = 0;
> > --
> > 2.14.4
> >
>
  
Tom Tromey Aug. 2, 2018, 8:03 p.m. UTC | #8
>>>>> "Sergey" == Sergey Korolev <s.korolev@ndmsystems.com> writes:

Andre> I would propose my version of the patch, but one of the two versions
Andrew> should probably be merged.

Sergey> I would personally prefer your revised version due to more detailed
Sergey> description.

I agree with both of you; the revised patch is ok.  Thank you both.

Tom
  

Patch

diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index af38a2a1a5b..4de19770f42 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -448,7 +448,6 @@  linux_nat_target::follow_fork (int follow_child, int detach_fork)
   if (!follow_child)
     {
       struct lwp_info *child_lp = NULL;
-      int status = W_STOPCODE (0);
       int has_vforked;
       ptid_t parent_ptid, child_ptid;
       int parent_pid, child_pid;
@@ -468,6 +467,8 @@  linux_nat_target::follow_fork (int follow_child, int detach_fork)
       /* Detach new forked process?  */
       if (detach_fork)
 	{
+	  int child_stop_signal = 0;
+	  bool detach_child = true;
 	  struct cleanup *old_chain = make_cleanup (delete_lwp_cleanup,
 						    child_lp);
 
@@ -487,18 +488,24 @@  linux_nat_target::follow_fork (int follow_child, int detach_fork)
 	  if (!gdbarch_software_single_step_p (target_thread_architecture
 					       (parent_ptid)))
 	    {
+	      int status;
+
 	      linux_disable_event_reporting (child_pid);
 	      if (ptrace (PTRACE_SINGLESTEP, child_pid, 0, 0) < 0)
 		perror_with_name (_("Couldn't do single step"));
 	      if (my_waitpid (child_pid, &status, 0) < 0)
 		perror_with_name (_("Couldn't wait vfork process"));
+	      else
+		{
+		  detach_child = WIFSTOPPED (status);
+		  child_stop_signal = WSTOPSIG (status);
+		}
 	    }
 
-	  if (WIFSTOPPED (status))
+	  if (detach_child)
 	    {
-	      int signo;
+	      int signo = child_stop_signal;
 
-	      signo = WSTOPSIG (status);
 	      if (signo != 0
 		  && !signal_pass_state (gdb_signal_from_host (signo)))
 		signo = 0;