ppc64* native-gdbserver testsuite hangs on gdb.threads/process-dies-while-handling-bp.exp

Message ID 86o9mu8v32.fsf@gmail.com
State New, archived

Commit Message

Yao Qi Dec. 19, 2017, 5:51 p.m. UTC
  Yao Qi <qiyaoltc@gmail.com> writes:

> It is a known issue in GDB, described in PR 18749, rather than a
> regression, so git bisect can't tell you anything useful.  Pedro fixed
> one GDBserver crash caused by 18749, but PR 18749 is still
> there.
>
> I can reproduce the timeout on gcc110,
>
> KFAIL: gdb.threads/process-dies-while-handling-bp.exp: non_stop=on:
> cond_bp_target=0: inferior 1 exited (timeout) (PRMS: gdb/18749)
> Remote debugging from host 127.0.0.1^M
> gdbserver: reading register 10: No such process^M
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Killing process(es): 15614^M
> monitor exit^M
> Ignoring packet error, continuing...
>
> I think some code path in GDBserver doesn't expect this error.

Does this patch work for you?  With this patch,
process-dies-while-handling-bp.exp no longer times out on gcc110 with
native-gdbserver.
  

Comments

Edjunior Machado Dec. 20, 2017, 3:23 p.m. UTC | #1
Hi Yao,

Thanks a lot for the patch! I tested the fix on Fedora 26 and confirmed
that native-gdbserver no longer hangs on
gdb.threads/process-dies-while-handling-bp.exp on both ppc64le and
big-endian ppc64.

Unfortunately, I couldn't regression-test with the complete testsuite due
to another hang I'm still checking (likely caused by hw watchpoints not
working properly on such boxes).

Thanks and regards,
Edjunior

On Tue, Dec 19, 2017 at 6:51 PM, Yao Qi <qiyaoltc@gmail.com> wrote:

> Yao Qi <qiyaoltc@gmail.com> writes:
>
> > It is a known issue in GDB, described in PR 18749, rather than a
> > regression, so git bisect can't tell you anything useful.  Pedro fixed
> > one GDBserver crash caused by 18749, but PR 18749 is still
> > there.
> >
> > I can reproduce the timeout on gcc110,
> >
> > KFAIL: gdb.threads/process-dies-while-handling-bp.exp: non_stop=on:
> > cond_bp_target=0: inferior 1 exited (timeout) (PRMS: gdb/18749)
> > Remote debugging from host 127.0.0.1^M
> > gdbserver: reading register 10: No such process^M
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > Killing process(es): 15614^M
> > monitor exit^M
> > Ignoring packet error, continuing...
> >
> > I think some code path in GDBserver doesn't expect this error.
>
> Does this patch work for you?  With this patch,
> process-dies-while-handling-bp.exp no longer times out on gcc110 with
> native-gdbserver.
>
> --
> Yao (齐尧)
> From ef027cdb366e3000dee2c713abab4f30123d9ed1 Mon Sep 17 00:00:00 2001
> From: Yao Qi <yao.qi@linaro.org>
> Date: Tue, 19 Dec 2017 17:44:11 +0000
> Subject: [PATCH] Mark register unavailable when PTRACE_PEEKUSER fails
>
> As described in PR 18749, GDB/GDBserver may get an error on accessing
> memory or register because the thread may disappear.  However, some
> path doesn't expect the error.  This patch fixes this problem by
> marking the register unavailable when PTRACE_PEEKUSER fails instead
> of throwing error.
>
> gdb/gdbserver:
>
> 2017-12-19  Yao Qi  <yao.qi@linaro.org>
>
>         PR gdb/18749
>         * linux-low.c (fetch_register): Call supply_register instead of
>         error.
>
> diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
> index f6a52d5..0a52a91 100644
> --- a/gdb/gdbserver/linux-low.c
> +++ b/gdb/gdbserver/linux-low.c
> @@ -5555,7 +5555,11 @@ fetch_register (const struct usrregs_info *usrregs,
>                 (PTRACE_TYPE_ARG3) (uintptr_t) regaddr, (PTRACE_TYPE_ARG4) 0);
>        regaddr += sizeof (PTRACE_XFER_TYPE);
>        if (errno != 0)
> -       error ("reading register %d: %s", regno, strerror (errno));
> +       {
> +         /* Mark register REGNO unavailable.  */
> +         supply_register (regcache, regno, NULL);
> +         return;
> +       }
>      }
>
>    if (the_low_target.supply_ptrace_register)
>
  
Yao Qi Dec. 20, 2017, 4:23 p.m. UTC | #2
On Wed, Dec 20, 2017 at 3:23 PM, Edjunior Machado <edjunior@gmail.com> wrote:
> Thanks a lot for the patch! I tested the fix on Fedora 26 and confirmed
> that native-gdbserver no longer hangs on
> gdb.threads/process-dies-while-handling-bp.exp on both ppc64le and
> big-endian ppc64.
>
> Unfortunately, I couldn't regression-test with the complete testsuite due
> to another hang I'm still checking (likely caused by hw watchpoints not
> working properly on such boxes).
>

Is it gdb.base/watchpoints.exp?  My regression test run on gcc110
is very slow, blocked by gdb.base/watchpoints.exp.  The test needs two
HW watchpoints, but ppc only has one HW watchpoint register.  GDB
native can do resource counting, so it is smart enough to switch to
SW watchpoints if there are not enough HW watchpoint registers.  However,
GDB remote can *not* do resource counting, so GDB fails to insert
watchpoints, and all the following tests time out.
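
To illustrate what "resource counting" means here, the following is a
minimal sketch in C and not GDB's actual code.  The names wp_budget and
choose_watchpoint_kind are hypothetical.  It assumes the debugger knows
how many HW watchpoint registers the target has, which is exactly the
knowledge the remote case lacks.

```c
#include <assert.h>
#include <stddef.h>

/* Kind of watchpoint the debugger ends up inserting.  */
enum wp_kind { WP_HARDWARE, WP_SOFTWARE };

/* Hypothetical bookkeeping: how many HW watchpoint registers the
   target has, and how many are already in use.  */
struct wp_budget
{
  int hw_total;
  int hw_used;
};

/* Pick a hardware watchpoint while registers remain, otherwise fall
   back to a (slow, single-stepping) software watchpoint instead of
   failing the insertion.  */
static enum wp_kind
choose_watchpoint_kind (struct wp_budget *budget)
{
  assert (budget != NULL);

  if (budget->hw_used < budget->hw_total)
    {
      budget->hw_used++;
      return WP_HARDWARE;
    }
  return WP_SOFTWARE;
}
```

With hw_total set to 1, as on the ppc machine above, the first request
gets a hardware watchpoint and the second falls back to software.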
  
Yao Qi Dec. 20, 2017, 9:35 p.m. UTC | #3
On Wed, Dec 20, 2017 at 4:23 PM, Yao Qi <qiyaoltc@gmail.com> wrote:
> Is it gdb.base/watchpoints.exp?  My regression test run on gcc110
> is very slow, blocked by gdb.base/watchpoints.exp.  The test needs two
> HW watchpoints, but ppc only has one HW watchpoint register.  GDB
> native can do resource counting, so it is smart enough to switch to
> SW watchpoints if there are not enough HW watchpoint registers.  However,
> GDB remote can *not* do resource counting, so GDB fails to insert
> watchpoints, and all the following tests time out.
>

It looks like ppc64-linux gdbserver doesn't support watchpoints, so we
need to improve the test cases to probe for hw watchpoint support.  See
the test "set probe hw watchpoint" in watchpoint-stops-at-right-insn.exp.
  
Edjunior Machado Dec. 21, 2017, 4:01 p.m. UTC | #4
Hi Yao,

On Wed, Dec 20, 2017 at 10:35 PM, Yao Qi <qiyaoltc@gmail.com> wrote:

> On Wed, Dec 20, 2017 at 4:23 PM, Yao Qi <qiyaoltc@gmail.com> wrote:
> > Is it gdb.base/watchpoints.exp?  My regression test run on gcc110
> > is very slow, blocked by gdb.base/watchpoints.exp.  The test needs two
> > HW watchpoints, but ppc only has one HW watchpoint register.  GDB
> > native can do resource counting, so it is smart enough to switch to
> > SW watchpoints if there are not enough HW watchpoint registers.  However,
> > GDB remote can *not* do resource counting, so GDB fails to insert
> > watchpoints, and all the following tests time out.
> >
>

Exactly, gdb.base/watchpoints.exp is the hanging testcase now.


>
> It looks like ppc64-linux gdbserver doesn't support watchpoints, so we
> need to improve the test cases to probe for hw watchpoint support.  See
> the test "set probe hw watchpoint" in watchpoint-stops-at-right-insn.exp.
>

Agreed, ppc64[le] gdbserver currently does not implement hw watchpoint
support.  However, despite the high number of failures in the mentioned
test (and possibly others) because of that, git bisect shows that the
native-gdbserver test run did not hang until commit
c65d6b55b3a592906c470c566f57ad8ceacc1605.

Thanks and regards,
Edjunior


>
> --
> Yao (齐尧)
>
  
Yao Qi Jan. 16, 2018, 9:11 a.m. UTC | #5
Yao Qi <qiyaoltc@gmail.com> writes:

> As described in PR 18749, GDB/GDBserver may get an error on accessing
> memory or register because the thread may disappear.  However, some
> path doesn't expect the error.  This patch fixes this problem by
> marking the register unavailable when PTRACE_PEEKUSER fails instead
> of throwing error.
>
> gdb/gdbserver:
>
> 2017-12-19  Yao Qi  <yao.qi@linaro.org>
>
> 	PR gdb/18749
> 	* linux-low.c (fetch_register): Call supply_register instead of
> 	error.

I pushed it to master.
  

Patch

diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index f6a52d5..0a52a91 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -5555,7 +5555,11 @@  fetch_register (const struct usrregs_info *usrregs,
 		(PTRACE_TYPE_ARG3) (uintptr_t) regaddr, (PTRACE_TYPE_ARG4) 0);
       regaddr += sizeof (PTRACE_XFER_TYPE);
       if (errno != 0)
-	error ("reading register %d: %s", regno, strerror (errno));
+	{
+	  /* Mark register REGNO unavailable.  */
+	  supply_register (regcache, regno, NULL);
+	  return;
+	}
     }
 
   if (the_low_target.supply_ptrace_register)
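
For readers unfamiliar with the pattern the patch changes, here is a
minimal, self-contained sketch in C.  It is not GDBserver code: reg_slot,
peek_fn, fetch_one_register, peek_live and peek_dead are hypothetical
names, and the reader callbacks only simulate PTRACE_PEEKUSER.  It shows
why errno has to be cleared before the read (the returned word may
legitimately be -1) and how a failure can be recorded as "register
unavailable" rather than raised as an error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one register cache slot.  */
struct reg_slot
{
  long value;
  bool available;
};

/* Hypothetical reader mimicking PTRACE_PEEKUSER: returns the word
   read and sets errno on failure.  */
typedef long (*peek_fn) (int regno);

/* Fetch one register.  On failure, mark the slot unavailable and
   return false instead of treating the failure as fatal.  */
static bool
fetch_one_register (struct reg_slot *slot, int regno, peek_fn peek)
{
  assert (slot != NULL && peek != NULL);

  errno = 0;			/* -1 is a valid word, so errno decides.  */
  long word = peek (regno);
  if (errno != 0)
    {
      slot->value = 0;
      slot->available = false;
      return false;
    }

  slot->value = word;
  slot->available = true;
  return true;
}

/* Simulated reader for a thread that is still alive.  */
static long
peek_live (int regno)
{
  return 100 + regno;
}

/* Simulated reader for a thread that has disappeared; ESRCH is the
   errno behind the "No such process" message in the log.  */
static long
peek_dead (int regno)
{
  (void) regno;
  errno = ESRCH;
  return -1;
}
```

Calling fetch_one_register with peek_dead leaves the slot marked
unavailable and lets the caller carry on, which is the behaviour the
patch gives fetch_register when the thread vanishes mid-read.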