[v2] GDBserver crashes when killing a multi-thread process

  On 10/07/14 16:16, Pedro Alves wrote:
> +static void
> +kill_wait_lwp (struct lwp_info *lwp)
> +{
> +  struct thread_info *thr = get_lwp_thread (lwp);
> +  int pid = ptid_get_pid (ptid_of (thr));
> +  int lwpid = ptid_get_lwp (ptid_of (thr));
> +  int wstat;
> +  int res;
> +
> +  if (debug_threads)
> +    debug_printf ("kwl: killing lwp %d, for pid: %d\n", lwpid, pid);
> +
> +  do
> +    {
> +      linux_kill_one_lwp (lwp);
> +
> +      /* Make sure it died.  Notes:
> +
> +	 - The loop is most likely unnecessary.
> +
> +         - We don't use linux_wait_for_event as that could delete lwps
> +           while we're iterating over them.  We're not interested in
> +           any pending status at this point, only in making sure all
> +           wait status on the kernel side are collected until the
> +           process is reaped.
> +
> +	 - We don't use __WALL here as the __WALL emulation relies on
> +	   SIGCHLD, and killing a stopped process doesn't generate
> +	   one, nor an exit status.
> +      */
> +      res = my_waitpid (lwpid, &wstat, 0);
> +      if (res == -1 && errno == ECHILD)
> +	res = my_waitpid (lwpid, &wstat, __WCLONE);
> +    } while (res > 0 && WIFSTOPPED (wstat));
> +
> +  gdb_assert (res > 0);
> +}

Hi Pedro,
do you still remember why you added this assert?  It wasn't
mentioned in the mail
https://sourceware.org/ml/gdb-patches/2014-07/msg00206.html

I am looking at a GDBserver internal error on x86_64 when I run
gdb.threads/thread-unwindonsignal.exp with GDBserver:

continue^M
Continuing.^M
warning: Remote failure reply: E.No unwaited-for children left.^M
PC register is not available^M
(gdb) FAIL: gdb.threads/thread-unwindonsignal.exp: continue until exit
Remote debugging from host 127.0.0.1^M
ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
monitor exit^M
Killing process(es): 30694^M
(gdb) /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/linux-low.c:1106: A problem internal to GDBserver has been detected.^M
kill_wait_lwp: Assertion `res > 0' failed.

After your patch https://sourceware.org/ml/gdb-patches/2015-03/msg00597.html
GDBserver started to swallow errors if the LWP is gone.  Then, when
GDBserver kills a nonexistent LWP, the assert is triggered.
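
To illustrate the failure mode outside of GDBserver, here is a minimal
standalone sketch using plain POSIX waitpid.  The helper name
errno_of_wait_on_reaped_child is made up for this example, and the
sketch leaves out ptrace, my_waitpid and __WCLONE entirely.  It only
shows that waiting for a child that is already gone returns -1 with
errno set to ECHILD, which is the case an assertion like `res > 0'
cannot tolerate.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not GDBserver code): fork a child, kill and
   reap it, then wait for it a second time.  Returns the errno set by
   the second waitpid, 0 if that wait unexpectedly succeeded, or -1 if
   the setup failed.  */
static int
errno_of_wait_on_reaped_child (void)
{
  int wstat;
  pid_t child = fork ();

  if (child == -1)
    return -1;
  if (child == 0)
    {
      /* Child: sleep until the parent kills us.  */
      for (;;)
	pause ();
    }

  /* Kill the child and collect its exit status.  */
  kill (child, SIGKILL);
  if (waitpid (child, &wstat, 0) != child)
    return -1;

  /* The child has been reaped, so there is nothing left to wait for.  */
  errno = 0;
  if (waitpid (child, &wstat, 0) > 0)
    return 0;
  return errno;
}
```

Under these assumptions the helper returns ECHILD, matching the "No
unwaited-for children left" text in the remote failure reply above.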

Why don't we implement kill_wait_lwp like its counterpart in GDB,
linux-nat.c:kill_wait_callback?  We can loop and assert as in the
patch below (note that this patch fixes the internal error, but
the FAIL is still there).
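
The general shape of the idea can be sketched with plain POSIX calls.
This is only an illustration under simplifying assumptions, not the
patch itself: kill_and_reap and spawn_idle_child are hypothetical
names, and ptrace, my_waitpid and __WCLONE are left out.  The loop
keeps collecting wait statuses until waitpid reports that nothing is
left, and the assertion then checks for ECHILD instead of requiring a
positive result.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that idles until it is killed.
   Returns the child's pid, or -1 if fork failed.  */
static pid_t
spawn_idle_child (void)
{
  pid_t child = fork ();

  if (child == 0)
    {
      for (;;)
	pause ();
    }
  return child;
}

/* Hypothetical helper (not GDBserver code): kill PID and collect wait
   statuses until the kernel reports that no child is left.  Returns
   the number of statuses collected, or -1 for an invalid PID.  */
static int
kill_and_reap (pid_t pid)
{
  int wstat;
  int collected = 0;
  pid_t res;

  if (pid <= 0)
    return -1;

  kill (pid, SIGKILL);

  do
    {
      res = waitpid (pid, &wstat, 0);
      if (res == pid)
	{
	  collected++;
	  /* Only a stopped child is still alive; kill it again.  */
	  if (WIFSTOPPED (wstat))
	    kill (pid, SIGKILL);
	}
    }
  while (res == pid || (res == -1 && errno == EINTR));

  /* A child that is already gone is the expected end state.  */
  assert (res == -1 && errno == ECHILD);
  return collected;
}
```

For example, kill_and_reap (spawn_idle_child ()) collects one status
and returns once the child is reaped, and the closing assertion holds
because the last waitpid fails with ECHILD.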
