New testcase gdb.threads/leader-exit-attach.exp (PR threads/8153)

Message ID be3bb9cb-a610-4284-b057-92c8ab43d046@palves.net
State New
Headers
Series New testcase gdb.threads/leader-exit-attach.exp (PR threads/8153) |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm fail Testing failed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 fail Testing failed

Commit Message

Pedro Alves March 25, 2024, 7:57 p.m. UTC
  On 2024-03-25 19:36, Pedro Alves wrote:
> On 2024-03-23 06:39, Eli Zaretskii wrote:
>>> From: Pedro Alves <pedro@palves.net>
>>> Date: Fri, 22 Mar 2024 19:30:30 +0000
>>>
>>> While working on Windows non-stop mode, I managed to introduce a bug
>>> that led to fake_create_process being called.  That then resulted in
>>> GDB crashes later on, because fake_create_process added a thread with
>>> an incorrect ptid for this target.  It is putting dwThreadId in the
>>> tid field of the ptid instead of on the lwp field.  This is fixed by
>>> this patch.
>>>
>>> I do however wonder why nobody has seen it this long.
>>
>> AFAIU, to actually see the bug, one would need to attach GDB to a
>> process whose main thread has exited, is that true?  If so, I'm not
>> surprised this bug was not reported: it's unusual for the main thread
>> to exit without shutting down the process, and the need to attach to
>> such a process (as opposed to having it run from GDB to begin with)
>> makes that even more rare.  And finally, not every bug is reported by
>> the first person who sees it the first time, right?
> 
> Yes, that could be the reason.  But it could also be because the brokenness with the
> Windows debug API that Chris was seeing only happens on Windows versions we no
> longer claim support for (i.e., earlier than Windows XP).
> 
> Anyhow, the patch is pretty obvious on its own, so I went ahead and merged it
> without that blurb in the commit log, like below.
> 
> I also wrote a testcase that exercises the scenario in question.  I'll post
> that next.


Here's said testcase.  Only two decades between original fix and testcase,
not too bad.  :-)

While writing this, I stumbled on server/31554, and I filed it in bugzilla,
and added a kfail here, to avoid falling deeper down the rabbit hole.

I also filed server/31555 for the can't-attach-to-zombie-task on Linux, and marked
it as kfail instead of xfail as there may be a workaround for that.  (Attach to all
the process's threads anyhow. We'd not get an exit status for the final process exit,
but I think we could live without it).

From ff9c3b19e5c876ed8e5cb5f45f1e3a9873010991 Mon Sep 17 00:00:00 2001
From: Pedro Alves <pedro@palves.net>
Date: Mon, 25 Mar 2024 15:17:02 +0000
Subject: [PATCH] New testcase gdb.threads/leader-exit-attach.exp (PR
 threads/8153)

Add a new testcase for exercising attaching to a process after its
main thread has exited.

This is not possible on Linux, the kernel does not allow attaching to
a zombie task, so the test is kfailed there.  It is possible however
on Windows at least, and was the scenario addressed by the Windows
backend fix in
https://sourceware.org/legacy-ml/gdb-patches/2003-12/msg00479.html,
nowadays PR threads/8153, back in 2003.

Passes cleanly on Cygwin.
KFAILed on GNU/Linux native and gdbserver.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=8153
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31554
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31555
Change-Id: Ib554f92f68c965bb4603cdf2aadb55ca45ded53b
---
 .../gdb.threads/leader-exit-attach.exp        | 87 +++++++++++++++++++
 1 file changed, 87 insertions(+)
 create mode 100644 gdb/testsuite/gdb.threads/leader-exit-attach.exp


base-commit: ccf3148e3133f016a8e1484e85e5e4d8c271c4f0
  

Comments

Tom Tromey March 26, 2024, 3:26 p.m. UTC | #1
>>>>> "Pedro" == Pedro Alves <pedro@palves.net> writes:

Pedro> Here's said testcase.  Only two decades between original fix and testcase,
Pedro> not too bad.  :-)

Does this mean that the phony process stuff is needed and so it should
be ported to gdbserver as well?

Maybe we need another new bug.

Tom
  
Pedro Alves March 26, 2024, 7:27 p.m. UTC | #2
On 2024-03-26 15:26, Tom Tromey wrote:
>>>>>> "Pedro" == Pedro Alves <pedro@palves.net> writes:
> 
> Pedro> Here's said testcase.  Only two decades between original fix and testcase,
> Pedro> not too bad.  :-)
> 
> Does this mean that the phony process stuff is needed and so it should
> be ported to gdbserver as well?

At least on Windows 10, it isn't needed.  The jury is still out on whether it is needed on any Windows version we claim to support (Windows XP and up).
I was hoping we could drop all that hacky stuff from gdb instead of adding it to gdbserver.

I can't run Cygwin tests against gdbserver atm for the some reason I can't explain, but if I try it manually, GDBserver attaches to the process just fine.

If I run the new test on native Cygwin with this:

 --- c/gdb/windows-nat.c
 +++ w/gdb/windows-nat.c
 @@ -1360,6 +1360,8 @@ windows_nat_target::windows_continue (DWORD continue_status, int id,
  DWORD
  windows_nat_target::fake_create_process ()
  {
 +  gdb_assert (0);
 +

... it still passes.

I only tripped on the bad ptid in the phony process code path because I had a bug in the non-stop series that happened to result in that path being incorrectly taken.

Pedro Alves

> 
> Maybe we need another new bug.
> 
> Tom
  
Tom Tromey March 26, 2024, 8:04 p.m. UTC | #3
>>>>> "Pedro" == Pedro Alves <pedro@palves.net> writes:

Pedro> At least on Windows 10, it isn't needed.  The jury is still out
Pedro> on whether it is needed on any Windows version we claim to
Pedro> support (Windows XP and up).

Ok, thanks, I see.

Pedro> I was hoping we could drop all that hacky stuff from gdb instead
Pedro> of adding it to gdbserver.

Yeah, me too.

Tom
  
Pedro Alves April 12, 2024, 5:45 p.m. UTC | #4
On 2024-03-25 19:57, Pedro Alves wrote:

> Here's said testcase.  Only two decades between original fix and testcase,
> not too bad.  :-)

I've merged this one now.
  

Patch

diff --git a/gdb/testsuite/gdb.threads/leader-exit-attach.exp b/gdb/testsuite/gdb.threads/leader-exit-attach.exp
new file mode 100644
index 00000000000..c1ed1baaa67
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/leader-exit-attach.exp
@@ -0,0 +1,87 @@ 
+# Copyright (C) 2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test attaching to a program after its main thread has exited.
+
+require can_spawn_for_attach
+
+standard_testfile leader-exit.c
+
+if {[build_executable "failed to prepare" $testfile $srcfile {debug pthreads}] == -1} {
+    return
+}
+
+set escapedbinfile [string_to_regexp ${binfile}]
+
+set test_spawn_id [spawn_wait_for_attach $binfile]
+set testpid [spawn_id_get_pid $test_spawn_id]
+
+# Wait a bit for the leader thread to exit, before attaching.
+sleep 2
+
+clean_restart ${binfile}
+
+# Save this early as we may not be able to talk with GDBserver anymore
+# when we need to check it.
+set is_gdbserver [target_is_gdbserver]
+
+# True if successfully attached.
+set attached 0
+
+gdb_test_multiple "attach $testpid" "attach" {
+    -re "Attaching to process $testpid failed.*" {
+	# GNU/Linux gdbserver.  Linux ptrace does not let you attach
+	# to zombie threads.
+	setup_kfail "gdb/31555" *-*-linux*
+	fail $gdb_test_name
+    }
+    -re "warning: process $testpid is a zombie - the process has already terminated.*" {
+	# Native GNU/Linux.  Linux ptrace does not let you attach to
+	# zombie threads.
+	setup_kfail "gdb/31555" *-*-linux*
+	fail $gdb_test_name
+    }
+    -re "Attaching to program: $escapedbinfile, process $testpid.*$gdb_prompt $" {
+	pass $gdb_test_name
+	set attached 1
+    }
+}
+
+# With gdbserver, after we failed to attach, we hit PR server/31554:
+#  print $_inferior_thread_count
+#  Remote connection closed
+#  (gdb) KFAIL: gdb.threads/leader-exit-attach.exp: get valueof "$_inferior_thread_count"
+if {!$attached && $is_gdbserver} {
+    setup_kfail "server/31554" "*-*-*"
+}
+
+set thread_count [get_valueof "" "\$_inferior_thread_count" -1]
+
+if {$thread_count == -1} {
+    kill_wait_spawned_process $test_spawn_id
+    return
+}
+
+if {$attached} {
+    # Check that we have at least one thread.  We can't assume there
+    # will only be exactly one thread, because on some systems, like
+    # Cygwin, the runtime spawns extra threads.  Also, on Windows,
+    # attaching always injects one extra thread.
+    gdb_assert {$thread_count >= 1}
+} else {
+    gdb_assert {$thread_count == 0}
+}
+
+kill_wait_spawned_process $test_spawn_id