From patchwork Thu Dec 14 20:22:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pedro Alves X-Patchwork-Id: 82167 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 06AC73861881 for ; Thu, 14 Dec 2023 20:23:01 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from mail-wr1-f51.google.com (mail-wr1-f51.google.com [209.85.221.51]) by sourceware.org (Postfix) with ESMTPS id DE6FD385C6DD for ; Thu, 14 Dec 2023 20:22:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DE6FD385C6DD Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=palves.net Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org DE6FD385C6DD Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=209.85.221.51 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702585368; cv=none; b=tyBf2pVFQxu4nHQpZPMBFenXBrjHpwGkHrkFnokcJoZeeElM6Cpa6t6Zo+OFrMpiIe1jnQZw6bt1qj7yrvDHp958gjWtVoxTI4GvDRVbuZW9xCBQu8k7kGazkRdAIRtlid9AN/02lRBbJqziab7S5Eqry9haEsEkVX7y+LtgCVA= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702585368; c=relaxed/simple; bh=9MRAx0PMJhJk7ma/DlxZTPvu9S/MxrJIBCQgkf6C+DQ=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=XTkVDG9yyEPDqNF4zg/Dq4oAHfoKvbb3n4xQchNo0CTNd9w4xw04auEq0QGoyhPWqGWlNlG2YIexmErbiXDwXCLCKa/+Sv2Cp253TDjduhSMR3IvJ2qEwxOgamNDyg7P+bKebQ4jblNhQ6tNQGE4RUkxURLyEDY4ilSvhfjsACo= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-wr1-f51.google.com with SMTP id ffacd0b85a97d-336447f240cso1087529f8f.3 for ; Thu, 14 Dec 2023 12:22:45 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702585364; x=1703190164; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=8+Svmzskv6A4yq094F0huGCx4/mjQOR3fs87alIeD9M=; b=wBJRUqyqmnyxjor4c0YQTFJygkg3BziiK4AOR+K0myk3dsYt2kRLasoygmRLdrU0B7 T8rPvbAuRqW11x8R3h3xC8uaomUu2+JFBb9BR88Ecjndwe81BBdl6lv6xQkXm5urW36K cg5AGtus4auAy/lSTfGf1GqySl2+mmwzHsHjo8MfgYhpLIAdDAsex2TYsLPEdisCLsC0 e5rpTwlqable4jL0g6D6rFWUfxLqOByYd/N2hdnahIs0nGc/v6wcucj3sXd98l1fOuP0 gllP3U6KafkcJySlZ+x9NC8As2sgA+EYMqHaRqPQ5PliLT67XE829OjPLF0ZUR5M+3sZ EmAQ== X-Gm-Message-State: AOJu0Yx91NCCn/AkxOlKhdV1hTCcoJ3e36Ja5y3b8PHFm9DklZXklfC3 JrF1HXXiEJUaJ/SFeQ0UOH2NWR8sPtwuQg== X-Google-Smtp-Source: AGHT+IGv+bsGp+4txsAXNoVqOxbhVD+axm2ouG7OPZSd/kva2nsdkYzkL/cyBmqcR8NsjQ+FePHPBw== X-Received: by 2002:a5d:620c:0:b0:334:b14d:c57 with SMTP id y12-20020a5d620c000000b00334b14d0c57mr5616848wru.22.1702585363731; Thu, 14 Dec 2023 12:22:43 -0800 (PST) Received: from localhost ([2001:8a0:f923:4f00:2646:535c:5a04:e380]) by smtp.gmail.com with UTF8SMTPSA id e10-20020adfe7ca000000b003335ddce799sm3625857wrn.103.2023.12.14.12.22.43 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 14 Dec 2023 12:22:43 -0800 (PST) From: Pedro Alves To: gdb-patches@sourceware.org Subject: [PATCH 1/8] gdb.threads/step-over-thread-exit.exp improvements Date: Thu, 14 Dec 2023 20:22:31 +0000 Message-ID: <20231214202238.1065676-2-pedro@palves.net> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231214202238.1065676-1-pedro@palves.net> References: <20231214202238.1065676-1-pedro@palves.net> MIME-Version: 1.0 X-Spam-Status: No, score=-9.7 required=5.0 tests=BAYES_00, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org This commit makes the following improvements to gdb.threads/step-over-thread-exit.exp: - Add a third axis to stepping over the breakpoint with displaced vs inline stepping -- also test with no breakpoint at all. - Check that when GDB reports "Command aborted, thread exited.", the selected thread is the thread that exited. This is always true currently on GNU/Linux by coincidence, but a similar testcase on AMD GPU exposed a problem here. Better make the testcase catch any potential regression. - Fixes a race that Simon ran into with GDBserver testing. (gdb) next [New Thread 2143071.2143438] Thread 3 "step-over-threa" hit Breakpoint 2, 0x000055555555524e in my_exit_syscall () at .../testsuite/lib/my-syscalls.S:74 74 SYSCALL (my_exit, __NR_exit) (gdb) FAIL: gdb.threads/step-over-thread-exit.exp: displaced-stepping=auto: non-stop=on: target-non-stop=on: schedlock=off: cmd=next: ns_stop_all=0: command aborts when thread exits I was not able to reproduce it, but I believe that what happens is the following: Once we continue, the thread 2 exits, and the main thread thus unblocks from its pthread_join, and spawns a new thread. That new thread may hit the breakpoint at my_exit_syscall very quickly. GDB could then see/process that breakpoint event before the thread exit event for the thread we care about, which would result in the failure seen above. The fix here is to not loop and start a new thread at all in the scenario where the race can happen. We only need to loop and spawn new threads when testing with "cmd=continue" and schedlock off, in which case GDB doesn't abort the command when the thread exits. Change-Id: I90c95c32f00630a3f682b1541c23aff52451f9b6 --- .../gdb.threads/step-over-thread-exit.c | 16 ++- .../gdb.threads/step-over-thread-exit.exp | 127 +++++++++++++++--- 2 files changed, 119 insertions(+), 24 deletions(-) diff --git a/gdb/testsuite/gdb.threads/step-over-thread-exit.c b/gdb/testsuite/gdb.threads/step-over-thread-exit.c index 878e5924c5c..218f003b205 100644 --- a/gdb/testsuite/gdb.threads/step-over-thread-exit.c +++ b/gdb/testsuite/gdb.threads/step-over-thread-exit.c @@ -18,6 +18,7 @@ #include #include #include +#include #include "../lib/my-syscalls.h" static void * @@ -30,13 +31,19 @@ thread_func (void *arg) abort (); } +/* Number of threads we'll create. */ +int n_threads = 100; + int -main (void) +main (int argc, char **argv) { int i; - /* Spawn and join a thread, 100 times. */ - for (i = 0; i < 100; i++) + if (argc > 1) + n_threads = atoi (argv[1]); + + /* Spawn and join a thread, N_THREADS times. */ + for (i = 0; i < n_threads; i++) { pthread_t thread; int ret; @@ -48,5 +55,8 @@ main (void) assert (ret == 0); } + /* Some time to make sure that GDB processes the thread exit event + before the whole-process exit. */ + sleep (3); return 0; } diff --git a/gdb/testsuite/gdb.threads/step-over-thread-exit.exp b/gdb/testsuite/gdb.threads/step-over-thread-exit.exp index 615bd838763..32f64ce1a3e 100644 --- a/gdb/testsuite/gdb.threads/step-over-thread-exit.exp +++ b/gdb/testsuite/gdb.threads/step-over-thread-exit.exp @@ -25,11 +25,29 @@ if { [build_executable "failed to prepare" $testfile \ return } -# Each argument is a different testing axis, most of them obvious. +# Test stepping/continuing at an exit syscall instruction. +# +# Each argument is a different testing axis. +# +# STEP_OVER_MODE can be one of: +# +# - none: don't put a breakpoint on the exit syscall instruction. +# +# - inline: put a breakpoint on the exit syscall instruction, and +# use in-line stepping to step over it (disable +# displaced-stepping). +# +# - displaced: same, but use displaced stepping. +# +# SCHEDLOCK can be "on" or "off". +# +# CMD is the GDB command to run when at the exit syscall instruction. +# # NS_STOP_ALL is only used if testing "set non-stop on", and indicates # whether to have GDB explicitly stop all threads before continuing to # thread exit. -proc test {displaced-stepping non-stop target-non-stop schedlock cmd ns_stop_all} { +# +proc test {step_over_mode non-stop target-non-stop schedlock cmd ns_stop_all} { if {${non-stop} == "off" && $ns_stop_all} { error "invalid arguments" } @@ -40,23 +58,29 @@ proc test {displaced-stepping non-stop target-non-stop schedlock cmd ns_stop_all clean_restart $::binfile } - gdb_test_no_output "set displaced-stepping ${displaced-stepping}" - - if { ![runto_main] } { - return + if { $step_over_mode == "none" } { + # Nothing to do. + } elseif { $step_over_mode == "inline" } { + gdb_test_no_output "set displaced-stepping off" + } elseif { $step_over_mode == "displaced" } { + gdb_test_no_output "set displaced-stepping on" + } else { + error "Invalid step_over_mode value: $step_over_mode" } - gdb_breakpoint "my_exit_syscall" - if {$schedlock || (${non-stop} == "on" && $ns_stop_all)} { - gdb_test "continue" \ - "Thread 2 .*hit Breakpoint $::decimal.* my_exit_syscall .*" \ - "continue until syscall" + + gdb_test_no_output "set args 1" + + if { ![runto my_exit_syscall] } { + return + } if {${non-stop} == "on"} { # The test only spawns one thread at a time, so this just - # stops the main thread. + # stops the main thread. IOW, we only need to wait for + # one stop. gdb_test_multiple "interrupt -a" "" { -re "$::gdb_prompt " { gdb_test_multiple "" $gdb_test_name { @@ -66,12 +90,19 @@ proc test {displaced-stepping non-stop target-non-stop schedlock cmd ns_stop_all } } } - } - gdb_test "thread 2" "Switching to thread 2 .*" + gdb_test "thread 2" "Switching to thread 2 .*" + } gdb_test_no_output "set scheduler-locking ${schedlock}" + # If testing a step-over is requested, leave the breakpoint at + # the current instruction to force a step-over; otherwise, + # remove it. + if { $step_over_mode == "none" } { + delete_breakpoints + } + if {$cmd == "continue"} { gdb_test "continue" \ "No unwaited-for children left." \ @@ -84,9 +115,50 @@ proc test {displaced-stepping non-stop target-non-stop schedlock cmd ns_stop_all } } } else { + # Schedlock is off here. + # + # With "continue" and no scheduler-locking, GDB doesn't stop + # with "Command aborted, thread exited." when the thread + # exits, it just lets the inferior continue running freely. + # So we test that we can move past the thread exit, and that + # other threads can be freely scheduled. We do that by + # spawning another thread as soon as the first exit. We test + # that a number of times. This should also exercise GDB's + # handling of inline or displaced step-overs, that GDB handles + # the related resource accounting correctly when the stepping + # thread exits, etc. + # + # With "continue" and $step_over_mode == "none" however, after + # the first my_exit_syscall breakpoint hit, we will remove the + # breakpoint, so no other thread would ever hit it again. So + # might as well just test one thread. + # + # With step/next, GDB aborts the execution command with + # "Command aborted, thread exited." when the stepping thread + # exits. If we let the main spawn another thread as soon as + # the first exits, it would be possible for that new thread to + # hit the exit syscall insn breakpoint quickly enough that it + # would be reported to be user before the first thread exit + # would be, which would confuse testing. To avoid that, we + # only spawn one thread, too. + # + if {$cmd != "continue" || $step_over_mode == "none"} { + set n_threads 1 + } else { + set n_threads 100 + } + + gdb_test_no_output "set args $n_threads" + + if { ![runto_main] } { + return + } + + gdb_breakpoint "my_exit_syscall" + gdb_test_no_output "set scheduler-locking ${schedlock}" - if {$cmd != "continue"} { + if {$cmd != "continue" || $step_over_mode == "none"} { set thread "" gdb_test_multiple "continue" "" { -re -wrap "Thread ($::decimal) .*hit Breakpoint $::decimal.* my_exit_syscall .*" { @@ -98,10 +170,23 @@ proc test {displaced-stepping non-stop target-non-stop schedlock cmd ns_stop_all "switch to event thread" } - gdb_test_multiple $cmd "command aborts when thread exits" { - -re "Command aborted, thread exited\\.\r\n$::gdb_prompt " { - pass $gdb_test_name + # If testing a step-over is requested, leave the breakpoint at + # the current instruction to force a step-over; otherwise, + # remove it. + if { $step_over_mode == "none" } { + delete_breakpoints + } + + if {$cmd == "continue"} { + gdb_continue_to_end "continue to end" "continue" 1 + } else { + gdb_test_multiple $cmd "command aborts when thread exits" { + -re "Command aborted, thread exited\\.\r\n$::gdb_prompt " { + pass $gdb_test_name + } } + gdb_test "p \$_thread == $thread" "= 1" \ + "selected thread didn't change" } } else { for { set i 0 } { $i < 100 } { incr i } { @@ -130,7 +215,7 @@ proc test {displaced-stepping non-stop target-non-stop schedlock cmd ns_stop_all } } -foreach_with_prefix displaced-stepping {off auto} { +foreach_with_prefix step_over_mode {none inline displaced} { foreach_with_prefix non-stop {off on} { foreach_with_prefix target-non-stop {off on} { if {${non-stop} == "on" && ${target-non-stop} == "off"} { @@ -142,11 +227,11 @@ foreach_with_prefix displaced-stepping {off auto} { foreach_with_prefix cmd {"next" "continue"} { if {${non-stop} == "on"} { foreach_with_prefix ns_stop_all {0 1} { - test ${displaced-stepping} ${non-stop} ${target-non-stop} \ + test ${step_over_mode} ${non-stop} ${target-non-stop} \ ${schedlock} ${cmd} ${ns_stop_all} } } else { - test ${displaced-stepping} ${non-stop} ${target-non-stop} ${schedlock} ${cmd} 0 + test ${step_over_mode} ${non-stop} ${target-non-stop} ${schedlock} ${cmd} 0 } } } From patchwork Thu Dec 14 20:22:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pedro Alves X-Patchwork-Id: 82168 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D4F84384F02C for ; Thu, 14 Dec 2023 20:23:12 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from mail-wm1-f54.google.com (mail-wm1-f54.google.com [209.85.128.54]) by sourceware.org (Postfix) with ESMTPS id 002933861884 for ; Thu, 14 Dec 2023 20:22:47 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 002933861884 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=palves.net Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 002933861884 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=209.85.128.54 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702585370; cv=none; b=IRlHkYCPT8NUv/JmC0/TR2hGO9t4CILqH5O6DG/f4sFqO1Q7pRNkvpqfF+ZpcCPjOEsq316bHMPLa3KQ6sgm0sFvfXGSgkSAwE//AS137HIRBTCyaqwyYh9VC4ClP5kbT73h+NisWCK7hynLzscvzazbaFWI3g1fzIYOseo1Cvg= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702585370; c=relaxed/simple; bh=mgS9qWNoeXPiT9OqO7/JiZDfMoOUcGS/BhVbYysShW0=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=KyxBnbJBSwNZLEPRG9nQqVI/TMPVVm9aMvcj/Bs3bYNVc1AgGnZfXeRj++f8w9A2S03sw03VADVEFX2ZCkWwT6VNNRqGvKslMEULQJq8C9A4bMQLs3NiTHV/Kh0tteaqiAtPv5XCCFe9OGefcxdfL6uMTLKGk7fH1HuxpnHdgP0= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-wm1-f54.google.com with SMTP id 5b1f17b1804b1-40c69403b3eso1511335e9.3 for ; Thu, 14 Dec 2023 12:22:47 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702585366; x=1703190166; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=JCsAKoIPbVGLOzyEK27t/7ZN7astFqJODNx4mVHCmHM=; b=ZBJ4hjlAk4yUeJEEj2hAcWvYykB2+OGfiqGUAa+GMyv9ZzLJlsZo/jZvUnn9Lo3bk3 GG4Rr8IDU78GDKRCi3g3ppa4U041avr2RnCX7iAuVtwpL/zcj02R1W7z2abqLGf9KhSP 9IskaP3FWEVKFnA4o+ltUzciMtPM+QjAIt4dRkaZNu5yL2HlH7AjycvzOJ6O93Onfxkw 1QKcbqgadegMq/xkyxtKXTgdxVL9pDmoiC+RxDYlwWfBte0V86Le1BeyWsuvqXATw/SD biPAMiuPMsrbtrbv6QOMx3ScBhPln8mQDcc01z9P9sX99quFqicZod50U1fDdCjdm6AG Xe7w== X-Gm-Message-State: AOJu0Yy5GEWOlwKCnugxI77w2FiVAu6o0fm9ZAA1a6gpON2o6xCqMOn7 pxC3j++bivQ2nfLzEpdUSXQKQxcAsLBURw== X-Google-Smtp-Source: AGHT+IGo9bulI6CwokWPqZ+LMOjygaU9C2wYlG8lMgA4FfJQ1j2k9fxcCPArwbgXYq4d8aSabVhz8Q== X-Received: by 2002:a05:600c:3b9f:b0:40c:2878:35ec with SMTP id n31-20020a05600c3b9f00b0040c287835ecmr5620876wms.131.1702585366569; Thu, 14 Dec 2023 12:22:46 -0800 (PST) Received: from localhost ([2001:8a0:f923:4f00:2646:535c:5a04:e380]) by smtp.gmail.com with UTF8SMTPSA id g20-20020a05600c4ed400b0040c31bb66dcsm25658587wmq.20.2023.12.14.12.22.45 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 14 Dec 2023 12:22:46 -0800 (PST) From: Pedro Alves To: gdb-patches@sourceware.org Subject: [PATCH 2/8] Ensure selected thread after thread exit stop Date: Thu, 14 Dec 2023 20:22:32 +0000 Message-ID: <20231214202238.1065676-3-pedro@palves.net> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231214202238.1065676-1-pedro@palves.net> References: <20231214202238.1065676-1-pedro@palves.net> MIME-Version: 1.0 X-Spam-Status: No, score=-9.9 required=5.0 tests=BAYES_00, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org While making step over thread exit work properly on AMDGPU, I noticed that if there's a breakpoint on top of the exit syscall, and, displaced stepping is off, then when GDB reports "Command aborted, thread exited.", GDB also switches focus to a random thread, instead of leaving the exited thread as selected: (gdb) thread [Current thread is 6, lane 0 (AMDGPU Lane 1:4:1:1/0 (0,0,0)[0,0,0])] (gdb) si Command aborted, thread exited. (gdb) thread [Current thread is 5 (Thread 0x7ffff626f640 (LWP 3248392))] (gdb) The previous patch extended gdb.threads/step-over-thread-exit.exp to exercise this on GNU/Linux (on the CPU side), and there, after that "si", we always end up with the exiting thread as selected even without this fix, but that's just a concidence, there's a code path that happens to select the exiting thread for an unrelated reason. This commit add the explict switch, fixing the latent problem for GNU/Linux, and the actual problem on AMDGPU. I wrote a gdb.rocm/ testcase for this, but it can't be upstreamed yet, until more pieces of the DWARF machinery are upstream as well. Change-Id: I6ff57a79514ac0142bba35c749fe83d53d9e4e51 --- gdb/infrun.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/gdb/infrun.c b/gdb/infrun.c index 45c1b4a79bb..6dc0a2bb9a5 100644 --- a/gdb/infrun.c +++ b/gdb/infrun.c @@ -5895,7 +5895,14 @@ handle_thread_exited (execution_control_state *ecs) if (abort_cmd) { + /* We're stopping for the thread exit event. Switch to the + event thread again, as finish_step_over may have switched + threads. */ + switch_to_thread (ecs->event_thread); + + /* Emit [Thread ... exited] notification. */ delete_thread (ecs->event_thread); + ecs->event_thread = nullptr; return false; } From patchwork Thu Dec 14 20:22:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pedro Alves X-Patchwork-Id: 82169 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3F435384F02F for ; Thu, 14 Dec 2023 20:23:19 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from mail-wm1-f53.google.com (mail-wm1-f53.google.com [209.85.128.53]) by sourceware.org (Postfix) with ESMTPS id 1D5E4386188E for ; Thu, 14 Dec 2023 20:22:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1D5E4386188E Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=palves.net Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 1D5E4386188E Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=209.85.128.53 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702585373; cv=none; b=GDyRDy1CN6vlmzFrbn97ZKQl/AMqJMLeVoGBo85tnhyebN9oPHGHBleG/5RaPZg2hWmFvkVu66oLhWt010x+slQO779kwb+vMVzQBdHh0+N9bPc5Dms1wG6MFkVv6QtgxelUZxDDx65fcTOwAYWT3NzNGys3A6a07uPYB1jvBc4= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702585373; c=relaxed/simple; bh=cMEciRgwilbSSSaV4e9UdWUCdf2W2drWN475p4+jrms=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=bUZ6i/kDG0q6zFnFRgeEtaHlum6hxMD7XQHLy7iHn135dYjo/eKl869zzT9+ZfAc63Hsn49kHnwlCcbxxZL0Bv5mb1lsjyh42g09bg71orm79jej47pgNzE/yBS/GCkrbA+PBvtxs/W2HcMgtxgubkrMUVsSWaJOIwzHj9dBNtc= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-wm1-f53.google.com with SMTP id 5b1f17b1804b1-40c68c1990dso2699975e9.0 for ; Thu, 14 Dec 2023 12:22:52 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702585370; x=1703190170; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=4eDjbyBobp3oHIVPFVu+rV37b/E/bVOIE/BV7bj8YwQ=; b=w+kXFr7bBc9GgBZEtjSjdLLbQ5l+I7fhT/jxgCT8yDLidd1n83PZrCZ2Kc2tZiulGW RDN7VrXmRjExwEJhQl4zYgyWrwVPVF68xauRFU1OXX8f6bf5swN+RjdApXZ9gdO5iDft zb33m3ftFeVi0x78VFCyKg9An80XbJGswdtzJKUDmZ3eDDog6u1ASfhZAWHDF3AUneNZ gLyjAMaGoW1whSBu57QjN38dGmWP9Eyv/TvnOaWzxKHKGwspCWDkgESO8/+XMDyjLfKf um0RisrNiIv1aJpGjdL/ZeXyYYJ9vsKKbzBFTsV2ge3f/SlA4ag6z5Xx6slR3MR5G6hr H7zg== X-Gm-Message-State: AOJu0YwfINDVFJIJ7zBvOH08yV6VFfoub/OilrjYOWVt9jGi7jwfTGBo J0R1sGTw5OKjwwHcPHsiAPqiHWdgcp2NTA== X-Google-Smtp-Source: AGHT+IHc51zl9VavjTTmwY+btA+Z9gMwMYf1z0U6QsrUAHkHqOEFtpEzEjbDGmsae+DVfHyKdFR1qA== X-Received: by 2002:a05:600c:ad0:b0:40c:2a4c:b265 with SMTP id c16-20020a05600c0ad000b0040c2a4cb265mr3838897wmr.124.1702585370490; Thu, 14 Dec 2023 12:22:50 -0800 (PST) Received: from localhost ([168.182.197.117]) by smtp.gmail.com with UTF8SMTPSA id a11-20020adfeecb000000b00333b17432c0sm16898661wrp.28.2023.12.14.12.22.49 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 14 Dec 2023 12:22:49 -0800 (PST) From: Pedro Alves To: gdb-patches@sourceware.org Subject: [PATCH 3/8] displaced_step_finish: Don't fetch the regcache of exited threads Date: Thu, 14 Dec 2023 20:22:33 +0000 Message-ID: <20231214202238.1065676-4-pedro@palves.net> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231214202238.1065676-1-pedro@palves.net> References: <20231214202238.1065676-1-pedro@palves.net> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org displaced_step_finish can be called with event_status.kind == TARGET_WAITKIND_THREAD_EXITED, and in that case it is not possible to get at the already-exited thread's registers. This patch moves the get_thread_regcache calls to branches that actually need it, where we know the thread is still alive. It also adds an assertion to get_thread_regcache, to help catching these broken cases sooner. Change-Id: I63b5eacb3e02a538fc5087c270d8025adfda88c3 --- gdb/infrun.c | 19 ++++++++++++------- gdb/regcache.c | 2 ++ 2 files changed, 14 insertions(+), 7 deletions(-) diff --git a/gdb/infrun.c b/gdb/infrun.c index 6dc0a2bb9a5..76693a30611 100644 --- a/gdb/infrun.c +++ b/gdb/infrun.c @@ -2021,8 +2021,6 @@ displaced_step_finish (thread_info *event_thread, const target_waitstatus &event_status) { /* Check whether the parent is displaced stepping. */ - struct regcache *regcache = get_thread_regcache (event_thread); - struct gdbarch *gdbarch = regcache->arch (); inferior *parent_inf = event_thread->inf; /* If this was a fork/vfork/clone, this event indicates that the @@ -2040,10 +2038,15 @@ displaced_step_finish (thread_info *event_thread, gdbarch_displaced_step_restore_all_in_ptid. This is not enforced during gdbarch validation to support architectures which support displaced stepping but not forks. */ - if (event_status.kind () == TARGET_WAITKIND_FORKED - && gdbarch_supports_displaced_stepping (gdbarch)) - gdbarch_displaced_step_restore_all_in_ptid - (gdbarch, parent_inf, event_status.child_ptid ()); + if (event_status.kind () == TARGET_WAITKIND_FORKED) + { + struct regcache *parent_regcache = get_thread_regcache (event_thread); + struct gdbarch *gdbarch = parent_regcache->arch (); + + if (gdbarch_supports_displaced_stepping (gdbarch)) + gdbarch_displaced_step_restore_all_in_ptid + (gdbarch, parent_inf, event_status.child_ptid ()); + } displaced_step_thread_state *displaced = &event_thread->displaced_step_state; @@ -2082,11 +2085,13 @@ displaced_step_finish (thread_info *event_thread, child hasn't been added to the inferior list yet at this point. */ + struct regcache *parent_regcache = get_thread_regcache (event_thread); + struct gdbarch *gdbarch = parent_regcache->arch (); struct regcache *child_regcache = get_thread_arch_regcache (parent_inf, event_status.child_ptid (), gdbarch); /* Read PC value of parent. */ - CORE_ADDR parent_pc = regcache_read_pc (regcache); + CORE_ADDR parent_pc = regcache_read_pc (parent_regcache); displaced_debug_printf ("write child pc from %s to %s", paddress (gdbarch, diff --git a/gdb/regcache.c b/gdb/regcache.c index e46a0b58f50..f9bf1ecbe12 100644 --- a/gdb/regcache.c +++ b/gdb/regcache.c @@ -411,6 +411,8 @@ get_thread_regcache (process_stratum_target *target, ptid_t ptid) struct regcache * get_thread_regcache (thread_info *thread) { + gdb_assert (thread->state != THREAD_EXITED); + return get_thread_regcache (thread->inf->process_target (), thread->ptid); } From patchwork Thu Dec 14 20:22:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pedro Alves X-Patchwork-Id: 82170 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 502FF384F01B for ; Thu, 14 Dec 2023 20:23:23 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from mail-wr1-f47.google.com (mail-wr1-f47.google.com [209.85.221.47]) by sourceware.org (Postfix) with ESMTPS id C096738618F5 for ; Thu, 14 Dec 2023 20:22:53 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C096738618F5 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=palves.net Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org C096738618F5 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=209.85.221.47 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702585375; cv=none; b=tLV86kNb9qf13so9w3+mQzbk6omCDtgA3pI8LIleY24FYt1AK7K/BGMWNLx0zVCkoDvhQuUe5bur/I7oBBdvZOkGzEZxaQKt+ah16orSrVPDU53ykXpNoUxdE53HGOq82YFB/dCzrybzGId04jLeg4VM+MTBXbEsnW300VrvdaI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702585375; c=relaxed/simple; bh=aGcGeQVOc9ii0UHHGkkKHcFn99d+pPDOh020UrfgfX0=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=xXfr8tDDRsUVQcRIOT+9Ukj3BZD+e8SiUk6LwMS4CSOXYzJ9Zz/YDuTYqApfj2Y9f5V4uA5eRdhgeqPF64zXGzmf4uRAzI4xBmnqjMgoYylziZobgrDCfj5ddlfVZ1TvNexlU7Z5U9idH3wBnAcA8f0/wT05GhtDlrmL8n+BcXY= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-wr1-f47.google.com with SMTP id ffacd0b85a97d-336445a2749so1189325f8f.0 for ; Thu, 14 Dec 2023 12:22:53 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702585372; x=1703190172; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=I1Higu9aMo8ZYN7IeeDxxEewR7Av2+i3U/0AhwN9Aas=; b=ua1uYLdY0O2agR257SwnmTYrt8X9DedXkjdwR0KI/ZYkY9ORZHSnbY1J1pqLref0fV gI8vvuZdTRoOqIvDSkzvWUVa7+EbM5J53bPFsT/opRsRSUYtASuRqcOOufBV7Dw+2GDP bJR5Xjwop3iPn9fNa/B5GL2+OzwcsUkY5wdBU39Dzx6Q9eWJDxbTP9/mad67sJLyC1iv 6fWHp3SLIEHixb5enwTGfJAAipaAh3Yl+LSvZl+WS5B6o0jdXyLwACyAZ4DUVxsRnszS Jd5RcBiGSo2+cxObHCRIroj3DczUZno4askX/cY33Cnr9X3ZCAyYoaidULs4zDvjWG5q HWsQ== X-Gm-Message-State: AOJu0YzPXtp++3e1K2HJprAT0XD2wetQ4Ha7/IviCU34baV/Qf+X4Zjk Dzkx7MX6jbhacN+brn3VFV/xgmln1eeipQ== X-Google-Smtp-Source: AGHT+IGYfe7NWkxbLq7jWqpBFUnFJp4v7u5amqgadoEoXOeuWsVQTiTE2djdyuBvn6pBa/IZmI253w== X-Received: by 2002:a5d:650d:0:b0:336:3db7:6e3e with SMTP id x13-20020a5d650d000000b003363db76e3emr1586507wru.96.1702585372362; Thu, 14 Dec 2023 12:22:52 -0800 (PST) Received: from localhost ([2001:8a0:f923:4f00:2646:535c:5a04:e380]) by smtp.gmail.com with UTF8SMTPSA id t5-20020a5d4605000000b0033646bf3e50sm2770469wrq.102.2023.12.14.12.22.51 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 14 Dec 2023 12:22:51 -0800 (PST) From: Pedro Alves To: gdb-patches@sourceware.org Subject: [PATCH 4/8] Step over thread exit, always delete the thread non-silently Date: Thu, 14 Dec 2023 20:22:34 +0000 Message-ID: <20231214202238.1065676-5-pedro@palves.net> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231214202238.1065676-1-pedro@palves.net> References: <20231214202238.1065676-1-pedro@palves.net> MIME-Version: 1.0 X-Spam-Status: No, score=-10.1 required=5.0 tests=BAYES_00, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org With AMD GPU debugging, I noticed that when stepping over a breakpoint placed on top of the s_endpgm instruction inline (displaced=off), GDB would behave differently -- it wouldn't print the wave exit. E.g: With displaced stepping, or no breakpoint at all: stepi [AMDGPU Wave 1:4:1:1 (0,0,0)/0 exited] Command aborted, thread exited. (gdb) With inline stepping: stepi Command aborted, thread exited. (gdb) In the cases we see the "exited" notification, handle_thread_exit is what first called delete_thread on the exiting thread, which is non-silent. With inline stepping, however, handle_thread_exit ends up in update_thread_list (via restart_threads) before any delete_thread call. Thus, amd_dbgapi_target::update_thread_list notices that the wave is gone and deletes it with delete_thread_silent. This commit fixes it, by making handle_thread_exited call set_thread_exited (with the default silent=false) early, which emits the user-visible notification. Change-Id: I22ab3145e18d07c99dace45576307b9f9d5d966f --- gdb/infrun.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/gdb/infrun.c b/gdb/infrun.c index 76693a30611..1d863896c40 100644 --- a/gdb/infrun.c +++ b/gdb/infrun.c @@ -5885,6 +5885,13 @@ handle_thread_exited (execution_control_state *ecs) update the thread list and delete the event thread. */ bool abort_cmd = (ecs->event_thread->thread_fsm () != nullptr); + /* Mark the thread exited right now, because finish_step_over may + update the thread list and that may delete the thread silently + (depending on target), while we always want to emit the "[Thread + ... exited]" notification. Don't actually delete the thread yet, + because we need to pass its pointer down to finish_step_over. */ + set_thread_exited (ecs->event_thread); + /* Maybe the thread was doing a step-over, if so release resources and start any further pending step-overs. @@ -5904,10 +5911,6 @@ handle_thread_exited (execution_control_state *ecs) event thread again, as finish_step_over may have switched threads. */ switch_to_thread (ecs->event_thread); - - /* Emit [Thread ... exited] notification. */ - delete_thread (ecs->event_thread); - ecs->event_thread = nullptr; return false; } From patchwork Thu Dec 14 20:22:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pedro Alves X-Patchwork-Id: 82171 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9E494385DC2B for ; Thu, 14 Dec 2023 20:23:30 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from mail-wm1-f49.google.com (mail-wm1-f49.google.com [209.85.128.49]) by sourceware.org (Postfix) with ESMTPS id EC831386180F for ; Thu, 14 Dec 2023 20:22:56 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org EC831386180F Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=palves.net Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org EC831386180F Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=209.85.128.49 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702585379; cv=none; b=SQkGE9+U+k39IuC5GmMubWLwJh0OYlxw8bBJSqpy+/FRxaFB98UqaR/JY6rBVdu2MK9FKMmm7yvm2AvLugWGeYz0RGp6gMwuKuw3Z8exo3F8UgGgPs2pYtRDItZluoo0PdqjT8jRApctO9h1dslNwkDzfRMeypall3XORyFbou4= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702585379; c=relaxed/simple; bh=XZdEJe6u3iOT2+x14U7jDwCsHO5hJ3NqtfA2yuGnQMc=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=oA0T4WONP2pZzyLyRO2htzOyHAxq4ljYYrOMjoCaFdZYBHQIAsGbQitaoHqLHh3VJV8xBixFzoIdzCYgVl6LSNjWZm4gZRKyRaI3DBiezI5rdLqFB2gSbLSaAOqHVLYfvPBKQ2kRYkkAtQoxCVfiIllppEDsfU4H+FecKMWSBTY= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-wm1-f49.google.com with SMTP id 5b1f17b1804b1-40c19f5f822so6483855e9.1 for ; Thu, 14 Dec 2023 12:22:56 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702585375; x=1703190175; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ZVxys935Ibo8wbBWOO6bd+d2WHWLEtbLyOhr4/ddy4k=; b=UTP1GaOhLMKOEOt2Z70ojcoNAOMPGkwUk4g+R5BYAcRj+0BBr1S/zpYXtf/DIhQoF8 UWKDSwjqs25OaCUb2zR1jO01H0+DJ+6IMWZ7VuU3Q51hYKiQFX6iayyhzXlOeqnkjxUk ow7btYir1DlMMq4Y5Jie8giLUkXGQxG7Szi5hzwjIu3ZeoLZ99pl3uzOEiUE3QbBfwtb j/CGnxVAGk5rzSDBniwV21wr2MySocO552UUKbAmECyEwcO5dbjSxf4YmHyU25KWMX+v 6SvquY8Jwyps03Rqnxsut2TB4+Y9cCBRitrdvu7FNWHbfZfY0C51xYTstcwR+Ahsmloz X/mQ== X-Gm-Message-State: AOJu0YxONMO8yDTAroLxFdTynGbIkt2uCta589QrKLm6o3w8xwvAYCL5 83eTRG9Ch3jwuUvqzj+b9PS2tmmC3q3N1g== X-Google-Smtp-Source: AGHT+IHPiQCsQmV15yP5Li5ZBUU+m1wpMfKlAz+rUO6EunCv6HCv5adOpn3Znn+i46vPikbYtFnl5A== X-Received: by 2002:a05:600c:4d0d:b0:40c:2c48:bdc0 with SMTP id u13-20020a05600c4d0d00b0040c2c48bdc0mr5420043wmp.136.1702585375128; Thu, 14 Dec 2023 12:22:55 -0800 (PST) Received: from localhost ([2001:8a0:f923:4f00:2646:535c:5a04:e380]) by smtp.gmail.com with UTF8SMTPSA id fm21-20020a05600c0c1500b0040c03c3289bsm26192616wmb.37.2023.12.14.12.22.54 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 14 Dec 2023 12:22:54 -0800 (PST) From: Pedro Alves To: gdb-patches@sourceware.org Subject: [PATCH 5/8] Fix thread target ID of exited waves Date: Thu, 14 Dec 2023 20:22:35 +0000 Message-ID: <20231214202238.1065676-6-pedro@palves.net> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231214202238.1065676-1-pedro@palves.net> References: <20231214202238.1065676-1-pedro@palves.net> MIME-Version: 1.0 X-Spam-Status: No, score=-10.2 required=5.0 tests=BAYES_00, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org Currently, if you step over kernel exit, you see: stepi [AMDGPU Wave ?:?:?:1 (?,?,?)/? exited] Command aborted, thread exited. (gdb) Those '?' are because the thread/wave is already gone by the time GDB prints the "exited" notification, we can't ask dbgapi for any info about the wave anymore. This commit fixes it by caching the wave's coordinates as soon as GDB sees the wave for the first time, and making amd_dbgapi_target::pid_to_str use the cached info. At first I thought of clearing the wave_info object from a thread_exited observer. However, that is too soon, resulting in this: (gdb) si [AMDGPU Wave 1:4:1:1 (0,0,0)/0 exited] Command aborted, thread exited. (gdb) thread [Current thread is 6 (AMDGPU Wave ?:?:?:0 (?,?,?)/?) (exited)] We need instead to clear the wave info when the thread is ultimately deleted, so we get: (gdb) si [AMDGPU Wave 1:4:1:1 (0,0,0)/0 exited] Command aborted, thread exited. (gdb) thread [Current thread is 6 (AMDGPU Wave 1:4:1:1 (0,0,0)/0) (exited)] And for that, we need a new thread_deleted observable. Change-Id: I6c3e22541f051e1205f75eb657b04dc15e547580 --- gdb/amd-dbgapi-target.c | 168 +++++++++++++++++++++++++++++++--------- gdb/observable.c | 1 + gdb/observable.h | 5 ++ gdb/thread.c | 2 + 4 files changed, 138 insertions(+), 38 deletions(-) diff --git a/gdb/amd-dbgapi-target.c b/gdb/amd-dbgapi-target.c index 18c0543c40e..86102b7fb03 100644 --- a/gdb/amd-dbgapi-target.c +++ b/gdb/amd-dbgapi-target.c @@ -109,6 +109,28 @@ get_amd_dbgapi_target_inferior_created_observer_token () return amd_dbgapi_target_inferior_created_observer_token; } +/* A type holding coordinate, etc. info for a given wave. We cache + this because we need this information after a wave exits. */ + +struct wave_info +{ + /* The wave. Set by the ctor. */ + amd_dbgapi_wave_id_t wave_id; + + /* All these fields are initialized here to a value that is printed + as "?". */ + amd_dbgapi_dispatch_id_t dispatch_id {}; + amd_dbgapi_queue_id_t queue_id {}; + amd_dbgapi_agent_id_t agent_id {}; + uint32_t group_ids[3] {UINT32_MAX, UINT32_MAX, UINT32_MAX}; + uint32_t wave_in_group = UINT32_MAX; + + explicit wave_info (amd_dbgapi_wave_id_t wave_id); + + /* Return the target ID string for the wave this wave_info is + for. */ + std::string to_string () const; +}; /* Big enough to hold the size of the largest register in bytes. */ #define AMDGPU_MAX_REGISTER_SIZE 256 @@ -160,6 +182,16 @@ struct amd_dbgapi_inferior_info /* List of pending events the amd-dbgapi target retrieved from the dbgapi. */ std::list> wave_events; + + /* Map of wave ID to wave_info. We cache wave_info objects because + we need to access the info after the wave is gone, in the thread + exit nofication. E.g.: + [AMDGPU Wave 1:4:1:1 (0,0,0)/0 exited] + + wave_info objects are added when we first see the wave, and + removed from a thread_deleted observer. */ + std::unordered_map + wave_info_map; }; static amd_dbgapi_event_id_t process_event_queue @@ -256,56 +288,70 @@ static const registry::key static async_event_handler *amd_dbgapi_async_event_handler = nullptr; -/* Return the target id string for a given wave. */ - -static std::string -wave_target_id_string (amd_dbgapi_wave_id_t wave_id) +std::string +wave_info::to_string () const { - amd_dbgapi_dispatch_id_t dispatch_id; - amd_dbgapi_queue_id_t queue_id; - amd_dbgapi_agent_id_t agent_id; - uint32_t group_ids[3], wave_in_group; std::string str = "AMDGPU Wave"; - amd_dbgapi_status_t status - = amd_dbgapi_wave_get_info (wave_id, AMD_DBGAPI_WAVE_INFO_AGENT, - sizeof (agent_id), &agent_id); - str += (status == AMD_DBGAPI_STATUS_SUCCESS + str += (agent_id.handle != 0 ? string_printf (" %ld", agent_id.handle) : " ?"); - status = amd_dbgapi_wave_get_info (wave_id, AMD_DBGAPI_WAVE_INFO_QUEUE, - sizeof (queue_id), &queue_id); - str += (status == AMD_DBGAPI_STATUS_SUCCESS + str += (queue_id.handle != 0 ? string_printf (":%ld", queue_id.handle) : ":?"); - status = amd_dbgapi_wave_get_info (wave_id, AMD_DBGAPI_WAVE_INFO_DISPATCH, - sizeof (dispatch_id), &dispatch_id); - str += (status == AMD_DBGAPI_STATUS_SUCCESS + str += (dispatch_id.handle != 0 ? string_printf (":%ld", dispatch_id.handle) : ":?"); str += string_printf (":%ld", wave_id.handle); - status = amd_dbgapi_wave_get_info (wave_id, - AMD_DBGAPI_WAVE_INFO_WORKGROUP_COORD, - sizeof (group_ids), &group_ids); - str += (status == AMD_DBGAPI_STATUS_SUCCESS + str += (group_ids[0] != UINT32_MAX ? string_printf (" (%d,%d,%d)", group_ids[0], group_ids[1], group_ids[2]) : " (?,?,?)"); - status = amd_dbgapi_wave_get_info - (wave_id, AMD_DBGAPI_WAVE_INFO_WAVE_NUMBER_IN_WORKGROUP, - sizeof (wave_in_group), &wave_in_group); - str += (status == AMD_DBGAPI_STATUS_SUCCESS + str += (wave_in_group != UINT32_MAX ? string_printf ("/%d", wave_in_group) : "/?"); return str; } +wave_info::wave_info (amd_dbgapi_wave_id_t wave_id) + : wave_id (wave_id) +{ +} + +/* Read in wave_info for WAVE_ID. */ + +static wave_info +get_wave_info (amd_dbgapi_wave_id_t wave_id) +{ + wave_info res (wave_id); + + /* Any field that fails to be read is left with its in-class + initialized value, which is printed as "?". */ + + amd_dbgapi_wave_get_info (wave_id, AMD_DBGAPI_WAVE_INFO_AGENT, + sizeof (res.agent_id), &res.agent_id); + amd_dbgapi_wave_get_info (wave_id, AMD_DBGAPI_WAVE_INFO_QUEUE, + sizeof (res.queue_id), &res.queue_id); + amd_dbgapi_wave_get_info (wave_id, AMD_DBGAPI_WAVE_INFO_DISPATCH, + sizeof (res.dispatch_id), &res.dispatch_id); + + amd_dbgapi_wave_get_info (wave_id, + AMD_DBGAPI_WAVE_INFO_WORKGROUP_COORD, + sizeof (res.group_ids), &res.group_ids); + + amd_dbgapi_wave_get_info (wave_id, + AMD_DBGAPI_WAVE_INFO_WAVE_NUMBER_IN_WORKGROUP, + sizeof (res.wave_in_group), &res.wave_in_group); + + return res; +} + /* Clear our async event handler. */ static void @@ -510,7 +556,21 @@ amd_dbgapi_target::pid_to_str (ptid_t ptid) if (!ptid_is_gpu (ptid)) return beneath ()->pid_to_str (ptid); - return wave_target_id_string (get_amd_dbgapi_wave_id (ptid)); + process_stratum_target *proc_target = current_inferior ()->process_target (); + inferior *inf = find_inferior_pid (proc_target, ptid.pid ()); + gdb_assert (inf != nullptr); + amd_dbgapi_inferior_info *info = get_amd_dbgapi_inferior_info (inf); + + auto wave_id = get_amd_dbgapi_wave_id (ptid); + + auto it = info->wave_info_map.find (wave_id.handle); + if (it != info->wave_info_map.end ()) + return it->second.to_string (); + + /* A wave we don't know about. Shouldn't usually happen, but + asserting and bringing down the session is a bit too harsh. Just + print all unknown info as "?"s. */ + return wave_info (wave_id).to_string (); } const char * @@ -929,6 +989,46 @@ make_gpu_ptid (ptid_t::pid_type pid, amd_dbgapi_wave_id_t wave_id) return ptid_t (pid, 1, wave_id.handle); } +/* When a thread is deleted, remove its wave_info from the inferior's + wave_info map. */ + +static void +amd_dbgapi_thread_deleted (thread_info *tp) +{ + if (tp->inf->target_at (arch_stratum) == &the_amd_dbgapi_target + && ptid_is_gpu (tp->ptid)) + { + amd_dbgapi_inferior_info *info = amd_dbgapi_inferior_data.get (tp->inf); + auto wave_id = get_amd_dbgapi_wave_id (tp->ptid); + auto it = info->wave_info_map.find (wave_id.handle); + gdb_assert (it != info->wave_info_map.end ()); + info->wave_info_map.erase (it); + } +} + +/* Register WAVE_PTID as a new thread in INF's thread list, and record + its wave_info in the inferior's wave_info map. */ + +static thread_info * +add_gpu_thread (inferior *inf, ptid_t wave_ptid) +{ + process_stratum_target *proc_target = inf->process_target (); + amd_dbgapi_inferior_info *info = get_amd_dbgapi_inferior_info (inf); + + auto wave_id = get_amd_dbgapi_wave_id (wave_ptid); + + if (!info->wave_info_map.try_emplace (wave_id.handle, + get_wave_info (wave_id)).second) + internal_error ("wave ID %ld already in map", wave_id.handle); + + /* Create new GPU threads silently to avoid spamming the terminal + with thousands of "[New Thread ...]" messages. */ + thread_info *thread = add_thread_silent (proc_target, wave_ptid); + set_running (proc_target, wave_ptid, true); + set_executing (proc_target, wave_ptid, true); + return thread; +} + /* Process an event that was just pulled out of the amd-dbgapi library. */ static void @@ -1015,13 +1115,7 @@ process_one_event (amd_dbgapi_event_id_t event_id, thread_info *thread = proc_target->find_thread (event_ptid); if (thread == nullptr) - { - /* Silently create new GPU threads to avoid spamming the - terminal with thousands of "[New Thread ...]" messages. */ - thread = add_thread_silent (proc_target, event_ptid); - set_running (proc_target, event_ptid, true); - set_executing (proc_target, event_ptid, true); - } + thread = add_gpu_thread (inf, event_ptid); /* If the wave is stopped because of a software breakpoint, the program counter needs to be adjusted so that it points to the @@ -1686,10 +1780,7 @@ amd_dbgapi_target::update_thread_list () { ptid_t wave_ptid = make_gpu_ptid (inf->pid, amd_dbgapi_wave_id_t {tid}); - - add_thread_silent (inf->process_target (), wave_ptid); - set_running (inf->process_target (), wave_ptid, true); - set_executing (inf->process_target (), wave_ptid, true); + add_gpu_thread (inf, wave_ptid); } } @@ -2115,6 +2206,7 @@ _initialize_amd_dbgapi_target () gdb::observers::inferior_forked.attach (amd_dbgapi_inferior_forked, "amd-dbgapi"); gdb::observers::inferior_exit.attach (amd_dbgapi_inferior_exited, "amd-dbgapi"); gdb::observers::inferior_pre_detach.attach (amd_dbgapi_inferior_pre_detach, "amd-dbgapi"); + gdb::observers::thread_deleted.attach (amd_dbgapi_thread_deleted, "amd-dbgapi"); add_basic_prefix_cmd ("amdgpu", no_class, _("Generic command for setting amdgpu flags."), diff --git a/gdb/observable.c b/gdb/observable.c index f2e65b11604..29675f3abf3 100644 --- a/gdb/observable.c +++ b/gdb/observable.c @@ -46,6 +46,7 @@ DEFINE_OBSERVABLE (all_objfiles_removed); DEFINE_OBSERVABLE (free_objfile); DEFINE_OBSERVABLE (new_thread); DEFINE_OBSERVABLE (thread_exit); +DEFINE_OBSERVABLE (thread_deleted); DEFINE_OBSERVABLE (thread_stop_requested); DEFINE_OBSERVABLE (target_resumed); DEFINE_OBSERVABLE (about_to_proceed); diff --git a/gdb/observable.h b/gdb/observable.h index 32ef65435cc..91a2c871524 100644 --- a/gdb/observable.h +++ b/gdb/observable.h @@ -126,6 +126,11 @@ extern observable /* exit_code */, bool /* silent */> thread_exit; +/* The thread specified by T has been deleted, with delete_thread. + This is called just before the thread_info object is destroyed with + operator delete. */ +extern observable thread_deleted; + /* An explicit stop request was issued to PTID. If PTID equals minus_one_ptid, the request applied to all threads. If ptid_is_pid(PTID) returns true, the request applied to all diff --git a/gdb/thread.c b/gdb/thread.c index 85bdbaa6cd8..bd3fe85f3b9 100644 --- a/gdb/thread.c +++ b/gdb/thread.c @@ -527,6 +527,8 @@ delete_thread_1 (thread_info *thr, std::optional exit_code, auto it = thr->inf->thread_list.iterator_to (*thr); thr->inf->thread_list.erase (it); + gdb::observers::thread_deleted.notify (thr); + delete thr; } From patchwork Thu Dec 14 20:22:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pedro Alves X-Patchwork-Id: 82172 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1016C386180F for ; Thu, 14 Dec 2023 20:23:35 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from mail-wm1-f48.google.com (mail-wm1-f48.google.com [209.85.128.48]) by sourceware.org (Postfix) with ESMTPS id 19AEC384DEC4 for ; Thu, 14 Dec 2023 20:23:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 19AEC384DEC4 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=palves.net Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 19AEC384DEC4 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=209.85.128.48 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702585383; cv=none; b=iNNVrSyMJmM+vhw053bfjoACIQE24cGvDrHKk9mb9sBDY5P8F7gS//LCBLrh/So5JAvuVlPBl90B8N/H5/21n+xubUiORLvwXmxxBYW3cHXcmCWgyj0GOUiIWzJMWVc5TEJpfmTqKD5Lsp432lOuKk/JNbN2v+8w2bjNJF32+lg= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702585383; c=relaxed/simple; bh=qKISIx1a8a+8dUGHqq3fxFw0VJt9HXKCm+v83cZ5lOU=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=L/OoKquXCaQ2v/MGfq4C4/x1iG9G7WNP7AQ/LDYQAxPM+qJi9v7gNkQPXkVV116jNY2BYenItPxm/lpLtzkWnMtJrs/WrAMYeFrpoR7COuCEC1EG9BUWvhm0v8QZTjANRsTO9YjgpUaYb0twTeqxKbOSTstcU93pGCFiYF7lxfw= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-wm1-f48.google.com with SMTP id 5b1f17b1804b1-40c2bb872e2so80495735e9.3 for ; Thu, 14 Dec 2023 12:23:00 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702585378; x=1703190178; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Fz+z9x2FdMYeVaqbtNmI7MCm4nXZuEfPDTPjtkI/Uio=; b=SvUpq1oAJMkk2uJscodPetJl3f2xU+mL8EPHXOkoYH1q6aRd/zfuGzYvq29AiJrst9 E1DxPWCw39QBjBsyRE3tCo0rZn1ay3x+B1q2kaqwiwwx7EApqYeNzIGPySGM0c1hI11c tNJYN1/bvzPaBkYrr0YWs8gmsucjBPApRJEKSyk8PGU5MLEtd8q1aE1iVVQe8dJ7JO/m G0Z7h4HyQ7+U2em/7LmCAzoF1iGpFsZNEUbMvVtkxc5lYtUFxp/CLl3jsOIWsctYUO/c 1SZS6TCz16evus0rTJqAVqLWxROk+qGQ9wpaujzl2HEb8fHx+e4Lfl9bNWpouv+6dPyX NELA== X-Gm-Message-State: AOJu0YzCLX7W73+X8QNKFCNTlH/Vmc+7ypQ/468J3ObTtHDuwmMwdXDE AsS26tSJQaF5FxRb3oi7+uQXcKYf96pSPg== X-Google-Smtp-Source: AGHT+IGANuiVhTEGBYRnZpI9YUTK5OCilp0LKT1wjrF+oyZC0fdGsaQCDRoBcWkW7+ToaCqifG0dJw== X-Received: by 2002:a7b:ce8f:0:b0:40c:2205:e5f6 with SMTP id q15-20020a7bce8f000000b0040c2205e5f6mr2764144wmj.293.1702585378098; Thu, 14 Dec 2023 12:22:58 -0800 (PST) Received: from localhost ([2001:8a0:f923:4f00:2646:535c:5a04:e380]) by smtp.gmail.com with UTF8SMTPSA id j29-20020a5d6e5d000000b003364e437577sm194796wrz.84.2023.12.14.12.22.57 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 14 Dec 2023 12:22:57 -0800 (PST) From: Pedro Alves To: gdb-patches@sourceware.org Subject: [PATCH 6/8] Fix handling of vanishing threads that were stepping/stopping Date: Thu, 14 Dec 2023 20:22:36 +0000 Message-ID: <20231214202238.1065676-7-pedro@palves.net> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231214202238.1065676-1-pedro@palves.net> References: <20231214202238.1065676-1-pedro@palves.net> MIME-Version: 1.0 X-Spam-Status: No, score=-10.2 required=5.0 tests=BAYES_00, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org Downstream, AMD is carrying a testcase (gdb.rocm/continue-over-kernel-exit.exp) that exposes a couple issues with the amd-dbgapi target's handling of exited threads. The test can't be added upstream yet, unfortunately, due to dependency on DWARF extensions that can't be upstreamed yet. However, it can be found on the mailing list on the same series as this patch. The test spawns a kernel with a number of waves. The waves do nothing but exit. There is a breakpoint on the s_endpgm instruction. Once that breakpoint is hit, the test issues a "continue" command. We should see one breakpoint hit per wave, and then the whole program exiting. We do see that, however we also see this: [New AMDGPU Wave ?:?:?:1 (?,?,?)/?] [AMDGPU Wave ?:?:?:1 (?,?,?)/? exited] *repeat for other waves* ... [Thread 0x7ffff626f640 (LWP 3048491) exited] [Thread 0x7fffeb7ff640 (LWP 3048488) exited] [Inferior 1 (process 3048475) exited normally] That "New AMDGPU Wave" output comes from infrun.c itself adding the thread to the GDB thread list, because it got an event for a thread not on the thread list yet. The output shows "?"s instead of proper coordinates, because the event was a TARGET_WAITKIND_THREAD_EXITED, i.e., the wave was already gone when infrun.c added the thread to the thread list. That shouldn't ever happen for the amd-dbgapi target, threads should only ever be added by the backend. Note "New AMDGPU Wave ?:?:?:1" is for wave 1. What happened was that wave 1 terminated previously, and a previous call to amd_dbgapi_target::update_thread_list() noticed the wave had vanished and removed it from the GDB thread list. However, because the wave was stepping when it terminated (due to the displaced step over the s_endpgm) instruction, it is guaranteed that the amd-dbgapi library queues a WAVE_COMMAND_TERMINATED event for the exit. When we process that WAVE_COMMAND_TERMINATED event, in amd-dbgapi-target.c:process_one_event, we return it to the core as a TARGET_WAITKIND_THREAD_EXITED event: static void process_one_event (amd_dbgapi_event_id_t event_id, amd_dbgapi_event_kind_t event_kind) { ... if (status == AMD_DBGAPI_STATUS_ERROR_INVALID_WAVE_ID && event_kind == AMD_DBGAPI_EVENT_KIND_WAVE_COMMAND_TERMINATED) ws.set_thread_exited (0); ... } Recall the wave is already gone from the GDB thread list. So when GDB sees that TARGET_WAITKIND_THREAD_EXITED event for a thread it doesn't know about, it adds the thread to the thread list, resulting in that: [New AMDGPU Wave ?:?:?:1 (?,?,?)/?] and then, because it was a TARGET_WAITKIND_THREAD_EXITED event, GDB marks the thread exited right afterwards: [AMDGPU Wave ?:?:?:1 (?,?,?)/? exited] The fix is to make amd_dbgapi_target::update_thread_list() _not_ delete vanishing waves iff they were stepping or in progress of being stopped. These two cases are the ones dbgapi guarantees will result in a WAVE_COMMAND_TERMINATED event if the wave terminates: /** * A command for a wave was not able to complete because the wave has * terminated. * * Commands that can result in this event are ::amd_dbgapi_wave_stop and * ::amd_dbgapi_wave_resume in single step mode. Since the wave terminated * before stopping, this event will be reported instead of * ::AMD_DBGAPI_EVENT_KIND_WAVE_STOP. * * The wave that terminated is available by the ::AMD_DBGAPI_EVENT_INFO_WAVE * query. However, the wave will be invalid since it has already terminated. * It is the client's responsibility to know what command was being performed * and was unable to complete due to the wave terminating. */ AMD_DBGAPI_EVENT_KIND_WAVE_COMMAND_TERMINATED = 2, As the comment says, it's GDB's responsability to know whether the wave was stepping or being stopped. Since we now have a wave_info map with one entry for each wave, that seems like the place to store that information. However, I still decided to put all the coordinate information in its own structure. I.e., basically renamed the existing wave_info to wave_coordinates, and then added a new wave_info structure that holds the new state, plus a wave_coordinates object. This seemer cleaner as there are places where we only need to instantiate a wave_coordinates object. There's an extra twist. The testcase also exercises stopping at a new kernel right after the first kernel fully exits. In that scenario, we were hitting this assertion after the first kernel fully exits and the hit of the breakpoint at the second kernel is handled: [amd-dbgapi] process_event_queue: Pulled event from dbgapi: event_id.handle = 26, event_kind = WAVE_STOP [amd-dbgapi-lib] suspending queue_3, queue_2, queue_1 (refresh wave list) ../../src/gdb/amd-dbgapi-target.c:1625: internal-error: amd_dbgapi_thread_deleted: Assertion `it != info->wave_info_map.end ()' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. This is the exact same problem as above, just a different manifestation. In this scenario, we end up in update_thread_list successfully deleting the exited thread (because it was no longer the current thread) that was incorrectly added by infrun.c. Because it was added by infrun.c and not by amd-dbgapi-target.c:add_gpu_thread, it doesn't have an entry in the wave_info map, so amd_dbgapi_thread_deleted trips on this assertion: gdb_assert (it != info->wave_info_map.end ()); here: ... -> stop_all_threads -> update_thread_list -> target_update_thread_list -> amd_dbgapi_target::update_thread_list -> thread_db_target::update_thread_list -> linux_nat_target::update_thread_list -> delete_exited_threads -> delete_thread -> delete_thread_1 -> gdb::observers::observable::notify -> amd_dbgapi_thread_deleted -> internal_error_loc The testcase thus tries both running to exit after the first kernel exits, and running to a breakpoint in a second kernel after the first kernel exits. Change-Id: I43a66f060c35aad1fe0d9ff022ce2afd0537f028 --- gdb/amd-dbgapi-target.c | 197 ++++++++++++++++++++++++++++++---------- 1 file changed, 149 insertions(+), 48 deletions(-) diff --git a/gdb/amd-dbgapi-target.c b/gdb/amd-dbgapi-target.c index 86102b7fb03..06f9e8c5f9c 100644 --- a/gdb/amd-dbgapi-target.c +++ b/gdb/amd-dbgapi-target.c @@ -109,10 +109,9 @@ get_amd_dbgapi_target_inferior_created_observer_token () return amd_dbgapi_target_inferior_created_observer_token; } -/* A type holding coordinate, etc. info for a given wave. We cache - this because we need this information after a wave exits. */ +/* A type holding coordinates, etc. info for a given wave. */ -struct wave_info +struct wave_coordinates { /* The wave. Set by the ctor. */ amd_dbgapi_wave_id_t wave_id; @@ -125,11 +124,44 @@ struct wave_info uint32_t group_ids[3] {UINT32_MAX, UINT32_MAX, UINT32_MAX}; uint32_t wave_in_group = UINT32_MAX; - explicit wave_info (amd_dbgapi_wave_id_t wave_id); + explicit wave_coordinates (amd_dbgapi_wave_id_t wave_id) + : wave_id (wave_id) + {} - /* Return the target ID string for the wave this wave_info is + /* Return the target ID string for the wave this wave_coordinates is for. */ std::string to_string () const; + + /* Pull out coordinates info from the amd-dbgapi library. */ + void fetch (); +}; + +/* A type holding info about a given wave. */ + +struct wave_info +{ + /* We cache the coordinates info because we need it after a wave + exits. The wave's ID is here. */ + wave_coordinates coords; + + /* The last resume_mode passed to amd_dbgapi_wave_resume for this + wave. We track this because we are guaranteed to see a + WAVE_COMMAND_TERMINATED event if a stepping wave terminates, and + we need to know to not delete such a wave until we process that + event. */ + amd_dbgapi_resume_mode_t last_resume_mode = AMD_DBGAPI_RESUME_MODE_NORMAL; + + /* Whether we've called amd_dbgapi_wave_stop for this wave and are + waiting for its stop event. Similarly, we track this because + we're guaranteed to get a WAVE_COMMAND_TERMINATED event if the + wave terminates while being stopped. */ + bool stopping = false; + + explicit wave_info (amd_dbgapi_wave_id_t wave_id) + : coords (wave_id) + { + coords.fetch (); + } }; /* Big enough to hold the size of the largest register in bytes. */ @@ -275,6 +307,19 @@ static struct amd_dbgapi_target the_amd_dbgapi_target; static const registry::key amd_dbgapi_inferior_data; +/* Fetch the amd_dbgapi_inferior_info data for the given inferior. */ + +static struct amd_dbgapi_inferior_info * +get_amd_dbgapi_inferior_info (struct inferior *inferior) +{ + amd_dbgapi_inferior_info *info = amd_dbgapi_inferior_data.get (inferior); + + if (info == nullptr) + info = amd_dbgapi_inferior_data.emplace (inferior, inferior); + + return info; +} + /* The async event handler registered with the event loop, indicating that we might have events to report to the core and that we'd like our wait method to be called. @@ -289,7 +334,7 @@ static const registry::key static async_event_handler *amd_dbgapi_async_event_handler = nullptr; std::string -wave_info::to_string () const +wave_coordinates::to_string () const { std::string str = "AMDGPU Wave"; @@ -319,37 +364,41 @@ wave_info::to_string () const return str; } -wave_info::wave_info (amd_dbgapi_wave_id_t wave_id) - : wave_id (wave_id) -{ -} - -/* Read in wave_info for WAVE_ID. */ - -static wave_info -get_wave_info (amd_dbgapi_wave_id_t wave_id) +void +wave_coordinates::fetch () { - wave_info res (wave_id); - /* Any field that fails to be read is left with its in-class initialized value, which is printed as "?". */ amd_dbgapi_wave_get_info (wave_id, AMD_DBGAPI_WAVE_INFO_AGENT, - sizeof (res.agent_id), &res.agent_id); + sizeof (agent_id), &agent_id); amd_dbgapi_wave_get_info (wave_id, AMD_DBGAPI_WAVE_INFO_QUEUE, - sizeof (res.queue_id), &res.queue_id); + sizeof (queue_id), &queue_id); amd_dbgapi_wave_get_info (wave_id, AMD_DBGAPI_WAVE_INFO_DISPATCH, - sizeof (res.dispatch_id), &res.dispatch_id); + sizeof (dispatch_id), &dispatch_id); amd_dbgapi_wave_get_info (wave_id, AMD_DBGAPI_WAVE_INFO_WORKGROUP_COORD, - sizeof (res.group_ids), &res.group_ids); + sizeof (group_ids), &group_ids); amd_dbgapi_wave_get_info (wave_id, AMD_DBGAPI_WAVE_INFO_WAVE_NUMBER_IN_WORKGROUP, - sizeof (res.wave_in_group), &res.wave_in_group); + sizeof (wave_in_group), &wave_in_group); +} + +/* Get the wave_info object for TP, from the wave_info map. It is + assumed that the wave is in the map. */ + +static wave_info & +get_thread_wave_info (thread_info *tp) +{ + amd_dbgapi_inferior_info *info = get_amd_dbgapi_inferior_info (tp->inf); + amd_dbgapi_wave_id_t wave_id = get_amd_dbgapi_wave_id (tp->ptid); + + auto it = info->wave_info_map.find (wave_id.handle); + gdb_assert (it != info->wave_info_map.end ()); - return res; + return it->second; } /* Clear our async event handler. */ @@ -370,19 +419,6 @@ async_event_handler_mark () mark_async_event_handler (amd_dbgapi_async_event_handler); } -/* Fetch the amd_dbgapi_inferior_info data for the given inferior. */ - -static struct amd_dbgapi_inferior_info * -get_amd_dbgapi_inferior_info (struct inferior *inferior) -{ - amd_dbgapi_inferior_info *info = amd_dbgapi_inferior_data.get (inferior); - - if (info == nullptr) - info = amd_dbgapi_inferior_data.emplace (inferior, inferior); - - return info; -} - /* Set forward progress requirement to REQUIRE for all processes of PROC_TARGET matching PTID. */ @@ -565,12 +601,12 @@ amd_dbgapi_target::pid_to_str (ptid_t ptid) auto it = info->wave_info_map.find (wave_id.handle); if (it != info->wave_info_map.end ()) - return it->second.to_string (); + return it->second.coords.to_string (); /* A wave we don't know about. Shouldn't usually happen, but asserting and bringing down the session is a bit too harsh. Just print all unknown info as "?"s. */ - return wave_info (wave_id).to_string (); + return wave_coordinates (wave_id).to_string (); } const char * @@ -694,16 +730,24 @@ amd_dbgapi_target::resume (ptid_t scope_ptid, int step, enum gdb_signal signo) amd_dbgapi_wave_id_t wave_id = get_amd_dbgapi_wave_id (thread->ptid); amd_dbgapi_status_t status; + + wave_info &wi = get_thread_wave_info (thread); + amd_dbgapi_resume_mode_t &resume_mode = wi.last_resume_mode; + amd_dbgapi_exceptions_t wave_exception; if (thread->ptid == inferior_ptid) - status = amd_dbgapi_wave_resume (wave_id, - (step - ? AMD_DBGAPI_RESUME_MODE_SINGLE_STEP - : AMD_DBGAPI_RESUME_MODE_NORMAL), - exception); + { + resume_mode = (step + ? AMD_DBGAPI_RESUME_MODE_SINGLE_STEP + : AMD_DBGAPI_RESUME_MODE_NORMAL); + wave_exception = exception; + } else - status = amd_dbgapi_wave_resume (wave_id, AMD_DBGAPI_RESUME_MODE_NORMAL, - AMD_DBGAPI_EXCEPTION_NONE); + { + resume_mode = AMD_DBGAPI_RESUME_MODE_NORMAL; + wave_exception = AMD_DBGAPI_EXCEPTION_NONE; + } + status = amd_dbgapi_wave_resume (wave_id, resume_mode, wave_exception); if (status != AMD_DBGAPI_STATUS_SUCCESS /* Ignore the error that wave is no longer valid as that could indicate that the process has exited. GDB treats resuming a @@ -711,6 +755,8 @@ amd_dbgapi_target::resume (ptid_t scope_ptid, int step, enum gdb_signal signo) && status != AMD_DBGAPI_STATUS_ERROR_INVALID_WAVE_ID) error (_("wave_resume for wave_%ld failed (%s)"), wave_id.handle, get_status_string (status)); + + wi.stopping = false; } } @@ -725,6 +771,15 @@ amd_dbgapi_target::commit_resumed () require_forward_progress (minus_one_ptid, proc_target, true); } +/* Return a string version of RESUME_MODE, for debug log purposes. */ +static const char * +resume_mode_to_string (amd_dbgapi_resume_mode_t resume_mode) +{ + return (resume_mode == AMD_DBGAPI_RESUME_MODE_SINGLE_STEP + ? "step" + : "normal"); +} + void amd_dbgapi_target::stop (ptid_t ptid) { @@ -758,7 +813,11 @@ amd_dbgapi_target::stop (ptid_t ptid) status = amd_dbgapi_wave_stop (wave_id); if (status == AMD_DBGAPI_STATUS_SUCCESS) - return; + { + wave_info &wi = get_thread_wave_info (thread); + wi.stopping = true; + return; + } if (status != AMD_DBGAPI_STATUS_ERROR_INVALID_WAVE_ID) error (_("wave_stop for wave_%ld failed (%s)"), wave_id.handle, @@ -772,6 +831,23 @@ amd_dbgapi_target::stop (ptid_t ptid) could have terminated since the last time the wave list was refreshed. */ + wave_info &wi = get_thread_wave_info (thread); + wi.stopping = true; + + amd_dbgapi_debug_printf ("got AMD_DBGAPI_STATUS_ERROR_INVALID_WAVE_ID " + "for wave_%ld, last_resume_mode=%s, " + "report_thread_events=%d", + wave_id.handle, + resume_mode_to_string (wi.last_resume_mode), + m_report_thread_events); + + /* If the wave was stepping when it terminated, then it is + guaranteed that we will see a WAVE_COMMAND_TERMINATED event + for it. Don't report a thread exit event or delete the + thread yet, until we see such event. */ + if (wi.last_resume_mode == AMD_DBGAPI_RESUME_MODE_SINGLE_STEP) + return; + if (m_report_thread_events) { get_amd_dbgapi_inferior_info (thread->inf)->wave_events.emplace_back @@ -1018,7 +1094,7 @@ add_gpu_thread (inferior *inf, ptid_t wave_ptid) auto wave_id = get_amd_dbgapi_wave_id (wave_ptid); if (!info->wave_info_map.try_emplace (wave_id.handle, - get_wave_info (wave_id)).second) + wave_info (wave_id)).second) internal_error ("wave ID %ld already in map", wave_id.handle); /* Create new GPU threads silently to avoid spamming the terminal @@ -1770,7 +1846,32 @@ amd_dbgapi_target::update_thread_list () auto it = threads.find (tp->ptid.tid ()); if (it == threads.end ()) - delete_thread_silent (tp); + { + auto wave_id = get_amd_dbgapi_wave_id (tp->ptid); + wave_info &wi = get_thread_wave_info (tp); + + /* Waves that were stepping or in progress of being + stopped are guaranteed to report a + WAVE_COMMAND_TERMINATED event if they terminate. + Don't delete such threads until we see the + event. */ + if (wi.last_resume_mode == AMD_DBGAPI_RESUME_MODE_SINGLE_STEP + || wi.stopping) + { + amd_dbgapi_debug_printf + ("wave_%ld disappeared, keeping it" + " (last_resume_mode=%s, stopping=%d)", + wave_id.handle, + resume_mode_to_string (wi.last_resume_mode), + wi.stopping); + } + else + { + amd_dbgapi_debug_printf ("wave_%ld disappeared, deleting it", + wave_id.handle); + delete_thread_silent (tp); + } + } else threads.erase (it); } From patchwork Thu Dec 14 20:22:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pedro Alves X-Patchwork-Id: 82173 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id B3DCB3861881 for ; Thu, 14 Dec 2023 20:23:45 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from mail-wm1-f54.google.com (mail-wm1-f54.google.com [209.85.128.54]) by sourceware.org (Postfix) with ESMTPS id 7CBC738618D3 for ; Thu, 14 Dec 2023 20:23:03 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7CBC738618D3 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=palves.net Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 7CBC738618D3 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=209.85.128.54 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702585385; cv=none; b=WPQ8BQImzNUQC8F5pPHBiVyKZhbhW62rjTvTWrUiRNV5tedGkRWLDxPlINMS77ONqhQ4uFA8LRrBPv3Y3Tt4CAknUo6Dfp1GUlPU4BJWIBbxo3TVcmaZTTtALJGf+5EgaTm2HHdjhTb4x1C2RiLzukGLRzlhPIy5lkN2oOaSXkI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702585385; c=relaxed/simple; bh=5xGaqn1hOiWx3o1YRf5MFXN6z1PSSI6HRXa0NooZydg=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=bRFDfdt2bq5jRiPr5vHE7aUFCN3aQvqLE/CQjSW+nC5g97OErjyeOrYLlcK1i284G8vm2FLcrBcntHCesZS/+phoDmFic4ZTyKfwXI0mBJW6nFSHFWK07MuCq/DBgK1GtfD6iiuRGmYdZOzmqV5E6odqmB09z8om1lkAN8ifIKg= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-wm1-f54.google.com with SMTP id 5b1f17b1804b1-40c68c1990dso2702035e9.0 for ; Thu, 14 Dec 2023 12:23:03 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702585382; x=1703190182; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=8yJKjDWNL/Zsu3xBwUCwPv0Bg1OJbakz819JYuRx9BQ=; b=t3M2wjq6O5L/+xZaKMvode6MzUyw5qGVQgsyTTbzjoKWaIZ/4sYc/ex3ALTHofsOmN pK4ZG6osjqonggTJXa2CzRo9HR8Xzs9li4l0FMxprxFqvtLsT6kLJCACEIqqElxZkYUS 4DWUfTV7k+w6FOZTqcPskHsK5i7Jl7fjonudQ9bLPXIFOqrDvPpjILggGX4mlazlolAZ HLyXh0Ii7xGvVrLqQaFAUCttk1LF7qDyK9miU0vG8pPrMDLuUuM3eJetuYsfmV0kCRhz X3259cROZ6hgQzQZ/1zvHq0mJ/fgiXwVK0/+6jZbR9+iWDRbyUXb911Saa83MxXIvKhZ kZeA== X-Gm-Message-State: AOJu0Yyxb3hy5l7GoYss91dVEn5pMAFRufnQpLnA96zSkC0FgVSeDX7p 9fzX1cLoM+KIaINOl4tudoTdT25yn9yT+Q== X-Google-Smtp-Source: AGHT+IGi0tSU/2SlNPtJI0wOSY0ZQaGGm4EyXILGqjE5XXGwWLwBJvEA5t2NegxPAvnv0cJCXSxeQQ== X-Received: by 2002:a05:600c:2942:b0:40c:357e:27b with SMTP id n2-20020a05600c294200b0040c357e027bmr3450970wmd.162.1702585382061; Thu, 14 Dec 2023 12:23:02 -0800 (PST) Received: from localhost ([2001:8a0:f923:4f00:2646:535c:5a04:e380]) by smtp.gmail.com with UTF8SMTPSA id p32-20020a05600c1da000b0040b360cc65csm26356106wms.0.2023.12.14.12.23.01 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 14 Dec 2023 12:23:01 -0800 (PST) From: Pedro Alves To: gdb-patches@sourceware.org Cc: Laurent Morichetti , Simon Marchi Subject: [PATCH 7/8] Add tests for s_endpgm handling Date: Thu, 14 Dec 2023 20:22:37 +0000 Message-ID: <20231214202238.1065676-8-pedro@palves.net> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231214202238.1065676-1-pedro@palves.net> References: <20231214202238.1065676-1-pedro@palves.net> MIME-Version: 1.0 X-Spam-Status: No, score=-10.3 required=5.0 tests=BAYES_00, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org [Not for commit. This won't work with current upstream, unfortunately.] Check that a wave can halt at an s_endpgm instruction by single-stepping or displaced stepping the instruction preceding the s_endpgm. Check that a wave can single-step s_endpgm, and that it can step over a breakpoint placed at an s_endpgm by displaced-stepping the instruction. Test all three cases: - no step-over (stepping without a breakpoint installed) - in-line step-over - displaced step-over Check the same with "set scheduler-locking on". Check that GDB always prints the exited wave's ID when aborting a command due to thread exit, and that it prints a valid ID with no "?" in it. This is named gdb.rocm/step-over-kernel-exit.cpp and not gdb.rocm/s_endpgm.cpp because we will most probably want to extend this to test s_sendmsg deallow vgprs before s_endpgm as well. Co-Authored-By: Laurent Morichetti Co-Authored-By: Simon Marchi Change-Id: I6db617ac009383698e1c66744d68e70b1d1ca90f --- .../gdb.rocm/step-over-kernel-exit.cpp | 48 ++++++++ .../gdb.rocm/step-over-kernel-exit.exp | 108 ++++++++++++++++++ 2 files changed, 156 insertions(+) create mode 100644 gdb/testsuite/gdb.rocm/step-over-kernel-exit.cpp create mode 100644 gdb/testsuite/gdb.rocm/step-over-kernel-exit.exp diff --git a/gdb/testsuite/gdb.rocm/step-over-kernel-exit.cpp b/gdb/testsuite/gdb.rocm/step-over-kernel-exit.cpp new file mode 100644 index 00000000000..61f1b431df1 --- /dev/null +++ b/gdb/testsuite/gdb.rocm/step-over-kernel-exit.cpp @@ -0,0 +1,48 @@ +/* Copyright (C) 2023 Free Software Foundation, Inc. + Copyright (C) 2023 Advanced Micro Devices, Inc. All rights reserved. + + This file is part of GDB. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . +*/ + +#include +#include + +#define CHECK(cmd) \ + do \ + { \ + hipError_t error = cmd; \ + if (error != hipSuccess) \ + { \ + fprintf (stderr, "error: '%s'(%d) at %s:%d\n", \ + hipGetErrorString (error), error, \ + __FILE__, __LINE__); \ + exit (EXIT_FAILURE); \ + } \ + } while (0) + +__global__ void +kernel () +{ + asm ("before_s_endpgm_insn: s_nop 0"); + asm ("s_endpgm_insn: s_endpgm"); /* set breakpoint here */ +} + +int +main (int argc, char **argv) +{ + kernel<<<1, 1>>> (); + CHECK (hipDeviceSynchronize ()); +} diff --git a/gdb/testsuite/gdb.rocm/step-over-kernel-exit.exp b/gdb/testsuite/gdb.rocm/step-over-kernel-exit.exp new file mode 100644 index 00000000000..484298ffa3e --- /dev/null +++ b/gdb/testsuite/gdb.rocm/step-over-kernel-exit.exp @@ -0,0 +1,108 @@ +# Copyright (C) 2023 Free Software Foundation, Inc. +# Copyright (C) 2023 Advanced Micro Devices, Inc. All rights reserved. + +# This file is part of GDB. + +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. + +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. + +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +# Test stopping at and single-stepping and displaced-stepping an +# s_endpgm instruction. + +load_lib rocm.exp + +standard_testfile .cpp + +require allow_hipcc_tests + +if { [build_executable "failed to prepare" \ + $testfile $srcfile {debug hip}] == -1 } { + return -1 +} + +# Test stepping over an s_endpgm instruction. +# +# STEP_OVER_MODE can be one of: +# +# - none: don't put a breakpoint on the s_endpgm instruction. + +# - inline: put a breakpoint on the s_endpgm instruction, and use +# in-line stepping to step over it (disable displaced-stepping). + +# - displaced: same, but use displaced stepping. +# +# SCHEDLOCK can be "on" or "off". + +proc do_test { step_over_mode schedlock } { + with_rocm_gpu_lock { + clean_restart $::binfile + + if { $step_over_mode == "none" } { + # Nothing to do. + } elseif { $step_over_mode == "inline" } { + gdb_test_no_output "set displaced-stepping off" + } elseif { $step_over_mode == "displaced" } { + gdb_test_no_output "set displaced-stepping on" + } else { + error "Invalid step_over_mode value: $step_over_mode" + } + + if ![runto_main] { + fail "can't run to main" + return -1 + } + + # Put a breakpoint on the instruction before s_endpgm, + # continue to it. + gdb_breakpoint "before_s_endpgm_insn" allow-pending + gdb_continue_to_breakpoint "before_s_endpgm_insn" + + gdb_test_no_output "set scheduler-locking $schedlock" + + gdb_test "stepi" \ + "\"s_endpgm_insn: s_endpgm\".*" \ + "single-step instruction before s_endpgm" + + gdb_test "x/i \$pc" \ + "$::hex <\[^\r\n\]*>:\[ \t\]+s_endpgm.*" \ + "stopped at s_endpgm" + + # If testing a step-over is requested, place a breakpoint at + # the current instruction to force a step-over. + if { $step_over_mode != "none" } { + gdb_test "break s_endpgm_insn" "Breakpoint $::decimal at $::hex.*" + } + + set d $::decimal + set wave_target_id_re "AMDGPU Wave $d:$d:$d:1 \\(0,0,0\\)/0" + + set selected_thread_before \ + [get_integer_valueof "\$_thread" 0 "get selected thread before"] + + gdb_test "stepi" \ + "\r\n\[$wave_target_id_re exited\]\r\nCommand aborted, thread exited\\." \ + "single-step s_endpgm" + + # Check that the selected thread didn't change, and that GDB + # manages to print the exited wave's target ID properly. + gdb_test "thread" \ + "\r\n\[Current thread is $selected_thread_before \\($wave_target_id_re\\) \\(exited\\)\]" \ + "exited wave target id" + } +} + +foreach_with_prefix step_over_mode {none inline displaced} { + foreach_with_prefix schedlock {off on} { + do_test $step_over_mode $schedlock + } +} From patchwork Thu Dec 14 20:22:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pedro Alves X-Patchwork-Id: 82174 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 8CD0A3861880 for ; Thu, 14 Dec 2023 20:23:57 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from mail-wm1-f51.google.com (mail-wm1-f51.google.com [209.85.128.51]) by sourceware.org (Postfix) with ESMTPS id CD7FC38618AB for ; Thu, 14 Dec 2023 20:23:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org CD7FC38618AB Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=palves.net Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org CD7FC38618AB Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=209.85.128.51 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702585387; cv=none; b=gaCtF/YLM1DYTfqc3gNwMICga3Wi1HXnjgxugWc6zVWtllXwkBoPSgIT5Iffa92+ymy7ZAilESYHeCm+2+8foSCbdVGafyxy6SdzpJy0f1eYDDTYHYqCNdL4Jks7RdWVzZz8z6AX1o9XhdN4G/i3QdZaH4F+SpLUL2Nv1mA9oF0= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702585387; c=relaxed/simple; bh=LZnyR8ApVvwsa2D2h+0tT8iukG/p2mF9a8o3o3zdCm4=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=qb3YUErkELI+FxIRHF3/dJHTwUrCzxix+JxPsIH5QmbY7X70ti3ri6ZwhSbcV86eVUQzt05EWrw6QVS61q/hq2uq/vp+VH6wl46VYPU6Y5P4BFvjIm3GXwfRirawxDcIoDb9LQVGFma8OfYJPrEARSEqjgs6D08rWghnJ049k3E= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-wm1-f51.google.com with SMTP id 5b1f17b1804b1-40c2db2ee28so91858385e9.2 for ; Thu, 14 Dec 2023 12:23:05 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702585384; x=1703190184; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=eROZZW26mn+2YuM6OcsIO1ppUF8zBUFQTgYW7snKLnw=; b=GdJgsW7gP2f3Wbq3Na+fgMQEvpSpibPRl5CeMTL3iL04Ed0PWF+m7GQi2jaFSMD6wR WwU5Plek3IC48UpDKSkMAkYRRk7jKw4mMRfX943GAqO+0hGbq3osnvMY9AqFcupo0V9d sZpz/q8il8vHQHjx/5gKdY9hZ+VYigIuDV0oAbpffxSNRDzf+hqAlf+tXtS8pblY0o3r sofZi/kwoNNbi2szETbsTZu3FfL48xqUWlDdmxr4gS1W3hB6ehOXSkhhnHfElWewGQgy VFewzDBxey+HluIAzb7F+VAMsJVCabJP9G9fwsNNL0f54gQG/WIs2XLXIauR36yA/+sK fdyA== X-Gm-Message-State: AOJu0Yy5a5Jfes8X+/FdzxxypWbmjjBkvZuOWo4jDbJfhO23G0Yh6h+y zlx1mXNPjXy31CRJv5900lEC2eOrbgq5Tw== X-Google-Smtp-Source: AGHT+IFFKXNfBGxSJvCPodOVvn/pinM3ezUNwZYoDfT72k6TwIIVel9PXkc+hIhXdGyOrnM080KbsA== X-Received: by 2002:a05:600c:21c7:b0:40c:2699:b625 with SMTP id x7-20020a05600c21c700b0040c2699b625mr6076150wmj.138.1702585384210; Thu, 14 Dec 2023 12:23:04 -0800 (PST) Received: from localhost ([2001:8a0:f923:4f00:2646:535c:5a04:e380]) by smtp.gmail.com with UTF8SMTPSA id gw18-20020a05600c851200b004063c9f68f2sm25507262wmb.26.2023.12.14.12.23.03 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 14 Dec 2023 12:23:03 -0800 (PST) From: Pedro Alves To: gdb-patches@sourceware.org Subject: [PATCH 8/8] Add tests for handling of vanishing threads that were stepping/stopping Date: Thu, 14 Dec 2023 20:22:38 +0000 Message-ID: <20231214202238.1065676-9-pedro@palves.net> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231214202238.1065676-1-pedro@palves.net> References: <20231214202238.1065676-1-pedro@palves.net> MIME-Version: 1.0 X-Spam-Status: No, score=-10.3 required=5.0 tests=BAYES_00, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces+patchwork=sourceware.org@sourceware.org Not for commit. This won't work with current upstream, unfortunately. Change-Id: I43a66f060c35aad1fe0d9ff022ce2afd0537f028 --- .../gdb.rocm/continue-over-kernel-exit.cpp | 66 +++++++ .../gdb.rocm/continue-over-kernel-exit.exp | 165 ++++++++++++++++++ 2 files changed, 231 insertions(+) create mode 100644 gdb/testsuite/gdb.rocm/continue-over-kernel-exit.cpp create mode 100644 gdb/testsuite/gdb.rocm/continue-over-kernel-exit.exp diff --git a/gdb/testsuite/gdb.rocm/continue-over-kernel-exit.cpp b/gdb/testsuite/gdb.rocm/continue-over-kernel-exit.cpp new file mode 100644 index 00000000000..bad7064f30b --- /dev/null +++ b/gdb/testsuite/gdb.rocm/continue-over-kernel-exit.cpp @@ -0,0 +1,66 @@ +/* Copyright (C) 2023 Free Software Foundation, Inc. + Copyright (C) 2023 Advanced Micro Devices, Inc. All rights reserved. + + This file is part of GDB. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . +*/ + +#include +#include + +#define CHECK(cmd) \ + do \ + { \ + hipError_t error = cmd; \ + if (error != hipSuccess) \ + { \ + fprintf (stderr, "error: '%s'(%d) at %s:%d\n", \ + hipGetErrorString (error), error, \ + __FILE__, __LINE__); \ + exit (EXIT_FAILURE); \ + } \ + } while (0) + +__global__ void +kern () +{ + asm ("s_endpgm_insn: s_endpgm"); +} + +__global__ void +second_kernel () +{ +} + +int +main () +{ + /* Use 1-thread blocks to easily control number of waves. */ + size_t blocksize = 1; + size_t gridsize = 10; + + kern<<>> (); + + /* Stopping at this second kernel after the first kernel completely + finishes makes GDB refresh its thread list while the + amd-dbgapi-target is still active, which triggers different code + paths in GDB that lead to deleting exited threads. We test both + stopping here, and not stopping here. */ + second_kernel<<<1, 1>>> (); + + CHECK (hipDeviceSynchronize ()); + + return 0; +} diff --git a/gdb/testsuite/gdb.rocm/continue-over-kernel-exit.exp b/gdb/testsuite/gdb.rocm/continue-over-kernel-exit.exp new file mode 100644 index 00000000000..ed98ab4697b --- /dev/null +++ b/gdb/testsuite/gdb.rocm/continue-over-kernel-exit.exp @@ -0,0 +1,165 @@ +# Copyright (C) 2023 Free Software Foundation, Inc. +# Copyright (C) 2023 Advanced Micro Devices, Inc. All rights reserved. + +# This file is part of GDB. + +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. + +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. + +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +# Test stopping at and continuing from a s_endpgm instruction, with +# and without stepping over a breakpoint on top of it. + +load_lib rocm.exp + +standard_testfile .cpp + +require allow_hipcc_tests + +if { [build_executable "failed to prepare" \ + $testfile $srcfile {debug hip}] == -1 } { + return -1 +} + +# Test continuing from an s_endpgm instruction with multiple waves. +# +# STEP_OVER_MODE can be one of: +# +# - none: don't put a breakpoint on the s_endpgm instruction. +# +# - inline: put a breakpoint on the s_endpgm instruction, and use +# in-line stepping to step over it (disable displaced-stepping). +# +# - displaced: same, but use displaced stepping. +# +# FINISH_HOW can be one of: +# +# - second_kernel: stop at a breakpoint at "second_kernel" after the +# first kernel finishes. +# +# - normal_exit: let the inferior exit normally after the first +# kernel finishes, without stopping at second_kernel. +# +proc do_test { step_over_mode finish_how } { + with_rocm_gpu_lock { + clean_restart $::binfile + + if { $step_over_mode == "none" } { + # Nothing to do. + } elseif { $step_over_mode == "inline" } { + gdb_test_no_output "set displaced-stepping off" + } elseif { $step_over_mode == "displaced" } { + gdb_test_no_output "set displaced-stepping on" + } else { + error "Invalid step_over_mode value: $step_over_mode" + } + + if ![runto_main] { + fail "can't run to main" + return -1 + } + + # Put a breakpoint on the s_endpgm instruction, and continue + # to it. If testing a step-over is requested, leave the + # breakpoint inserted after the initial hit to force a + # step-over. + if { $step_over_mode != "none" } { + gdb_breakpoint "s_endpgm_insn" allow-pending + set bp_hits 10 + } else { + gdb_breakpoint "s_endpgm_insn" allow-pending temporary + set bp_hits 0 + } + gdb_continue_to_breakpoint "s_endpgm_insn" + + gdb_test "x/i \$pc" \ + "$::hex <\[^\r\n\]*>:\[ \t\]+s_endpgm.*" \ + "stopped at s_endpgm" + + if {$finish_how == "second_kernel"} { + gdb_breakpoint "second_kernel" + } + + for {set i 1} {$i < $bp_hits} {incr i} { + with_test_prefix "iter $i" { + gdb_test_multiple "continue" "continue to s_endpgm" { + -re -wrap "Continuing\\.\r\n.*hit Breakpoint $::decimal, .* kern .*\"s_endpgm_insn: .*" { + pass $gdb_test_name + } + } + } + } + + # GDB used to mishandle wave exits resulting in + # WAVE_COMMAND_TERMINATED events being left in the + # amd-dbgapi-target's event queue _after_ the wave had already + # been removed from the GDB thread list. That in turn would + # result in seeing already-dead waves re-added to GDB's thread + # list, and then immediately deleted, with these user + # notifications: + # + # [New AMDGPU Wave ?:?:?:1 (?,?,?)/?] + # [AMDGPU Wave ?:?:?:1 (?,?,?)/? exited] + # + # The test below fails if we see any "?" in the wave's target + # id. It is written this way instead of a tighter match to + # increase the chances of the problem being caught if the GDB + # output ever changes. + # + # The bad thread additions mentioned above were done without + # adding the amd-dbgapi-target-specific wave info to the + # amd-dbgapi-target data structures, resulting in GDB + # assertion failures in the amd-dbgapi-target if GDB handled + # any other stop event after the first kernel finishes. We + # exercise that with FINISH_HOW=second_kernel. + + set bad_coords 0 + set exited_normally 0 + set second_kernel_breakpoint 0 + gdb_test_multiple "continue" "last continue" -lbl { + -re "AMDGPU Wave (\[^\r\n\]*)(?=\r\n)" { + set wave_coords $expect_out(1,string) + if {[string first "?" $wave_coords] != -1} { + incr bad_coords + } + exp_continue + } + -re "Inferior 1 \\(process $::decimal\\) exited normally" { + incr exited_normally + exp_continue + } + -re "hit Breakpoint $::decimal, with lane 0, second_kernel " { + incr second_kernel_breakpoint + exp_continue + } + -re -wrap "" { + if {$bad_coords > 0} { + fail "$gdb_test_name (bad coords)" + } elseif {[expr ($second_kernel_breakpoint + $exited_normally) > 1]} { + fail "$gdb_test_name (bad finish)" + } elseif {$finish_how == "second_kernel" && $second_kernel_breakpoint != 1} { + fail "$gdb_test_name (no second_kernel breakpoint)" + } elseif {$finish_how == "normal_exit" && $exited_normally != 1} { + fail "$gdb_test_name (not normal exit)" + } else { + pass $gdb_test_name + } + } + } + } +} + +foreach_with_prefix step_over_mode {none inline displaced} { + foreach_with_prefix finish_how {second_kernel normal_exit} { + do_test $step_over_mode $finish_how + } +}