[gdb/testsuite] Fix failure in gdb.threads/signal-sigtrap.exp

Message ID 20240829075205.26857-1-tdevries@suse.de
State Superseded
Series [gdb/testsuite] Fix failure in gdb.threads/signal-sigtrap.exp

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Test passed

Commit Message

Tom de Vries Aug. 29, 2024, 7:52 a.m. UTC
The test-case gdb.threads/signal-sigtrap.exp:
- installs a signal handler called sigtrap_handler for SIGTRAP,
- sets a breakpoint on sigtrap_handler, and
- expects the breakpoint to trigger after issuing "signal SIGTRAP".

Usually, that is indeed what happens:
...
(gdb) signal SIGTRAP^M
Continuing with signal SIGTRAP.^M
^M
Thread 1 "signal-sigtrap" hit Breakpoint 2, sigtrap_handler (sig=5)^M
28      }^M
(gdb) PASS: $exp: sigtrap thread 1: signal SIGTRAP reaches handler
...

Occasionally, I run into this failure on openSUSE Tumbleweed:
...
(gdb) signal SIGTRAP^M
Continuing with signal SIGTRAP.^M
^M
Thread 1 "signal-sigtrap" received signal SIGTRAP, Trace/breakpoint trap.^M
__pthread_create_2_1 () at pthread_create.c:843^M
(gdb) FAIL: $exp: sigtrap thread 1: signal SIGTRAP reaches handler
...

AFAIU, the problem lies in the situation that is set up before issuing that
command, by running to a breakpoint in thread_function:
...
void *thread_function (void *arg) {
  return NULL;
}
int main (void) {
  pthread_t child_thread;
  signal (SIGTRAP, sigtrap_handler);
  pthread_create (&child_thread, NULL, thread_function, NULL);
  pthread_join (child_thread, NULL);
  return 0;
}
...

In the passing case, thread 2 is stopped in thread_function, and thread 1 is
stopped somewhere in pthread_join:
...
(gdb) info threads^M
  Id   Target Id                                          Frame ^M
  1    Thread ... (LWP ...) "signal-sigtrap" __futex_abstimed_wait_common64 ()
* 2    Thread ... (LWP ...) "signal-sigtrap" thread_function ()
...

In the failing case, thread 2 is stopped in thread_function, but thread 1 is
stopped somewhere in pthread_create:
...
(gdb) info threads^M
  Id   Target Id                                          Frame ^M
  1    Thread ... (LWP ...) "signal-sigtrap" __GI___clone3 ()
* 2    Thread ... (LWP ...) "signal-sigtrap" thread_function ()
...

What I think happens is that pthread_create blocks SIGTRAP at some point, and
if the "signal SIGTRAP" command is issued while that is the case, the signal
becomes pending and consequently there's no longer a guarantee that the signal
will be delivered to the inferior.
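
As a rough illustration of the blocked-signal behavior described above (not part
of the patch, and not run under gdb), the sketch below raises a signal while it
is blocked and checks that it stays pending until the mask is restored.  It
assumes a POSIX system; SIGUSR1 stands in for SIGTRAP so that the example does
not interfere with a debugger, and the names count_handler and
blocked_signal_stays_pending are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Number of times the handler has run.  */
static volatile sig_atomic_t handler_runs;

static void
count_handler (int sig)
{
  (void) sig;
  handler_runs++;
}

/* Raise SIGUSR1 while it is blocked (SIGUSR1 is an assumption standing in
   for SIGTRAP).  Return 1 if the signal stayed pending without the handler
   running and the handler ran once the old mask was restored, 0 if not,
   and -1 if the setup failed.  */
int
blocked_signal_stays_pending (void)
{
  sigset_t block, old, pending;
  int was_pending, ran_while_blocked, ran_after_unblock;

  handler_runs = 0;
  if (signal (SIGUSR1, count_handler) == SIG_ERR)
    return -1;

  sigemptyset (&block);
  sigaddset (&block, SIGUSR1);
  if (sigprocmask (SIG_BLOCK, &block, &old) != 0)
    return -1;

  /* The signal is generated here, but cannot be delivered yet.  */
  raise (SIGUSR1);
  if (sigpending (&pending) != 0)
    return -1;
  was_pending = sigismember (&pending, SIGUSR1) == 1;
  ran_while_blocked = handler_runs != 0;

  /* Restoring the old mask unblocks the signal, which is then delivered
     before sigprocmask returns.  */
  if (sigprocmask (SIG_SETMASK, &old, NULL) != 0)
    return -1;
  ran_after_unblock = handler_runs == 1;

  return was_pending && !ran_while_blocked && ran_after_unblock;
}
```

The point of the sketch is only that delivery is deferred while the signal is
blocked; what gdb does with the SIGTRAP once it is finally delivered is a
separate matter.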

Instead, the signal will be handled by gdb according to its SIGTRAP settings:
...
(gdb) info signals SIGTRAP
Signal        Stop	Print	Pass to program	Description
SIGTRAP       Yes	Yes	No		Trace/breakpoint trap
...

Fix this by adding a barrier that ensures that pthread_create is done before
we issue the "signal SIGTRAP" command.
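
The ordering that the barrier provides can be sketched in isolation as follows.
This is a minimal stand-alone illustration rather than the test-case itself: the
names demo_barrier, create_returned, seen_by_child, child and
child_runs_after_create_returns are hypothetical, and it assumes a POSIX threads
implementation that supports barriers.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>

static pthread_barrier_t demo_barrier;

/* Set by the creating thread once pthread_create has returned.  */
static int create_returned;

/* Value of create_returned as observed by the child after the barrier.  */
static int seen_by_child;

static void *
child (void *arg)
{
  (void) arg;
  /* Wait until the creating thread has left pthread_create.  */
  pthread_barrier_wait (&demo_barrier);
  seen_by_child = create_returned;
  return NULL;
}

/* Return 1 if the child got past the barrier only after pthread_create
   returned in the creating thread, 0 if not, and -1 if the setup failed.  */
int
child_runs_after_create_returns (void)
{
  pthread_t thread;

  create_returned = 0;
  seen_by_child = 0;

  if (pthread_barrier_init (&demo_barrier, NULL, 2) != 0)
    return -1;

  if (pthread_create (&thread, NULL, child, NULL) != 0)
    {
      pthread_barrier_destroy (&demo_barrier);
      return -1;
    }

  /* pthread_create is done at this point; only now release the child.  */
  create_returned = 1;
  pthread_barrier_wait (&demo_barrier);

  pthread_join (thread, NULL);
  pthread_barrier_destroy (&demo_barrier);
  return seen_by_child;
}
```

Because the child cannot pass the barrier until the creating thread reaches it,
anything the child does afterwards (such as hitting a breakpoint) happens with
the creating thread already outside pthread_create.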

Using the fixed test-case, I tested my theory by explicitly blocking SIGTRAP:
...
+  sigset_t old_ss, new_ss;
+  sigemptyset (&new_ss);
+  sigaddset (&new_ss, SIGTRAP);
+  sigprocmask (SIG_BLOCK, &new_ss, &old_ss);
+
   /* Make sure that pthread_create is done once the breakpoint on
      thread_function triggers.  */
   pthread_barrier_wait (&barrier);

   pthread_join (child_thread, NULL);
+  sigprocmask (SIG_SETMASK, &old_ss, NULL);
...
and managed to reproduce the same failure:
...
(gdb) signal SIGTRAP^M
Continuing with signal SIGTRAP.^M
[Thread 0x7ffff7c00700 (LWP 13254) exited]^M
^M
Thread 1 "signal-sigtrap" received signal SIGTRAP, Trace/breakpoint trap.^M
0x00007ffff7c80056 in __GI___sigprocmask () sigprocmask.c:39^M
(gdb) FAIL: $exp: sigtrap thread 1: signal SIGTRAP reaches handler
...

Tested on x86_64-linux.

PR testsuite/26867
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26867
---
 gdb/testsuite/gdb.threads/signal-sigtrap.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)


base-commit: b55b65bc56604e45fae38dea9a22eeb3ffa2b33e
  

Comments

Tom de Vries Aug. 29, 2024, 7:58 a.m. UTC | #1
On 8/29/24 09:52, Tom de Vries wrote:
> The test-case gdb.threads/signal-sigtrap.exp:

I see the same issue in gdb.threads/signal-command-handle-nopass.exp, 
I'll send a v2 that fixes that one as well.

Thanks,
- Tom

  
Tom de Vries Aug. 29, 2024, 8:34 a.m. UTC | #2
On 8/29/24 09:58, Tom de Vries wrote:
> On 8/29/24 09:52, Tom de Vries wrote:
>> The test-case gdb.threads/signal-sigtrap.exp:
> 
> I see the same issue in gdb.threads/signal-command-handle-nopass.exp, 
> I'll send a v2 that fixes that one as well.
> 

https://sourceware.org/pipermail/gdb-patches/2024-August/211395.html

Thanks,
- Tom
  

Patch

diff --git a/gdb/testsuite/gdb.threads/signal-sigtrap.c b/gdb/testsuite/gdb.threads/signal-sigtrap.c
index 24625ba9bac..7c903a13cf1 100644
--- a/gdb/testsuite/gdb.threads/signal-sigtrap.c
+++ b/gdb/testsuite/gdb.threads/signal-sigtrap.c
@@ -20,6 +20,8 @@ 
 #include <pthread.h>
 #include <signal.h>
 
+static pthread_barrier_t barrier;
+
 void
 sigtrap_handler (int sig)
 {
@@ -31,6 +33,13 @@  thread_function (void *arg)
   return NULL;
 }
 
+void *
+thread_function_sync (void *arg)
+{
+  pthread_barrier_wait (&barrier);
+  return thread_function (arg);
+}
+
 int
 main (void)
 {
@@ -38,7 +47,13 @@  main (void)
 
   signal (SIGTRAP, sigtrap_handler);
 
-  pthread_create (&child_thread, NULL, thread_function, NULL);
+  pthread_barrier_init (&barrier, NULL, 2);
+
+  pthread_create (&child_thread, NULL, thread_function_sync, NULL);
+
+  /* Make sure that pthread_create is done once the breakpoint on
+     thread_function triggers.  */
+  pthread_barrier_wait (&barrier);
 
   pthread_join (child_thread, NULL);