[v2,gdb/testsuite] Fix failure in gdb.threads/signal-sigtrap.exp

Message ID 20240829083335.19356-1-tdevries@suse.de
State Committed

Checks

Context                                          Check    Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64   success  Build passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm       success  Build passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64   success  Test passed

Commit Message

Tom de Vries Aug. 29, 2024, 8:33 a.m. UTC
The test-case gdb.threads/signal-sigtrap.exp:
- installs a signal handler called sigtrap_handler for SIGTRAP,
- sets a breakpoint on sigtrap_handler, and
- expects the breakpoint to trigger after issuing "signal SIGTRAP".

Usually, that is indeed what happens:
...
(gdb) signal SIGTRAP^M
Continuing with signal SIGTRAP.^M
^M
Thread 1 "signal-sigtrap" hit Breakpoint 2, sigtrap_handler (sig=5)^M
28      }^M
(gdb) PASS: $exp: sigtrap thread 1: signal SIGTRAP reaches handler
...

Occasionally, I run into this failure on openSUSE Tumbleweed:
...
(gdb) signal SIGTRAP^M
Continuing with signal SIGTRAP.^M
^M
Thread 1 "signal-sigtrap" received signal SIGTRAP, Trace/breakpoint trap.^M
__pthread_create_2_1 () at pthread_create.c:843^M
(gdb) FAIL: $exp: sigtrap thread 1: signal SIGTRAP reaches handler
...

AFAIU, the problem lies in the situation that is set up before issuing that
command, by running to a breakpoint in thread_function:
...
void *thread_function (void *arg) {
  return NULL;
}
int main (void) {
  pthread_t child_thread;
  signal (SIGTRAP, sigtrap_handler);
  pthread_create (&child_thread, NULL, thread_function, NULL);
  pthread_join (child_thread, NULL);
  return 0;
}
...

In the passing case, thread 2 is stopped in thread_function, and thread 1 is
stopped somewhere in pthread_join:
...
(gdb) info threads^M
  Id   Target Id                                          Frame ^M
  1    Thread ... (LWP ...) "signal-sigtrap" __futex_abstimed_wait_common64 ()
* 2    Thread ... (LWP ...) "signal-sigtrap" thread_function ()
...

In the failing case, thread 2 is stopped in thread_function, but thread 1 is
stopped somewhere in pthread_create:
...
(gdb) info threads^M
  Id   Target Id                                          Frame ^M
  1    Thread ... (LWP ...) "signal-sigtrap" __GI___clone3 ()
* 2    Thread ... (LWP ...) "signal-sigtrap" thread_function ()
...

What I think happens is that pthread_create blocks SIGTRAP at some point.  If
the "signal SIGTRAP" command is issued while the signal is blocked, it becomes
pending, and consequently there's no longer a guarantee that it will be
delivered to the inferior.
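
The kernel-side half of this theory (a signal sent while blocked becomes
pending and only reaches the handler once it is unblocked) can be checked
outside gdb.  The following is a minimal sketch, not part of the patch: the
names blocked_signal_stays_pending and count_handler are hypothetical, and it
uses SIGUSR1 instead of SIGTRAP so that it behaves the same whether or not a
debugger is attached.  It says nothing about what gdb does with the pending
signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Number of times the handler has run.  */
static volatile sig_atomic_t handler_calls;

static void
count_handler (int sig)
{
  (void) sig;
  handler_calls++;
}

/* Hypothetical helper: return 1 if a signal raised while blocked stays
   pending and is delivered only after it is unblocked, 0 otherwise.
   Assumes a single-threaded caller, since it uses sigprocmask.  */
int
blocked_signal_stays_pending (void)
{
  sigset_t block_ss, old_ss, pending_ss;
  void (*old_handler) (int);
  int ran_while_blocked, was_pending, ran_after_unblock;

  handler_calls = 0;
  old_handler = signal (SIGUSR1, count_handler);

  /* Block SIGUSR1, as the theory assumes pthread_create does for SIGTRAP.  */
  sigemptyset (&block_ss);
  sigaddset (&block_ss, SIGUSR1);
  sigprocmask (SIG_BLOCK, &block_ss, &old_ss);

  /* Send the signal while it is blocked: the handler must not run yet.  */
  raise (SIGUSR1);
  ran_while_blocked = handler_calls;

  /* The signal should now be reported as pending.  */
  sigemptyset (&pending_ss);
  sigpending (&pending_ss);
  was_pending = sigismember (&pending_ss, SIGUSR1);

  /* Unblock: the pending signal is delivered before sigprocmask returns.  */
  sigprocmask (SIG_UNBLOCK, &block_ss, NULL);
  ran_after_unblock = handler_calls;

  /* Restore the original mask and disposition.  */
  sigprocmask (SIG_SETMASK, &old_ss, NULL);
  signal (SIGUSR1, old_handler);

  return ran_while_blocked == 0 && was_pending == 1 && ran_after_unblock == 1;
}
```

Calling blocked_signal_stays_pending () from a small main and checking for a
return value of 1 is enough to see the blocked, pending, delivered sequence.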

Instead, the signal is handled by gdb according to its settings for SIGTRAP:
...
(gdb) info signals SIGTRAP
Signal        Stop	Print	Pass to program	Description
SIGTRAP       Yes	Yes	No		Trace/breakpoint trap
...

Fix this by adding a barrier that ensures that pthread_create is done before
we issue the "signal SIGTRAP" command.

Likewise in test-case gdb.threads/signal-command-handle-nopass.exp.
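
To make the ordering guarantee concrete, here is a self-contained sketch
assembled from the pieces of the patch (barrier, thread_function_sync).  The
helper run_with_barrier and the two flag variables are hypothetical additions
for illustration only; they record whether the main thread had already
returned from pthread_create when thread_function was entered.  It assumes a
platform that provides POSIX barriers, such as Linux.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_barrier_t barrier;

/* Illustration-only flags, not part of the patch.  */
static int create_returned;	/* Set by main after pthread_create.  */
static int seen_by_thread;	/* Value of create_returned seen in thread.  */

void *
thread_function (void *arg)
{
  (void) arg;
  /* A breakpoint here would trigger only after the barrier was passed.  */
  seen_by_thread = create_returned;
  return NULL;
}

void *
thread_function_sync (void *arg)
{
  /* Wait until the main thread has returned from pthread_create.  */
  pthread_barrier_wait (&barrier);
  return thread_function (arg);
}

/* Hypothetical helper: return 1 if thread_function observed that
   pthread_create had already returned, 0 if not, -1 on setup failure.  */
int
run_with_barrier (void)
{
  pthread_t child_thread;

  create_returned = 0;
  seen_by_thread = -1;

  if (pthread_barrier_init (&barrier, NULL, 2) != 0)
    return -1;

  if (pthread_create (&child_thread, NULL, thread_function_sync, NULL) != 0)
    {
      pthread_barrier_destroy (&barrier);
      return -1;
    }
  create_returned = 1;

  /* The child cannot enter thread_function before this call is reached.  */
  pthread_barrier_wait (&barrier);

  pthread_join (child_thread, NULL);
  pthread_barrier_destroy (&barrier);
  return seen_by_thread;
}
```

Because the child only passes the barrier after the main thread has reached
it, run_with_barrier () returns 1 on every run, whereas without the barrier
the child could enter thread_function while the main thread is still inside
pthread_create.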

Using the fixed test-case, I tested my theory by explicitly blocking SIGTRAP:
...
+  sigset_t old_ss, new_ss;
+  sigemptyset (&new_ss);
+  sigaddset (&new_ss, SIGTRAP);
+  sigprocmask (SIG_BLOCK, &new_ss, &old_ss);
+
   /* Make sure that pthread_create is done once the breakpoint on
      thread_function triggers.  */
   pthread_barrier_wait (&barrier);

   pthread_join (child_thread, NULL);
+  sigprocmask (SIG_SETMASK, &old_ss, NULL);
...
and managed to reproduce the same failure:
...
(gdb) signal SIGTRAP^M
Continuing with signal SIGTRAP.^M
[Thread 0x7ffff7c00700 (LWP 13254) exited]^M
^M
Thread 1 "signal-sigtrap" received signal SIGTRAP, Trace/breakpoint trap.^M
0x00007ffff7c80056 in __GI___sigprocmask () sigprocmask.c:39^M
(gdb) FAIL: $exp: sigtrap thread 1: signal SIGTRAP reaches handler
...

Tested on x86_64-linux.

PR testsuite/26867
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26867
---
 .../signal-command-handle-nopass.c            | 19 ++++++++++++++++++-
 gdb/testsuite/gdb.threads/signal-sigtrap.c    | 17 ++++++++++++++++-
 2 files changed, 34 insertions(+), 2 deletions(-)


base-commit: b55b65bc56604e45fae38dea9a22eeb3ffa2b33e
  

Comments

Tom de Vries Sept. 23, 2024, 7:54 a.m. UTC | #1
Pushed.

Thanks,
- Tom


Patch

diff --git a/gdb/testsuite/gdb.threads/signal-command-handle-nopass.c b/gdb/testsuite/gdb.threads/signal-command-handle-nopass.c
index 6d82bd6f256..548d0d3701d 100644
--- a/gdb/testsuite/gdb.threads/signal-command-handle-nopass.c
+++ b/gdb/testsuite/gdb.threads/signal-command-handle-nopass.c
@@ -21,6 +21,8 @@ 
 #include <pthread.h>
 #include <signal.h>
 
+static pthread_barrier_t barrier;
+
 void
 handler (int sig)
 {
@@ -35,6 +37,13 @@  thread_function (void *arg)
     usleep (1);
 }
 
+void *
+thread_function_sync (void *arg)
+{
+  pthread_barrier_wait (&barrier);
+  return thread_function (arg);
+}
+
 int
 main (void)
 {
@@ -42,7 +51,15 @@  main (void)
   int i;
 
   signal (SIGUSR1, handler);
-  pthread_create (&child_thread, NULL, thread_function, NULL);
+
+  pthread_barrier_init (&barrier, NULL, 2);
+
+  pthread_create (&child_thread, NULL, thread_function_sync, NULL);
+
+  /* Make sure that pthread_create is done once the breakpoint on
+     thread_function triggers.  */
+  pthread_barrier_wait (&barrier);
+
   pthread_join (child_thread, NULL);
 
   return 0;
diff --git a/gdb/testsuite/gdb.threads/signal-sigtrap.c b/gdb/testsuite/gdb.threads/signal-sigtrap.c
index 24625ba9bac..7c903a13cf1 100644
--- a/gdb/testsuite/gdb.threads/signal-sigtrap.c
+++ b/gdb/testsuite/gdb.threads/signal-sigtrap.c
@@ -20,6 +20,8 @@ 
 #include <pthread.h>
 #include <signal.h>
 
+static pthread_barrier_t barrier;
+
 void
 sigtrap_handler (int sig)
 {
@@ -31,6 +33,13 @@  thread_function (void *arg)
   return NULL;
 }
 
+void *
+thread_function_sync (void *arg)
+{
+  pthread_barrier_wait (&barrier);
+  return thread_function (arg);
+}
+
 int
 main (void)
 {
@@ -38,7 +47,13 @@  main (void)
 
   signal (SIGTRAP, sigtrap_handler);
 
-  pthread_create (&child_thread, NULL, thread_function, NULL);
+  pthread_barrier_init (&barrier, NULL, 2);
+
+  pthread_create (&child_thread, NULL, thread_function_sync, NULL);
+
+  /* Make sure that pthread_create is done once the breakpoint on
+     thread_function triggers.  */
+  pthread_barrier_wait (&barrier);
 
   pthread_join (child_thread, NULL);