[v2,1/4] support: Add support_wait_for_thread_exit

Message ID 59c56a95c559ed9b855965050fe4d1d9f3ad74ba.1629475813.git.fweimer@redhat.com
State Superseded
Headers
Series Fix ESRCH issues in pthread_cancel, pthread_kill |

Checks

Context Check Description
dj/TryBot-apply_patch success Patch applied to master at the time it was sent

Commit Message

Florian Weimer Aug. 20, 2021, 4:12 p.m. UTC
  ---
v2: Makefile update, make it clearer where the wait operation with
    pidfd_open would be put (it would replace the short usleep call).

 support/Makefile                       |  3 +-
 support/support.h                      |  4 ++
 support/support_wait_for_thread_exit.c | 72 ++++++++++++++++++++++++++
 3 files changed, 78 insertions(+), 1 deletion(-)
 create mode 100644 support/support_wait_for_thread_exit.c
  

Comments

Florian Weimer Sept. 29, 2021, 8:27 p.m. UTC | #1
* Florian Weimer via Libc-alpha:

> +      int task_tid = atoi (e->d_name);
> +      if (task_tid <= 0)
> +        FAIL_EXIT1 ("Invalid /proc/self/task entry: %s", e->d_name);

There's a kernel bug that causes /proc/self/task/0 to be listed
due to concurrent task exit in the kernel.

=====FAIL: nptl/tst-pthread_cancel-exited.out=====
error: support_wait_for_thread_exit.c:51: Invalid /proc/self/task entry: 0
error: 1 test failures
=====FAIL: nptl/tst-pthread_cancel-exited.test-result=====

Should work around this in our test code?

Thanks,
Florian
  
Adhemerval Zanella Sept. 30, 2021, 11:48 a.m. UTC | #2
On 29/09/2021 17:27, Florian Weimer via Libc-alpha wrote:
> * Florian Weimer via Libc-alpha:
> 
>> +      int task_tid = atoi (e->d_name);
>> +      if (task_tid <= 0)
>> +        FAIL_EXIT1 ("Invalid /proc/self/task entry: %s", e->d_name);
> 
> There's a kernel bug that causes /proc/self/task/0 to be listed
> due to concurrent task exit in the kernel.
> 
> =====FAIL: nptl/tst-pthread_cancel-exited.out=====
> error: support_wait_for_thread_exit.c:51: Invalid /proc/self/task entry: 0
> error: 1 test failures
> =====FAIL: nptl/tst-pthread_cancel-exited.test-result=====
> 
> Should work around this in our test code?

I have see this failures in some tests run and I was about to bring
this to you.  Can we safely mitigate it by ignoring the tid '0'?
  
Florian Weimer Sept. 30, 2021, 12:07 p.m. UTC | #3
* Adhemerval Zanella:

> On 29/09/2021 17:27, Florian Weimer via Libc-alpha wrote:
>> * Florian Weimer via Libc-alpha:
>> 
>>> +      int task_tid = atoi (e->d_name);
>>> +      if (task_tid <= 0)
>>> +        FAIL_EXIT1 ("Invalid /proc/self/task entry: %s", e->d_name);
>> 
>> There's a kernel bug that causes /proc/self/task/0 to be listed
>> due to concurrent task exit in the kernel.
>> 
>> =====FAIL: nptl/tst-pthread_cancel-exited.out=====
>> error: support_wait_for_thread_exit.c:51: Invalid /proc/self/task entry: 0
>> error: 1 test failures
>> =====FAIL: nptl/tst-pthread_cancel-exited.test-result=====
>> 
>> Should work around this in our test code?
>
> I have see this failures in some tests run and I was about to bring
> this to you.  Can we safely mitigate it by ignoring the tid '0'?

Right, I have sent a patch:

  [PATCH] support: Add check for TID zero in support_wait_for_thread_exit
  <https://sourceware.org/pipermail/libc-alpha/2021-September/131560.html>

Thanks,
Florian
  

Patch

diff --git a/support/Makefile b/support/Makefile
index a462781718..ef2b1a980a 100644
--- a/support/Makefile
+++ b/support/Makefile
@@ -82,9 +82,10 @@  libsupport-routines = \
   support_test_compare_blob \
   support_test_compare_failure \
   support_test_compare_string \
-  support_write_file_string \
   support_test_main \
   support_test_verify_impl \
+  support_wait_for_thread_exit \
+  support_write_file_string \
   temp_file \
   timespec \
   timespec-time64 \
diff --git a/support/support.h b/support/support.h
index 834dba9097..a5978b939a 100644
--- a/support/support.h
+++ b/support/support.h
@@ -174,6 +174,10 @@  timer_t support_create_timer (uint64_t sec, long int nsec, bool repeat,
 /* Disable the timer TIMER.  */
 void support_delete_timer (timer_t timer);
 
+/* Wait until all threads except the current thread have exited (as
+   far as the kernel is concerned).  */
+void support_wait_for_thread_exit (void);
+
 struct support_stack
 {
   void *stack;
diff --git a/support/support_wait_for_thread_exit.c b/support/support_wait_for_thread_exit.c
new file mode 100644
index 0000000000..658a813810
--- /dev/null
+++ b/support/support_wait_for_thread_exit.c
@@ -0,0 +1,72 @@ 
+/* Wait until all threads except the current thread has exited.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <dirent.h>
+#include <errno.h>
+#include <string.h>
+#include <support/check.h>
+#include <support/support.h>
+#include <unistd.h>
+
+void
+support_wait_for_thread_exit (void)
+{
+#ifdef __linux__
+  DIR *proc_self_task = opendir ("/proc/self/task");
+  TEST_VERIFY_EXIT (proc_self_task != NULL);
+
+  while (true)
+    {
+      errno = 0;
+      struct dirent *e = readdir (proc_self_task);
+      if (e == NULL && errno != 0)
+        FAIL_EXIT1 ("readdir: %m");
+      if (e == NULL)
+        {
+          /* Only the main thread remains.  Testing may continue.  */
+          closedir (proc_self_task);
+          return;
+        }
+
+      if (strcmp (e->d_name, ".") == 0 || strcmp (e->d_name, "..") == 0)
+        continue;
+
+      int task_tid = atoi (e->d_name);
+      if (task_tid <= 0)
+        FAIL_EXIT1 ("Invalid /proc/self/task entry: %s", e->d_name);
+
+      if (task_tid == gettid ())
+        /* The current thread.  Keep scanning for other
+           threads.  */
+        continue;
+
+      /* task_tid does not refer to this thread here, i.e., there is
+         another running thread.  */
+
+      /* Small timeout to give the thread a chance to exit.  */
+      usleep (50 * 1000);
+
+      /* Start scanning the directory from the start.  */
+      rewinddir (proc_self_task);
+    }
+#else
+  /* Use a large timeout because we cannot verify that the thread has
+     exited.  */
+  usleep (5 * 1000 * 1000);
+#endif
+}