test-dlclose-exit-race: avoid hang on pthread_create error

Message ID xnim0mp8k4.fsf@greed.delorie.com
State Committed
Commit ac30324c67d94696fdb0799e9d4fc51dc70d490b
Delegated to: Arjun Shankar
Headers
Series test-dlclose-exit-race: avoid hang on pthread_create error |

Checks

Context Check Description
dj/TryBot-apply_patch success Patch applied to master at the time it was sent
dj/TryBot-32bit success Build for i686

Commit Message

DJ Delorie Aug. 3, 2021, 8:39 p.m. UTC
  This test depends on the "last" function being called in a
different thread than the "first" function, as "last" posts
a semaphore that "first" is waiting on.  However, if pthread_create
fails - for example, if running in an older container before
the clone3()-in-container-EPERM fixes - exit() is called in the
same thread as everything else, the semaphore never gets posted,
and first hangs.

The fix is to pre-post that semaphore before a single-threaded
exit.
---
 stdlib/test-dlclose-exit-race.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)
  

Comments

Arjun Shankar Aug. 4, 2021, 2:51 p.m. UTC | #1
Hi DJ,
 
> This test depends on the "last" function being called in a
> different thread than the "first" function, as "last" posts
> a semaphore that "first" is waiting on.  However, if pthread_create
> fails - for example, if running in an older container before
> the clone3()-in-container-EPERM fixes - exit() is called in the
> same thread as everything else, the semaphore never gets posted,
> and first hangs.

> The fix is to pre-post that semaphore before a single-threaded
> exit.

> ---
>  stdlib/test-dlclose-exit-race.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/stdlib/test-dlclose-exit-race.c b/stdlib/test-dlclose-exit-race.c
> index b4632532813..2c44efdde6d 100644
> --- a/stdlib/test-dlclose-exit-race.c
> +++ b/stdlib/test-dlclose-exit-race.c
> @@ -29,6 +29,7 @@
>  
>  #include <stdio.h>
>  #include <stdlib.h>
> +#include <errno.h>
>  #include <semaphore.h>
>  #include <support/check.h>
>  #include <support/xdlfcn.h>
> @@ -64,6 +65,7 @@ last (void)
>  int
>  main (void)
>  {
> +  int value;
>    void *dso;
>    pthread_t thread;
>  
> @@ -71,7 +73,17 @@ main (void)
>  
>    dso = xdlopen ("$ORIGIN/test-dlclose-exit-race-helper.so",
>  		 RTLD_NOW|RTLD_GLOBAL);
> -  thread = xpthread_create (NULL, exit_thread, NULL);
> +  if ((value = pthread_create (&thread, NULL, exit_thread, NULL)) != 0)
> +    {
> +      /* If pthread_create fails, then exit() is called in the main
> +	 thread instead of a second thread, so the semaphore post that
> +	 would have happened in 'last' gets blocked behind the call to
> +	 first() - which is waiting on the semaphore, and thus
> +	 hangs.  */
> +      sem_post (&order2);
> +      errno = value;
> +      FAIL_EXIT1 ("pthread_create: %m");
> +    }

OK. This makes sense. I also verified that it removes the hang.

There is a second bug here: the test isn't using the support
infrastructure. That would have caused the test to time out instead
of hanging. But that doesn't necessarily have to be fixed within
the scope of this patch. With or without it, this patch looks
useful.

Reviewed-by: Arjun Shankar <arjun@redhat.com>

Cheers!
  
DJ Delorie Aug. 4, 2021, 7:38 p.m. UTC | #2
Arjun Shankar <arjun.is@lostca.se> writes:
> OK. This makes sense. I also verified that it removes the hang.
>
> There is a second bug here: the test isn't using the support
> infrastructure. That would have caused the test to time out instead
> of hanging. But that doesn't necessarily have to be fixed within
> the scope of this patch. With or without it, this patch looks
> useful.

Yup, Carlos and I talked about that, and it was my opinion that (1) the
test would still be broken, (2) we'd have to wait for the timeout
instead of exiting immediately, and (3) I had a bit of concern that the
framework would invalidate the test (we think it won't though).

But yeah, eventually all the tests should be converted.  Not today ;-)

> Reviewed-by: Arjun Shankar <arjun@redhat.com>

Thanks!  Pushed.
  

Patch

diff --git a/stdlib/test-dlclose-exit-race.c b/stdlib/test-dlclose-exit-race.c
index b4632532813..2c44efdde6d 100644
--- a/stdlib/test-dlclose-exit-race.c
+++ b/stdlib/test-dlclose-exit-race.c
@@ -29,6 +29,7 @@ 
 
 #include <stdio.h>
 #include <stdlib.h>
+#include <errno.h>
 #include <semaphore.h>
 #include <support/check.h>
 #include <support/xdlfcn.h>
@@ -64,6 +65,7 @@  last (void)
 int
 main (void)
 {
+  int value;
   void *dso;
   pthread_t thread;
 
@@ -71,7 +73,17 @@  main (void)
 
   dso = xdlopen ("$ORIGIN/test-dlclose-exit-race-helper.so",
 		 RTLD_NOW|RTLD_GLOBAL);
-  thread = xpthread_create (NULL, exit_thread, NULL);
+  if ((value = pthread_create (&thread, NULL, exit_thread, NULL)) != 0)
+    {
+      /* If pthread_create fails, then exit() is called in the main
+	 thread instead of a second thread, so the semaphore post that
+	 would have happened in 'last' gets blocked behind the call to
+	 first() - which is waiting on the semaphore, and thus
+	 hangs.  */
+      sem_post (&order2);
+      errno = value;
+      FAIL_EXIT1 ("pthread_create: %m");
+    }
 
   xdlclose (dso);
   xpthread_join (thread);