From patchwork Fri Sep 20 12:59:07 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Stefan Liebler <stli@linux.ibm.com>
X-Patchwork-Id: 34613
Received: (qmail 28504 invoked by alias); 20 Sep 2019 12:59:17 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <libc-alpha.sourceware.org>
List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org>
List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-help@sourceware.org>,
	<http://sourceware.org/ml/#faqs>
Sender: libc-alpha-owner@sourceware.org
Delivered-To: mailing list libc-alpha@sourceware.org
Received: (qmail 28494 invoked by uid 89); 20 Sep 2019 12:59:16 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-22.1 required=5.0 tests=AWL, BAYES_00,
	GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3,
	RCVD_IN_DNSWL_LOW, SPF_PASS,
	UNSUBSCRIBE_BODY autolearn=ham version=3.3.1 spammy=saves, r23,
	2110, Explain
X-HELO: mx0a-001b2d01.pphosted.com
Subject: Re: [PATCH] Speedup various nptl/tst-cancel20 and nptl/tst-cancel21
	tests.
To: "Carlos O'Donell" <carlos@redhat.com>,
	libc-alpha <libc-alpha@sourceware.org>
References: <c2818c4a-3afe-e691-6589-1c24651b43a5@linux.ibm.com>
	<62d353ae-c6a8-16ce-01d5-56933cd18802@redhat.com>
From: Stefan Liebler <stli@linux.ibm.com>
Date: Fri, 20 Sep 2019 14:59:07 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
	Thunderbird/60.8.0
MIME-Version: 1.0
In-Reply-To: <62d353ae-c6a8-16ce-01d5-56933cd18802@redhat.com>
x-cbid: 19092012-0020-0000-0000-0000036F8EA6
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 19092012-0021-0000-0000-000021C541EC
Message-Id: <205f17ba-a82b-7806-54fa-49b61268e64d@linux.ibm.com>

On 9/12/19 4:07 PM, Carlos O'Donell wrote:
> On 9/12/19 8:15 AM, Stefan Liebler wrote:
>> Hi,
>>
>> each of thease tests - tst-cancel20, tst-cancelx20, tst-cancel21,
>> tst-cancelx21 and tst-cancel21-static - runs for 8s.
>>
>> do_one_test is called 4 times. Either in do_one_test (tst-cancel20.c)
>> or in tf (tst-cancel21.c) "sleep (1)" is called 2 times.
>>
>> This patch reduces the sleep time. Using usleep (5ms) leads to a
>> runtime of one tst-cancel... invocation of roughly 40ms instead of 8s.
>> As the nptl tests run in sequence, this patch saves roughly 39s of
>> runtime for "make check".
>>
>> Bye,
>> Stefan
>>
>> ChangeLog:
>>
>>      * nptl/tst-cancel20.c (do_one_test): Reduce sleep times.
>>      * nptl/tst-cancel21.c (tf): Likewise.
> 
> I see two possible solutions:
> 
> (a) Explain in detail why the sleep is needed, and why reducing
>      the sleep is safe.
> 
> (b) Remove the sleep and use proper barriers to provide the
>      synchronization the test is using sleep to accomplish.
> 
> Your current patch does (a) without explaining why it is safe
> to reduce the sleep period, and that the test continues to test
> what it was written to test.
> 
> Is it safe to reduce the sleep timing?
> 
> Why is it there?
> 

Hi,

tst-cancel20.c cancels a thread and checks if all cleanup-handlers were 
called in the correct order:
if (cleanups != 0x1234L)
{
    printf ("called cleanups %lx\n", cleanups);
    return 1;
}

These numbers belong to the cleanup-handlers called in the following 
functions:
1 => sh_body (sets in_sh_body=1 and blocks in read)
2 => sh (SIGHUP signal-handler; is received after pthread_barrier_wait)
3 => tf_body (waits with pthread_barrier_wait and then blocks in read)
4 => tf (started via pthread_create in do_one_test)

do_one_test starts tf, sends SIGHUP after waiting for the barrier and 
cancels the thread as soon as in_sh_body has been set to 1.

tst-cancel21.c differs in the fact, that there a thread sends SIGHUP and 
cancels the main thread.

I've tried to just remove the sleeps. This works fine with tst-cancel20.
But with tst-cancelx20 (same code as tst-cancel20.c but compiled with 
-fexceptions -fasynchronous-unwind-tables) it sometimes fails with: 
called cleanups 124 => cleanup-handler in tf_body was not called.

In the "good" case, SIGHUP is received while tf_body is blocking in read.
In the "error" case, SIGHUP is received while tf_body is waiting in 
pthread_barrier_wait.
(This occures on s390x in the same way as on e.g. x86_64; I've used gcc 
version 9.1.1)

I've compiled the test with gcc -S -fverbose-asm. Here is the shortened 
output:
tf_body:
.LFB28:
...
# tst-cancel20.c:75:   int r = pthread_barrier_wait (&b);
	brasl	%r14,pthread_barrier_wait	#,
...
# tst-cancel20.c:82:   if (read (fd[0], &c, 1) == 1)
.LEHB4:
	brasl	%r14,read	#,
...
.LEHE4:
...
.L25:
# ../sysdeps/nptl/pthread.h:612:     __frame->__cancel_routine 
(__frame->__cancel_arg);
	lghi	%r2,3	#,
	brasl	%r14,cl	#,
...
brasl	%r14,_Unwind_Resume	#,
	.cfi_endproc


.section	.gcc_except_table
.LLSDA28:
	.byte	0xff
	.byte	0xff
	.byte	0x1
	.uleb128 .LLSDACSE28-.LLSDACSB28
.LLSDACSB28:
	.uleb128 .LEHB4-.LFB28
	.uleb128 .LEHE4-.LEHB4
	.uleb128 .L25-.LFB28
	.uleb128 0
	.uleb128 .LEHB5-.LFB28
	.uleb128 .LEHE5-.LEHB5
	.uleb128 0
	.uleb128 0
.LLSDACSE28:


According to the .gcc_except_table, the region with our cleanup-handler 
starts - after pthread_barrier_wait - directly before the read call, 
even though pthread_barrier_wait is within pthread_cleanup_push / pop:
tf_body (void)
{ ...
   pthread_cleanup_push (cl, (void *) 3L);

   int r = pthread_barrier_wait (&b);
   ...
   if (read (fd[0], &c, 1) == 1)
      ...
   read (fd[0], &c, 1);

   pthread_cleanup_pop (0);
}
Is this behaviour intended?

The difference between those calls is "nothrow":
extern ssize_t read (int __fd, void *__buf, size_t __nbytes) ;
vs
extern int pthread_barrier_wait (pthread_barrier_t *__barrier)
      __attribute__ ((__nothrow__))  __attribute__ ((__nonnull__ (1)));

Placing a e.g. a printf or read call just before pthread_barrier_wait 
will lead to the inclusion of pthread_barrier_wait and the 
cleanup-handler is called!


Now, I've removed the sleeps and also removed the pthread_barrier_wait. 
Instead write to and read from a pipe is used to synchronizse tf_body 
and do_one_test.
Please have a look at the attached patch.

Bye.
Stefan

commit 47847907fd2dae3201bf87096886c44d7801768b
Author: Stefan Liebler <stli@linux.ibm.com>
Date:   Wed Sep 11 09:43:08 2019 +0200

    Speedup various nptl/tst-cancel20 and nptl/tst-cancel21 tests.
    
    Each of thease tests - tst-cancel20, tst-cancelx20, tst-cancel21,
    tst-cancelx21 and tst-cancel21-static - runs for 8s.
    
    do_one_test is called 4 times. Either in do_one_test (tst-cancel20.c)
    or in tf (tst-cancel21.c) "sleep (1)" is called 2 times.
    
    This patch removes the sleep which leads to a short runtime of one
    tst-cancel... invocation and as the nptl tests run in sequence,
    this patch saves roughly 39s of runtime for "make check".
    
    Just removing the sleeps leads to fails in tst-cancelx20 / x21
    as one cleanup-handler won't be called as pthread_barrier_wait is
    marked with attribute ((__nothrow__)).  Therefore the pthread_barrier_wait
    calls are replaced by writing to and reading from a pipe in order
    to synchronize the threads.
    
    ChangeLog:
    
            * nptl/tst-cancel20.c (b): Remove.
            (fd): Enhance to 6 elements.
            (tf_body): Use xwrite instead of pthread_barrier_wait.
            (do_one_test): Remove sleep and use read instead of
            pthread_barrier_wait.
            * nptl/tst-cancel21.c (b): Remove.
            (fd): Enhance to 6 elements.
            (tf_body): Use xwrite instead of pthread_barrier_wait.
            (tf): Remove sleep and use read instead of
            pthread_barrier_wait.

diff --git a/nptl/tst-cancel20.c b/nptl/tst-cancel20.c
index 96dfb71fa9..f82f899a83 100644
--- a/nptl/tst-cancel20.c
+++ b/nptl/tst-cancel20.c
@@ -21,11 +21,10 @@
 #include <signal.h>
 #include <stdio.h>
 #include <stdlib.h>
-#include <unistd.h>
+#include <support/xunistd.h>
 
 
-static int fd[4];
-static pthread_barrier_t b;
+static int fd[6];
 volatile int in_sh_body;
 unsigned long cleanups;
 
@@ -68,17 +67,21 @@ sh (int sig)
 static void __attribute__((noinline))
 tf_body (void)
 {
-  char c;
+  char c = 0;
 
   pthread_cleanup_push (cl, (void *) 3L);
 
-  int r = pthread_barrier_wait (&b);
-  if (r != 0 && r != PTHREAD_BARRIER_SERIAL_THREAD)
-    {
-      puts ("child thread: barrier_wait failed");
-      exit (1);
-    }
-
+  /* Write to the pipe in order to inform do_one_test that we are now within
+     pthread_cleanup_push / pop and do_one_test can send us SIGHUP and cancel
+     this thread.
+     We do not use pthread_barrier_wait here as it is marked with
+     attribute ((__nothrow__)) and the region in .gcc_except_table section
+     which is responsible for calling the cleanup-handler would start just
+     before the next call to read in case of nptl/tst-cancelx20.c where we
+     build with "gcc -fexceptions -fasynchronous-unwind-tables".  */
+  xwrite (fd[5], &c, 1);
+
+  /* Blocks until either SIGHUP is received or the thread is canceled.  */
   if (read (fd[0], &c, 1) == 1)
     {
       puts ("read succeeded");
@@ -106,7 +109,7 @@ do_one_test (void)
 {
   in_sh_body = 0;
   cleanups = 0;
-  if (pipe (fd) != 0 || pipe (fd + 2) != 0)
+  if (pipe (fd) != 0 || pipe (fd + 2) != 0 || pipe (fd + 4) != 0)
     {
       puts ("pipe failed");
       return 1;
@@ -119,16 +122,17 @@ do_one_test (void)
       return 1;
     }
 
-  int r = pthread_barrier_wait (&b);
-  if (r != 0 && r != PTHREAD_BARRIER_SERIAL_THREAD)
+  /* Block until the thread is within pthread_cleanup_push / pop
+     in tf_body which is called by tf.  */
+  char c;
+  if (read (fd[4], &c, 1) != 1)
     {
-      puts ("parent thread: barrier_wait failed");
+      printf ("parent thread: Reading from pipe failed %m\n");
       return 1;
     }
 
-  sleep (1);
-
-  r = pthread_kill (th, SIGHUP);
+  /* Send signal SIGHUP where in_sh_body will be set to 1 in sh_body.  */
+  int r = pthread_kill (th, SIGHUP);
   if (r)
     {
       errno = r;
@@ -137,8 +141,9 @@ do_one_test (void)
     }
 
   while (in_sh_body == 0)
-    sleep (1);
+    ;
 
+  /* Cancel the thread while the signal handler blocks in read in sh_body.  */
   if (pthread_cancel (th) != 0)
     {
       puts ("cancel failed");
@@ -172,6 +177,8 @@ do_one_test (void)
   close (fd[1]);
   close (fd[2]);
   close (fd[3]);
+  close (fd[4]);
+  close (fd[5]);
 
   return 0;
 }
@@ -195,12 +202,6 @@ do_test (void)
       return 1;
     }
 
-  if (pthread_barrier_init (&b, NULL, 2) != 0)
-    {
-      puts ("barrier_init failed");
-      return 1;
-    }
-
   struct sigaction sa;
   sa.sa_handler = sh;
   sigemptyset (&sa.sa_mask);
diff --git a/nptl/tst-cancel21.c b/nptl/tst-cancel21.c
index 361510b027..30e396dee2 100644
--- a/nptl/tst-cancel21.c
+++ b/nptl/tst-cancel21.c
@@ -22,11 +22,10 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <sys/wait.h>
-#include <unistd.h>
+#include <support/xunistd.h>
 
 
-static int fd[4];
-static pthread_barrier_t b;
+static int fd[6];
 volatile int in_sh_body;
 unsigned long cleanups;
 
@@ -69,17 +68,21 @@ sh (int sig)
 static void __attribute__((noinline))
 tf_body (void)
 {
-  char c;
+  char c = 0;
 
   pthread_cleanup_push (cl, (void *) 3L);
 
-  int r = pthread_barrier_wait (&b);
-  if (r != 0 && r != PTHREAD_BARRIER_SERIAL_THREAD)
-    {
-      puts ("child thread: barrier_wait failed");
-      exit (1);
-    }
-
+  /* Write to the pipe in order to inform tf that we are now within
+     pthread_cleanup_push / pop and tf can send us SIGHUP and cancel
+     this thread.
+     We do not use pthread_barrier_wait here as it is marked with
+     attribute ((__nothrow__)) and the region in .gcc_except_table section
+     which is responsible for calling the cleanup-handler would start just
+     before the next call to read in case of nptl/tst-cancelx21.c where we
+     build with "gcc -fexceptions -fasynchronous-unwind-tables".  */
+  xwrite (fd[5], &c, 1);
+
+  /* Blocks until either SIGHUP is received or the thread is canceled.  */
   if (read (fd[0], &c, 1) == 1)
     {
       puts ("read succeeded");
@@ -97,16 +100,17 @@ tf (void *arg)
 {
   pthread_t th = (pthread_t) arg;
 
-  int r = pthread_barrier_wait (&b);
-  if (r != 0 && r != PTHREAD_BARRIER_SERIAL_THREAD)
+  /* Block until the main thread is within pthread_cleanup_push / pop
+     in tf_body which is called by do_one_test.  */
+  char c;
+  if (read (fd[4], &c, 1) != 1)
     {
-      puts ("parent thread: barrier_wait failed");
+      printf ("Reading from pipe failed %m\n");
       exit (1);
     }
 
-  sleep (1);
-
-  r = pthread_kill (th, SIGHUP);
+  /* Send signal SIGHUP where in_sh_body will be set to 1 in sh_body.  */
+  int r = pthread_kill (th, SIGHUP);
   if (r)
     {
       errno = r;
@@ -115,8 +119,10 @@ tf (void *arg)
     }
 
   while (in_sh_body == 0)
-    sleep (1);
+    ;
 
+  /* Cancel the parent thread while the signal handler blocks in read
+     in sh_body.  */
   if (pthread_cancel (th) != 0)
     {
       puts ("cancel failed");
@@ -142,11 +148,6 @@ tf (void *arg)
       exit (1);
     }
 
-  if (pthread_barrier_destroy (&b))
-    {
-      puts ("barrier destroy failed");
-      exit (1);
-    }
 
   /* The pipe closing must be issued after the cancellation handling to avoid
      a race condition where the cancellation runs after both pipe ends are
@@ -156,6 +157,8 @@ tf (void *arg)
   close (fd[1]);
   close (fd[2]);
   close (fd[3]);
+  close (fd[4]);
+  close (fd[5]);
 
   exit (0);
 }
@@ -186,14 +189,8 @@ do_one_test (void)
       return !WIFEXITED (status) || WEXITSTATUS (status);
     }
 
-  if (pthread_barrier_init (&b, NULL, 2) != 0)
-    {
-      puts ("barrier_init failed");
-      exit (1);
-    }
-
   cleanups = 0;
-  if (pipe (fd) != 0 || pipe (fd + 2) != 0)
+  if (pipe (fd) != 0 || pipe (fd + 2) != 0 || pipe (fd + 4) != 0)
     {
       puts ("pipe failed");
       exit (1);