From patchwork Fri Feb 26 17:42:07 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Frysinger <vapier@gentoo.org>
X-Patchwork-Id: 11119
X-Patchwork-Delegate: vapier@gentoo.org
Received: (qmail 2172 invoked by alias); 26 Feb 2016 17:42:13 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <libc-alpha.sourceware.org>
List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org>
List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-help@sourceware.org>,
	<http://sourceware.org/ml/#faqs>
Sender: libc-alpha-owner@sourceware.org
Delivered-To: mailing list libc-alpha@sourceware.org
Received: (qmail 2158 invoked by uid 89); 26 Feb 2016 17:42:11 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00,
	RP_MATCHES_RCVD,
	SPF_PASS autolearn=ham version=3.3.2 spammy=hangs, 476, poc,
	Setting
X-HELO: smtp.gentoo.org
Date: Fri, 26 Feb 2016 12:42:07 -0500
From: Mike Frysinger <vapier@gentoo.org>
To: Carlos O'Donell <carlos@redhat.com>
Cc: GNU C Library <libc-alpha@sourceware.org>
Subject: Re: Setting high test timeouts makes some tests take longer.
Message-ID: <20160226174207.GN19841@vapier.lan>
Mail-Followup-To: Carlos O'Donell <carlos@redhat.com>,
	GNU C Library <libc-alpha@sourceware.org>
References: <56D011D4.5020508@redhat.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <56D011D4.5020508@redhat.com>

On 26 Feb 2016 03:50, Carlos O'Donell wrote:
> I noticed nptl/tst-exit2 taking longer and longer the more
> I increased my timeouts.
> 
> If you look at the test it actually relies on the test to
> timeout for it to complete! Yikes!
> 
> It uses pthread_exit to exit the main thread, and then it
> proves the created thread is still running by timing out
> the test.
> 
> These tests that have "EXPECTED_SIGNAL SIGALRM" (not all
> of them, but several) need to be rewritten to use alarm()
> to trigger the signal at some reasonable time instead of
> the full duration of the test timeout.
> 
> So with your patch to increase timeouts, it makes
> these tests take longer :-(

looks like most of the tests that rely on a lower timeout actually set
the timeout to be low.  only a few were letting it float higher.  this
is good though as it also means my patch to delete TIMEOUT from all the
tests should be respun.

we should land a quick fix to get back to the status quo by setting
TIMEOUT in these tests to a low value.  i'll post a patch to do that.

> Do you think using alarm() is a good solution? Seems
> easiest.

if we explicitly set TIMEOUT low for these tests (and drop a comment in
to keep people from "cleaning" them up), this still allows for the factor
to scale it up pretty high, which is both good & bad.  good in that for
truly slow devices, we give it a chance to pass, but for ones that would
complete in time, we sit around waiting a very long time.

looking closer, i think any & all use of "EXPECTED_SIGNAL SIGALRM" is a
bad idea.  if an earlier part of the test setup hangs, then it will still
be detected as a pass because it timedout in the end (which the skeleton
expected).

so if we added a call to alarm(1) in these tests, we still wouldn't catch
bad/early flakes.  i think we want to grow the test harness a bit here so
it can differentiate between the two.  PoC below.
-mike

--- a/nptl/tst-exit2.c
+++ b/nptl/tst-exit2.c
@@ -1,13 +1,17 @@
+#include <atomic.h>
 #include <pthread.h>
 #include <signal.h>
 #include <stdio.h>
+#include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
 
+static int reached;
 
 static void *
 tf (void *arg)
 {
+  atomic_increment (&reached);
   while (1)
     sleep (100);
 
@@ -15,6 +19,11 @@ tf (void *arg)
   return NULL;
 }
 
+static int
+timeout_check (void)
+{
+  return (reached == 2) ? 0 : 1;
+}
 
 static int
 do_test (void)
@@ -28,6 +37,9 @@ do_test (void)
       return 1;
     }
 
+  atomic_increment (&reached);
+  alarm (3);
+
   /* Terminate only this thread.  */
   pthread_exit (NULL);
 
@@ -35,6 +47,6 @@ do_test (void)
   return 1;
 }
 
-#define EXPECTED_SIGNAL SIGALRM
 #define TEST_FUNCTION do_test ()
+#define TIMEOUT_CHECK timeout_check()
 #include "../test-skeleton.c"
--- a/test-skeleton.c
+++ b/test-skeleton.c
@@ -219,6 +219,20 @@ signal_handler (int sig __attribute__ ((unused)))
   exit (1);
 }
 
+#ifdef TIMEOUT_CHECK
+static void
+__attribute__ ((noreturn))
+child_timeout_handler (int sig __attribute__ ((unused)))
+{
+  if (TIMEOUT_CHECK == 0)
+    exit (0);
+
+  /* If the check returns, use a generic diagnosis.  */
+  puts ("Timeout check failed");
+  exit (1);
+}
+#endif
+
 /* Avoid all the buffer overflow messages on stderr.  */
 static void
 __attribute__ ((unused))
@@ -451,6 +465,10 @@ main (int argc, char *argv[])
       if (setpgid (0, 0) != 0)
 	printf ("Failed to set the process group ID: %m\n");
 
+#ifdef TIMEOUT_CHECK
+      signal (SIGALRM, child_timeout_handler);
+#endif
+
       /* Execute the test function and exit with the return value.   */
       exit (TEST_FUNCTION);
     }