From patchwork Fri Mar  7 17:42:18 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joseph Myers <joseph@codesourcery.com>
X-Patchwork-Id: 11
Received: (qmail 27282 invoked by alias); 7 Mar 2014 17:42:30 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <libc-alpha.sourceware.org>
List-Unsubscribe: 
 <mailto:libc-alpha-unsubscribe-glibc=patchwork.siddhesh.in@sourceware.org>
List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-help@sourceware.org>,
	<http://sourceware.org/ml/#faqs>
Sender: libc-alpha-owner@sourceware.org
Delivered-To: mailing list libc-alpha@sourceware.org
Received: (qmail 27269 invoked by uid 89); 7 Mar 2014 17:42:29 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL,
	BAYES_00 autolearn=ham version=3.3.2
X-HELO: relay1.mentorg.com
Date: Fri, 7 Mar 2014 17:42:18 +0000
From: "Joseph S. Myers" <joseph@codesourcery.com>
To: Carlos O'Donell <carlos@redhat.com>, <libc-alpha@sourceware.org>
Subject: Do not terminate default test runs on test failure
Message-ID: <Pine.LNX.4.64.1403071741020.6302@digraph.polyomino.org.uk>
MIME-Version: 1.0
X-DH-Original-To: glibc@patchwork.siddhesh.in

This patch is an updated version of
<https://sourceware.org/ml/libc-alpha/2014-01/msg00198.html> now
proposed for inclusion in glibc.

Normal practice for software testsuites is that rather than
terminating immediately when a test fails, they continue running and
report at the end on how many tests passed or failed.

The principle behind the glibc testsuite stopping on failure was
probably that the expected state is no failures and so any failure
indicates a problem such as miscompilation.  In practice, while this
is fairly close to true for native testing on x86_64 and x86 (kernel
bugs and race conditions can still cause intermittent failures), it's
less likely to be the case on other platforms, and so people testing
glibc run the testsuite with "make -k" and then examine the logs to
determine whether the failures are what they expect to fail on that
platform, possibly with some automation for the comparison.

This patch switches the glibc testsuite to the normal convention of
not stopping on failure, unless you use stop-on-test-failure=y, in
which case it behaves essentially as it did before (and does not
generate overall test summaries on failure).  By default, the run now
continues and the summary tests.sum may contain tests that FAILed.  At
the end of the test run, any FAIL or ERROR lines from tests.sum are
printed, and then it exits with error status if there were any ERROR
lines (meaning a directory had no test results).  Build failures will
also cause the test run to stop; this has the justification that those
*do* indicate serious problems that should be promptly fixed and
aren't generally hard to fix (but apart from that, avoiding the build
stopping on those failures seems harder).
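
As a rough illustration of the end-of-run reporting described above,
the following shell sketch applies the same steps to a given summary
file.  It mirrors the commands in the Makefile hunk of this patch, but
the function name summarize_results is hypothetical and is not part
of the patch itself.

```shell
#!/bin/sh
# Sketch only: summarize_results is a hypothetical helper name, not
# something this patch adds.  It takes the path to a tests.sum-style
# file whose lines look like "PASS: name", "FAIL: name" or "ERROR: name".
summarize_results () {
  sum_file=$1

  # Print the problem lines first; "|| true" keeps a no-match grep
  # from being treated as a failure.
  grep '^ERROR:' "$sum_file" || true
  grep '^FAIL:' "$sum_file" || true

  echo "Summary of test results:"
  # Drop everything from the first colon on, then count each result kind.
  sed 's/:.*//' < "$sum_file" | sort | uniq -c

  # Only ERROR lines turn the overall status into a failure;
  # FAIL lines alone do not.
  if grep -q '^ERROR:' "$sum_file"; then
    return 1
  fi
  return 0
}
```

Calling summarize_results on a file that contains only PASS and FAIL
lines returns success, which matches the point that individual test
failures no longer make the run exit with error status.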

Note that unlike the previous patches in this series, this *does*
require people with automation around testing glibc to change their
processes - either to start using tests.sum / xtests.sum to track
failures and compare them with expectations (with or without also
using "make -k" and examining "make" logs to identify build failures),
or else to use stop-on-test-failure=y and ignore the new tests.sum /
xtests.sum mechanism.
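
For automation of the first kind, a minimal sketch of comparing
tests.sum against a list of expected failures might look like the
following.  The function name unexpected_failures and the idea of an
expected-failures file (one "FAIL: name" or "ERROR: name" line per
known problem) are assumptions for illustration only; nothing in this
patch provides or requires them.

```shell
#!/bin/sh
# Sketch only: unexpected_failures and the expected-failures file format
# are hypothetical, not part of the patch.
#
# usage: unexpected_failures SUM_FILE EXPECTED_FILE
# Prints FAIL/ERROR lines from SUM_FILE that do not appear verbatim in
# EXPECTED_FILE, and returns non-zero if there are any.
unexpected_failures () {
  sum_file=$1
  expected_file=$2

  # First grep selects the problem lines.  Second grep removes the
  # expected ones: -F fixed strings, -x whole-line match, -v invert,
  # -f read patterns from a file.  "|| true" covers the no-match case.
  unexpected=$(grep -E '^(FAIL|ERROR):' "$sum_file" \
               | grep -F -x -v -f "$expected_file" || true)

  if [ -n "$unexpected" ]; then
    printf '%s\n' "$unexpected"
    return 1
  fi
  return 0
}
```

A wrapper script could run "make check", then call
unexpected_failures on tests.sum in the build directory and use its
status as the overall result.  Automation of the second kind can
instead run "make check stop-on-test-failure=y" and keep relying on
the exit status of make.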

Tested x86_64.

Compared to the previous version, this includes changes to the manual
text as suggested by Brooks in
<https://sourceware.org/ml/libc-alpha/2014-02/msg00301.html>.

This concludes the present series of patches to generate PASS / FAIL
test results and fix associated testsuite issues.  Once it's in, I
intend to put the remaining TODO list items from the discussions of
these patches on the wiki.  I may return later to further testsuite
infrastructure improvements (such as supporting installed testing),
but also encourage other people interested in testsuite issues to work
on issues of interest to them (whether or not also listed on the wiki
todo list or in Bugzilla).

2014-03-07  Joseph Myers  <joseph@codesourcery.com>

	* scripts/evaluate-test.sh: Handle fourth argument to determine
	whether test run should stop on failure.
	* Makeconfig (stop-on-test-failure): New variable.
	(evaluate-test): Pass fourth argument to evaluate-test.sh based on
	$(stop-on-test-failure).
	* Makefile (tests): Give a summary of results from testing and
	exit with failure status if they include an ERROR.
	(xtests): Likewise.
	* manual/install.texi (Configuring and compiling): Mention
	stop-on-test-failure=y.
	* INSTALL: Regenerated.

diff --git a/INSTALL b/INSTALL
index bfa692d..bcb53b8 100644
--- a/INSTALL
+++ b/INSTALL
@@ -154,7 +154,7 @@ will be used, and CFLAGS sets optimization options for the compiler.
 
      If you only specify `--host', `configure' will prepare for a
      native compile but use what you specify instead of guessing what
-     your system is. This is most useful to change the CPU submodel.
+     your system is.  This is most useful to change the CPU submodel.
      For example, if `configure' guesses your machine as
      `i686-pc-linux-gnu' but you want to compile a library for 586es,
      give `--host=i586-pc-linux-gnu' or just `--host=i586-linux' and add
@@ -192,11 +192,16 @@ an appropriate numeric parameter to `make'.  You need a recent GNU
 
    To build and run test programs which exercise some of the library
 facilities, type `make check'.  If it does not complete successfully,
-do not use the built library, and report a bug after verifying that the
-problem is not already known.  *Note Reporting Bugs::, for instructions
-on reporting bugs.  Note that some of the tests assume they are not
-being run by `root'.  We recommend you compile and test the GNU C
-Library as an unprivileged user.
+or if it reports any unexpected failures or errors in its final summary
+of results, do not use the built library, and report a bug after
+verifying that the problem is not already known.  (You can specify
+`stop-on-test-failure=y' when running `make check' to make the test run
+stop and exit with an error status immediately when a failure occurs,
+rather than completing the test run and reporting all problems found.)
+*Note Reporting Bugs::, for instructions on reporting bugs.  Note that
+some of the tests assume they are not being run by `root'.  We
+recommend you compile and test the GNU C Library as an unprivileged
+user.
 
    Before reporting bugs make sure there is no problem with your system.
 The tests (and later installation) use some pre-existing files of the
diff --git a/Makeconfig b/Makeconfig
index 9078b29..0a2e12b 100644
--- a/Makeconfig
+++ b/Makeconfig
@@ -601,6 +601,12 @@ run-built-tests = yes
 endif
 endif
 
+# Whether to stop immediately when a test fails.  Nonempty means to
+# stop, empty means not to stop.
+ifndef stop-on-test-failure
+stop-on-test-failure =
+endif
+
 # How to run a program we just linked with our library.
 # The program binary is assumed to be $(word 2,$^).
 built-program-file = $(dir $(word 2,$^))$(notdir $(word 2,$^))
@@ -1091,6 +1097,7 @@ test-xfail-name = $(strip $(patsubst %.out, %, $(patsubst $(objpfx)%, %, $@)))
 # XPASS or XFAIL rather than PASS or FAIL.
 evaluate-test = $(..)scripts/evaluate-test.sh $(test-name) $$? \
 		  $(if $(test-xfail-$(test-xfail-name)),true,false) \
+		  $(if $(stop-on-test-failure),true,false) \
 		  > $(common-objpfx)$(test-name).test-result
 
 endif # Makeconfig not yet included
diff --git a/Makefile b/Makefile
index 8214dda..bdece42 100644
--- a/Makefile
+++ b/Makefile
@@ -324,10 +324,20 @@ tests: $(tests-special)
 	$(..)scripts/merge-test-results.sh -t $(objpfx) subdir-tests.sum \
 	  $(sort $(subdirs) .) \
 	  > $(objpfx)tests.sum
+	@grep '^ERROR:' $(objpfx)tests.sum || true
+	@grep '^FAIL:' $(objpfx)tests.sum || true
+	@echo "Summary of test results:"
+	@sed 's/:.*//' < $(objpfx)tests.sum | sort | uniq -c
+	@if grep -q '^ERROR:' $(objpfx)tests.sum; then exit 1; fi
 xtests:
 	$(..)scripts/merge-test-results.sh -t $(objpfx) subdir-xtests.sum \
 	  $(sort $(subdirs)) \
 	  > $(objpfx)xtests.sum
+	@grep '^ERROR:' $(objpfx)xtests.sum || true
+	@grep '^FAIL:' $(objpfx)xtests.sum || true
+	@echo "Summary of test results for extra tests:"
+	@sed 's/:.*//' < $(objpfx)xtests.sum | sort | uniq -c
+	@if grep -q '^ERROR:' $(objpfx)xtests.sum; then exit 1; fi
 
 # The realclean target is just like distclean for the parent, but we want
 # the subdirs to know the difference in case they care.
diff --git a/NEWS b/NEWS
index 0c5b39a..381b8a5 100644
--- a/NEWS
+++ b/NEWS
@@ -12,6 +12,14 @@ Version 2.20
   15347, 15804, 15894, 16447, 16532, 16545, 16574, 16600, 16609, 16610,
   16611, 16613, 16623, 16632.
 
+* Running the testsuite no longer terminates as soon as a test fails.
+  Instead, a file tests.sum (xtests.sum from "make xcheck") is generated,
+  with PASS or FAIL lines for individual tests.  A summary of the results is
+  printed, including a list of failing tests, and "make check" exits with
+  error status only if test results for a subdirectory are completely
+  missing, or if a test failed to compile.  "make check
+  stop-on-test-failure=y" may be used to keep the old behavior.
+
 * The am33 port, which had not worked for several years, has been removed
   from ports.
 
diff --git a/manual/install.texi b/manual/install.texi
index 8562bdc..fa44519 100644
--- a/manual/install.texi
+++ b/manual/install.texi
@@ -224,9 +224,14 @@ GNU @code{make} version, though.
 
 To build and run test programs which exercise some of the library
 facilities, type @code{make check}.  If it does not complete
-successfully, do not use the built library, and report a bug after
-verifying that the problem is not already known.  @xref{Reporting Bugs},
-for instructions on reporting bugs.  Note that some of the tests assume
+successfully, or if it reports any unexpected failures or errors in
+its final summary of results, do not use the built library, and report
+a bug after verifying that the problem is not already known.  (You can
+specify @samp{stop-on-test-failure=y} when running @code{make check}
+to make the test run stop and exit with an error status immediately
+when a failure occurs, rather than completing the test run and
+reporting all problems found.)  @xref{Reporting Bugs}, for
+instructions on reporting bugs.  Note that some of the tests assume
 they are not being run by @code{root}.  We recommend you compile and
 test @theglibc{} as an unprivileged user.
 
diff --git a/scripts/evaluate-test.sh b/scripts/evaluate-test.sh
index c8f5012..2a5c156 100755
--- a/scripts/evaluate-test.sh
+++ b/scripts/evaluate-test.sh
@@ -17,12 +17,13 @@
 # License along with the GNU C Library; if not, see
 # <http://www.gnu.org/licenses/>.
 
-# usage: evaluate-test.sh test_name rc xfail
+# usage: evaluate-test.sh test_name rc xfail stop_on_failure
 
 test_name=$1
 rc=$2
 orig_rc=$rc
 xfail=$3
+stop_on_failure=$4
 
 if [ $rc -eq 0 ]; then
   result="PASS"
@@ -37,4 +38,8 @@ fi
 
 echo "$result: $test_name"
 echo "original exit status $orig_rc"
-exit $rc
+if $stop_on_failure; then
+  exit $rc
+else
+  exit 0
+fi