[buildbot] keep build directory after failure

Message ID ad5d790e-e9a1-6c1f-4440-ed1c68625d0a@arm.com
State New, archived

Commit Message

Szabolcs Nagy Nov. 2, 2018, 11:33 a.m. UTC
  aarch64 buildbot failed with

FAIL: malloc/tst-malloc-tcache-leak

4 times so far even though there were no related commits,
the failure happens about 1 out of 15 builds and only if
the build was done in an unclean directory, but i could
not manually reproduce it yet.

the buildbot first builds glibc in the previous build
directory, if that fails then that immediately gets
deleted and glibc is rebuilt starting from an empty
build directory, so it's impossible to tell what went
wrong in case of such non-reproducible failures.

i suggest keeping the last failed build directory for
analysis.

the build directory rename is not optimal since that
breaks absolute paths pointing to it (e.g. paths in
testrun.sh), but it allows looking at test outputs
and failed binaries.
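
A minimal sketch of the rotation described above, for illustration only. The function name keep_failed_build and its two arguments are hypothetical and are not part of the buildbot scripts. It keeps exactly one previous build directory around.

```shell
#!/bin/sh
# Hypothetical helper (not from the buildbot scripts): instead of deleting
# the build directory, move it aside so a failed build can be inspected.
keep_failed_build() {
  build_dir=$1
  old_build_dir=$2
  if [ -d "$build_dir" ]; then
    # Drop the previously kept directory first; otherwise mv would move
    # the build directory *into* the existing old directory.
    rm -rf "$old_build_dir"
    mv "$build_dir" "$old_build_dir"
  fi
}
```

It would be called as keep_failed_build "$build_dir" "$old_build_dir". Files inside the kept directory that embed the original absolute path still point at the old location after the rename, so they may need adjusting before they can be rerun.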
  

Comments

Szabolcs Nagy Nov. 9, 2018, 11:16 a.m. UTC | #1
On 02/11/18 11:33, Szabolcs Nagy wrote:
> aarch64 buildbot failed with
>
> FAIL: malloc/tst-malloc-tcache-leak
>
> 4 times so far even though there were no related commits,
> the failure happens about 1 out of 15 builds and only if
> the build was done in an unclean directory, but i could
> not manually reproduce it yet.


this is most likely caused by a cgroup task limit.

is there a known safe task limit setting for make check -j1?

the current setting is
$ cat /sys/fs/cgroup/pids/user.slice/user-1002.slice/pids.max
12288

it also caused other failures

env GCONV_PATH=.... /home/szabolcs/tx1-ubuntu-aarch64/glibc-aarch64-linux/build/build/nptl/tst-eintr1  > /home/szabolcs/tx1-ubuntu-aarch64/glibc-aarch64-linux/build/build/nptl/tst-eintr1.out; \
../scripts/evaluate-test.sh nptl/tst-eintr1 $? false false > /home/szabolcs/tx1-ubuntu-aarch64/glibc-aarch64-linux/build/build/nptl/tst-eintr1.test-result
/bin/sh: 2: Cannot fork
make[2]: *** [/home/szabolcs/tx1-ubuntu-aarch64/glibc-aarch64-linux/build/build/nptl/tst-eintr1.out] Error 2
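
One way to see how close a cgroup is to its task limit is to compare pids.current with pids.max in the same cgroup directory. The following is only a sketch under assumptions: pids_headroom is a hypothetical helper, and it assumes the directory passed in belongs to the pids controller and contains both files.

```shell
#!/bin/sh
# Hypothetical helper: print how many more tasks a pids cgroup can hold.
# $1 is a cgroup directory, e.g. the user slice directory shown above.
pids_headroom() {
  cg_dir=$1
  max=$(cat "$cg_dir/pids.max") || return 1
  cur=$(cat "$cg_dir/pids.current") || return 1
  if [ "$max" = "max" ]; then
    # The literal string "max" means no limit is configured.
    echo unlimited
  else
    echo $((max - cur))
  fi
}
```

Sampling this during make check would show whether the headroom approaches zero around the time "Cannot fork" appears.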


> the buildbot first builds glibc in the previous build
> directory, if that fails then that immediately gets
> deleted and glibc is rebuilt starting from an empty
> build directory, so it's impossible to tell what went
> wrong in case of such non-reproducible failures.
>
> i suggest keeping the last failed build directory for
> analysis.
>
> the build directory rename is not optimal since that
> breaks absolute paths pointing to it (e.g. paths in
> testrun.sh), but it allows looking at test outputs
> and failed binaries.
  

Patch

diff --git a/scripts/slave/glibc-native.sh b/scripts/slave/glibc-native.sh
index 21c9d2d..8b22687 100755
--- a/scripts/slave/glibc-native.sh
+++ b/scripts/slave/glibc-native.sh
@@ -14,6 +14,7 @@  nproc=$(getconf _NPROCESSORS_ONLN)
 root_dir=$(pwd)
 src_dir="${root_dir}/glibc"
 build_dir="${root_dir}/build"
+old_build_dir="${root_dir}/old_build"
 
 git_branch=${BUILDBOT_BRANCHNAME:-master}
 git_revision=${BUILDBOT_REVISION:-origin/${git_branch}}
@@ -81,7 +82,10 @@  do_check() {
 do_clobber() {
   clobber=true
   cd "$root_dir"
-  rm -rf "$build_dir"
+  if [ -d "$build_dir" ]; then
+    rm -rf "$old_build_dir"
+    mv "$build_dir" "$old_build_dir"
+  fi
 }