diff mbox

gnu: Add dlib.

Message ID 87h9ajzc1z.fsf@ike.i-did-not-set--mail-host-address--so-tickle-me
State New
Headers show

Commit Message

Marius Bakke Aug. 17, 2016, 2:48 p.m. UTC
Ben Woodcroft <b.woodcroft@uq.edu.au> writes:

> On 17/08/16 09:45, Leo Famulari wrote:
>> On Wed, Aug 17, 2016 at 09:31:11AM +1000, Ben Woodcroft wrote:
>>>
>>> On 17/08/16 06:47, Leo Famulari wrote:
>>>> On Tue, Aug 16, 2016 at 11:45:16AM +0100, Marius Bakke wrote:
>>>>> I initially made this package on a foreign distro without "lapack" in
>>>>> inputs and have verified that dropping LAPACK makes the tests pass.
>>>>>
>>>>> I also found some other optional dependencies after digging around the
>>>>> source, as well as a recommendation to disable/enable asserts:
>>>>>
>>>>> http://dlib.net/dlib/config.h.html
>>>>>
>>>>> Ben, Leo: Can you try the following patch and see if that works for you?
>>>> Yes, this patch builds for me.
>>> Me too, although it appeared non-deterministic. I'm afraid I haven't time to
>>> see if this patch is suitable to push just now. Leo?
>> How did it appear non-deterministic to you?
> Just based on guix build --check:
>
> guix build: error: build failed: derivation 
> `/gnu/store/sxybcxw64q1ajzq6dysal75ffgq6238i-dlib-19.1.drv' may not be 
> deterministic: output 
> `/gnu/store/il57dcii4pzii11zlixjjxxxw699bg5x-dlib-19.1' differs
>
> I'm actually not sure, why does it say "may not be deterministic"? If it 
> builds twice and the second version is different, doesn't that mean it 
> is definitely not deterministic by counter-example, unless there has 
> been some leakage into the build container?
>
> I also tried building it with #parallel-build? #f and #parallel-tests? 
> #f. It worked in the first round but failed the second.
>
> Note also that the 'disable-asserts' phase should end with '#t'.

I've attached a patch with a #t in the disable-asserts phase, and also
deleting the 6MB static library.

Since `guix build --rounds=2` passes, is there any reason to delay this
patch? I've built this on GuixSD and a foreign distro and naively
verified that they are the same (the .so and all headers have the same
checksum; some of the recorded cmake input paths are different though).

Thanks,
Marius

Comments

Leo Famulari Aug. 18, 2016, 8:23 p.m. UTC | #1
On Wed, Aug 17, 2016 at 03:48:24PM +0100, Marius Bakke wrote:
> I've attached a patch with a #t in the disable-asserts phase, and also
> deleting the 6MB static library.
> 
> Since `guix build --rounds=2` passes, is there any reason to delay this
> patch? I've built this on GuixSD and a foreign distro and naively
> verified that they are the same (the .so and all headers have the same
> checksum; some of the recorded cmake input paths are different though).

I pushed the patch as 5f0ff6a9e. Hopefully dlib is still useful without
lapack. We should really figure out what the issue is and fix it :)
Marius Bakke Aug. 19, 2016, 10:52 a.m. UTC | #2
Leo Famulari <leo@famulari.name> writes:

> I pushed the patch as 5f0ff6a9e. Hopefully dlib is still useful without
> lapack. We should really figure out what the issue is and fix it :)

I noticed this fails to build on Hydra. What's worse is that the i686,
x86_64 and armhf targets fails at completely different things. armhf and
i686 exits cleanly after failing 2 and 5 tests respectively, while the
x86_64 target seems to get the segfault we saw with lapack in inputs.

What should we do? I'd prefer to keep the package so it can easily be
tested on various architectures, but can understand if it is reverted.
Perhaps we can disable substitutes or tests, to stop bothering Hydra?

I'll try reproducing it this weekend on various qemu configurations.
Leo Famulari Aug. 21, 2016, 8:17 p.m. UTC | #3
On Fri, Aug 19, 2016 at 11:52:37AM +0100, Marius Bakke wrote:
> Leo Famulari <leo@famulari.name> writes:
> 
> > I pushed the patch as 5f0ff6a9e. Hopefully dlib is still useful without
> > lapack. We should really figure out what the issue is and fix it :)
> 
> I noticed this fails to build on Hydra. What's worse is that the i686,
> x86_64 and armhf targets fails at completely different things. armhf and
> i686 exits cleanly after failing 2 and 5 tests respectively, while the
> x86_64 target seems to get the segfault we saw with lapack in inputs.
> 
> What should we do? I'd prefer to keep the package so it can easily be
> tested on various architectures, but can understand if it is reverted.
> Perhaps we can disable substitutes or tests, to stop bothering Hydra?
> 
> I'll try reproducing it this weekend on various qemu configurations.

Let us know about the results of your tests. Then we can decide what to
do.
Ben Woodcroft Aug. 22, 2016, 2:30 a.m. UTC | #4
On 21/08/16 16:17, Leo Famulari wrote:
> On Fri, Aug 19, 2016 at 11:52:37AM +0100, Marius Bakke wrote:
>> Leo Famulari <leo@famulari.name> writes:
>>
>>> I pushed the patch as 5f0ff6a9e. Hopefully dlib is still useful without
>>> lapack. We should really figure out what the issue is and fix it :)
>> I noticed this fails to build on Hydra. What's worse is that the i686,
>> x86_64 and armhf targets fails at completely different things. armhf and
>> i686 exits cleanly after failing 2 and 5 tests respectively, while the
>> x86_64 target seems to get the segfault we saw with lapack in inputs.
>>
>> What should we do? I'd prefer to keep the package so it can easily be
>> tested on various architectures, but can understand if it is reverted.
>> Perhaps we can disable substitutes or tests, to stop bothering Hydra?
>>
>> I'll try reproducing it this weekend on various qemu configurations.
> Let us know about the results of your tests. Then we can decide what to
> do.
Can this be fixed simply by using #parallel-build #f ? Doing so makes it 
reproducible for me so arguably we should be doing that anyway, even if 
it takes longer to build.

ben
Marius Bakke Aug. 22, 2016, 12:01 p.m. UTC | #5
Ben Woodcroft <b.woodcroft@uq.edu.au> writes:

> On 21/08/16 16:17, Leo Famulari wrote:
>> On Fri, Aug 19, 2016 at 11:52:37AM +0100, Marius Bakke wrote:
>>> Leo Famulari <leo@famulari.name> writes:
>>>
>>>> I pushed the patch as 5f0ff6a9e. Hopefully dlib is still useful without
>>>> lapack. We should really figure out what the issue is and fix it :)
>>> I noticed this fails to build on Hydra. What's worse is that the i686,
>>> x86_64 and armhf targets fails at completely different things. armhf and
>>> i686 exits cleanly after failing 2 and 5 tests respectively, while the
>>> x86_64 target seems to get the segfault we saw with lapack in inputs.
>>>
>>> What should we do? I'd prefer to keep the package so it can easily be
>>> tested on various architectures, but can understand if it is reverted.
>>> Perhaps we can disable substitutes or tests, to stop bothering Hydra?
>>>
>>> I'll try reproducing it this weekend on various qemu configurations.
>> Let us know about the results of your tests. Then we can decide what to
>> do.
> Can this be fixed simply by using #parallel-build #f ? Doing so makes it 
> reproducible for me so arguably we should be doing that anyway, even if 
> it takes longer to build.

Any chance you can diff the outputs? If there is a race condition or
similar it may indeed fix some problems, but it would be nice to notify
upstream about it.

I opened an issue on their bug tracker, and the suggestion was to
disable BLAS: https://github.com/davisking/dlib/issues/197

Without OpenBLAS dlib will use an internal BLAS implementation. I'm
fairly certain that will at least fix the crash on x86_64, which was
a segfault in libopenblasp-r0.2.15.so when we had LAPACK in inputs, but
seems to consistently trigger on Hydra regardless.

I got busy this weekend, but will try to reproduce the i686 errors this
week; and also check if the newer openblas in core-updates solves the
x86_64 segfault.

Stay tuned...

-marius
Marius Bakke Aug. 23, 2016, 6:33 p.m. UTC | #6
> Without OpenBLAS dlib will use an internal BLAS implementation. I'm
> fairly certain that will at least fix the crash on x86_64, which was
> a segfault in libopenblasp-r0.2.15.so when we had LAPACK in inputs, but
> seems to consistently trigger on Hydra regardless.
>
> I got busy this weekend, but will try to reproduce the i686 errors this
> week; and also check if the newer openblas in core-updates solves the
> x86_64 segfault.
>
> Stay tuned...

Short update: I can successfully reproduce the i686 test failures simply
by `guix build --system=i686-linux` on x86_64. Removing OpenBLAS from
inputs had no effect. Now to figure out what's going on..

Marius
diff mbox

Patch

From 77f74972d095aeb08367e00c9d683137bd7a35f3 Mon Sep 17 00:00:00 2001
From: Marius Bakke <mbakke@fastmail.com>
Date: Sat, 13 Aug 2016 11:26:10 +0100
Subject: [PATCH] gnu: Add dlib.

* gnu/packages/machine-learning.scm (dlib): New variable.
---
 gnu/packages/machine-learning.scm | 67 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 66 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index f96672c..4332045 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -1,6 +1,7 @@ 
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2015, 2016 Ricardo Wurmus <rekado@elephly.net>
 ;;; Copyright © 2016 Efraim Flashner <efraim@flashner.co.il>
+;;; Copyright © 2016 Marius Bakke <mbakke@fastmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -27,18 +28,21 @@ 
   #:use-module (guix build-system gnu)
   #:use-module (guix build-system r)
   #:use-module (gnu packages)
+  #:use-module (gnu packages algebra)
   #:use-module (gnu packages autotools)
   #:use-module (gnu packages boost)
   #:use-module (gnu packages compression)
   #:use-module (gnu packages dejagnu)
   #:use-module (gnu packages gcc)
+  #:use-module (gnu packages image)
   #:use-module (gnu packages maths)
   #:use-module (gnu packages perl)
   #:use-module (gnu packages pkg-config)
   #:use-module (gnu packages python)
   #:use-module (gnu packages statistics)
   #:use-module (gnu packages swig)
-  #:use-module (gnu packages xml))
+  #:use-module (gnu packages xml)
+  #:use-module (gnu packages xorg))
 
 (define-public libsvm
   (package
@@ -467,3 +471,64 @@  geometric models.")
      "This package provides functions for feed-forward neural networks with a
 single hidden layer, and for multinomial log-linear models.")
     (license (list license:gpl2+ license:gpl3+))))
+
+(define-public dlib
+  (package
+    (name "dlib")
+    (version "19.1")
+    (source (origin
+              (method url-fetch)
+              (uri (string-append
+                    "http://dlib.net/files/dlib-" version ".tar.bz2"))
+              (sha256
+               (base32
+                "0p2pvcdalc6jhb6r99ybvjd9x74sclr0ngswdg9j2xl5pj7knbr4"))
+              (modules '((guix build utils)))
+              (snippet
+               '(begin
+                  ;; Delete ~13MB of bundled dependencies.
+                  (delete-file-recursively "dlib/external")
+                  (delete-file-recursively "docs/dlib/external")))))
+    (build-system cmake-build-system)
+    (arguments
+     `(#:phases
+       (modify-phases %standard-phases
+         (add-after 'unpack 'disable-asserts
+           (lambda _
+             ;; config.h recommends explicitly enabling or disabling asserts
+             ;; when building as a shared library. By default neither is set.
+             (substitute* "dlib/config.h"
+               (("^//#define DLIB_DISABLE_ASSERTS") "#define DLIB_DISABLE_ASSERTS"))
+             #t))
+         (replace 'check
+           (lambda _
+             ;; No test target, so we build and run the unit tests here.
+             (let ((test-dir (string-append "../dlib-" ,version "/dlib/test/build")))
+               (mkdir-p test-dir)
+               (with-directory-excursion test-dir
+                 (and (zero? (system* "cmake" ".."))
+                      (zero? (system* "cmake" "--build" "." "--config" "Release"))
+                      (zero? (system* "./dtest" "--runall")))))))
+         (add-after 'install 'delete-static-library
+           (lambda* (#:key outputs #:allow-other-keys)
+             (delete-file (string-append (assoc-ref outputs "out") "/lib/libdlib.a")))))))
+    (native-inputs
+     `(("pkg-config" ,pkg-config)))
+    (inputs
+     `(("fftw" ,fftw)
+       ("giflib" ,giflib)
+       ;("lapack" ,lapack) XXX lapack here causes test failures in some setups.
+       ("libjpeg" ,libjpeg)
+       ("libpng" ,libpng)
+       ("libx11" ,libx11)
+       ("openblas" ,openblas)
+       ("zlib" ,zlib)))
+    (synopsis
+     "Toolkit for making machine learning and data analysis applications in C++")
+    (description
+     "Dlib is a modern C++ toolkit containing machine learning algorithms and
+tools.  It is used in both industry and academia in a wide range of domains
+including robotics, embedded devices, mobile phones, and large high performance
+computing environments.")
+    (home-page "http://dlib.net")
+    (license license:boost1.0)))
-- 
2.9.2