diff mbox

gnu: Add minced.

Message ID 87zioczycq.fsf@ike.i-did-not-set--mail-host-address--so-tickle-me
State New
Headers show

Commit Message

Marius Bakke Aug. 16, 2016, 12:34 p.m. UTC
Ben Woodcroft <b.woodcroft@uq.edu.au> writes:

> Hi Marius,
>
> Excellent to see others interested in packaging microbial bioinformatics 
> tools.

You may want to look here before packaging other microbio tools:

https://github.com/MRC-CLIMB/guix-climb

I'm currently working on upstreaming most of these :)

> I tried this in a container and it seems that it calls out to a few 
> programs preventing it from working:
>
> [env]# minced -h
> /gnu/store/8asw2i54x53rrwdr8qw4j2rpbkc9fqzz-profile/bin/minced: line 7: 
> dirname: command not found
> /gnu/store/8asw2i54x53rrwdr8qw4j2rpbkc9fqzz-profile/bin/minced: line 9: 
> dirname: command not found
> /gnu/store/8asw2i54x53rrwdr8qw4j2rpbkc9fqzz-profile/bin/minced: line 11: 
> basename: command not found
>
> So I think that (in order of preference), the source files themselves 
> should be patched with the absolute paths to these tools, the binary 
> should be wrapped, or coreutils should be propagated. For the first two 
> options, coreutils should be an input.

Good catch. Since the "minced" executable is just a wrapper script, I
opted to build my own wrapper to avoid the coreutils dependency.

> Now some small comments on the patch itself.
>
>> +    (arguments
>> +     `(#:test-target "test"
>> +       #:phases
>> +       (modify-phases %standard-phases
>> +         (delete 'configure)
>> +         (add-before 'check 'fix-test
>> +           (lambda _
>> +             ;; Fix test for latest version.
>> +             (substitute* "t/Aquifex_aeolicus_VF5.expected"
>> +               (("minced:0.1.6") "minced:0.2.0"))))
> It might be more future-proof to use '(string-append "minced:" 
> ,version)' instead of hard-coding 0.2.0.

I don't think this will be a problem in future releases. And it can also
cause problems, in case a user sets version to e.g. a commit hash.

> Also, this phase (and the next two) should end in #t since the return 
> value of substitute* is undefined.
>
>
>> +         (add-before 'install 'qualify-java-path
>> +           (lambda* (#:key inputs #:allow-other-keys)
>> +             (substitute* "minced"
>> +               ;; Set full path to java binary in wrapper script.
>> +               (("^java") (string-append (assoc-ref inputs "jre")
>> +                                         "/bin/java")))))
>> +         (replace 'install
>> +           ;; No install target.
>> +           (lambda* (#:key outputs #:allow-other-keys)
>> +             (let* ((out (assoc-ref outputs "out"))
>> +                    (bin (string-append out "/bin")))
>> +               (for-each (lambda (file)
>> +                           (install-file file bin))
>> +                         (list "minced" "minced.jar"))))))))
>> +    (native-inputs
>> +     `(("jdk", icedtea "jdk")))
>> +    (inputs
>> +     `(("jre", icedtea)))
> The commas should go after the space.

Oops, I should stop submitting patches late in the night!

>> +    (home-page"https://github.com/ctSkennerton/minced")
>> +    (synopsis "Mining CRISPRs in Environmental Datasets")
>> +    (description
>> +     "MinCED is a program to find Clustered Regularly Interspaced Short
>> +Palindromic Repeats (CRISPRs) in full genomes or environmental datasets such
>> +as metagenomes, in which sequence size can be anywhere from 100 to 800 bp.")
> That description which you took from the README is a little dated at the 
> end. How about this?
>
> "MinCED is a program to find Clustered Regularly Interspaced Short
> Palindromic Repeats (CRISPRs) in both full genomes and shorter metagenomic sequences."
>
> The rest LGTM. Thanks. Can you send an updated patch please?

Please find updated patch below. Thanks for the feedback!

Comments

Ben Woodcroft Aug. 17, 2016, 11:01 a.m. UTC | #1
On 16/08/16 22:34, Marius Bakke wrote:
> Ben Woodcroft <b.woodcroft@uq.edu.au> writes:
>
>> Hi Marius,
>>
>> Excellent to see others interested in packaging microbial bioinformatics
>> tools.
> You may want to look here before packaging other microbio tools:
>
> https://github.com/MRC-CLIMB/guix-climb
>
> I'm currently working on upstreaming most of these :)
Cool, and they generally look in quite good shape too. I'll be away next 
week, but I'd be happy to review them after that.

>> I tried this in a container and it seems that it calls out to a few
>> programs preventing it from working:
>>
>> [env]# minced -h
>> /gnu/store/8asw2i54x53rrwdr8qw4j2rpbkc9fqzz-profile/bin/minced: line 7:
>> dirname: command not found
>> /gnu/store/8asw2i54x53rrwdr8qw4j2rpbkc9fqzz-profile/bin/minced: line 9:
>> dirname: command not found
>> /gnu/store/8asw2i54x53rrwdr8qw4j2rpbkc9fqzz-profile/bin/minced: line 11:
>> basename: command not found
>>
>> So I think that (in order of preference), the source files themselves
>> should be patched with the absolute paths to these tools, the binary
>> should be wrapped, or coreutils should be propagated. For the first two
>> options, coreutils should be an input.
> Good catch. Since the "minced" executable is just a wrapper script, I
> opted to build my own wrapper to avoid the coreutils dependency.
Good idea.

>
>> Now some small comments on the patch itself.
>>
>>> +    (arguments
>>> +     `(#:test-target "test"
>>> +       #:phases
>>> +       (modify-phases %standard-phases
>>> +         (delete 'configure)
>>> +         (add-before 'check 'fix-test
>>> +           (lambda _
>>> +             ;; Fix test for latest version.
>>> +             (substitute* "t/Aquifex_aeolicus_VF5.expected"
>>> +               (("minced:0.1.6") "minced:0.2.0"))))
>> It might be more future-proof to use '(string-append "minced:"
>> ,version)' instead of hard-coding 0.2.0.
> I don't think this will be a problem in future releases. And it can also
> cause problems, in case a user sets version to e.g. a commit hash.
OK.
>> Also, this phase (and the next two) should end in #t since the return
>> value of substitute* is undefined.
>>
>>
>>> +         (add-before 'install 'qualify-java-path
>>> +           (lambda* (#:key inputs #:allow-other-keys)
>>> +             (substitute* "minced"
>>> +               ;; Set full path to java binary in wrapper script.
>>> +               (("^java") (string-append (assoc-ref inputs "jre")
>>> +                                         "/bin/java")))))
>>> +         (replace 'install
>>> +           ;; No install target.
>>> +           (lambda* (#:key outputs #:allow-other-keys)
>>> +             (let* ((out (assoc-ref outputs "out"))
>>> +                    (bin (string-append out "/bin")))
>>> +               (for-each (lambda (file)
>>> +                           (install-file file bin))
>>> +                         (list "minced" "minced.jar"))))))))
>>> +    (native-inputs
>>> +     `(("jdk", icedtea "jdk")))
>>> +    (inputs
>>> +     `(("jre", icedtea)))
>> The commas should go after the space.
> Oops, I should stop submitting patches late in the night!
>
>>> +    (home-page"https://github.com/ctSkennerton/minced")
>>> +    (synopsis "Mining CRISPRs in Environmental Datasets")
>>> +    (description
>>> +     "MinCED is a program to find Clustered Regularly Interspaced Short
>>> +Palindromic Repeats (CRISPRs) in full genomes or environmental datasets such
>>> +as metagenomes, in which sequence size can be anywhere from 100 to 800 bp.")
>> That description which you took from the README is a little dated at the
>> end. How about this?
>>
>> "MinCED is a program to find Clustered Regularly Interspaced Short
>> Palindromic Repeats (CRISPRs) in both full genomes and shorter metagenomic sequences."
>>
>> The rest LGTM. Thanks. Can you send an updated patch please?
> Please find updated patch below. Thanks for the feedback!

Pushed as '318c0ae' after adding a space in the description and only 
using only using the "out" of icedtea in the inputs.

Thanks.
ben
diff mbox

Patch

From 53602b0908956d122146baafdc1a541596df151d Mon Sep 17 00:00:00 2001
From: Marius Bakke <mbakke@fastmail.com>
Date: Mon, 15 Aug 2016 16:06:37 +0100
Subject: [PATCH] gnu: Add minced.

* gnu/packages/bioinformatics.scm (minced): New variable.
---
 gnu/packages/bioinformatics.scm | 55 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index dd69e39..4869d56 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -5,6 +5,7 @@ 
 ;;; Copyright © 2015 Andreas Enge <andreas@enge.fr>
 ;;; Copyright © 2016 Roel Janssen <roel@gnu.org>
 ;;; Copyright © 2016 Efraim Flashner <efraim@flashner.co.il>
+;;; Copyright © 2016 Marius Bakke <mbakke@fastmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -3118,6 +3119,60 @@  probabilistic distances of genome abundance and tetranucleotide frequency.")
    (license (license:non-copyleft "file://license.txt"
                                   "See license.txt in the distribution."))))
 
+(define-public minced
+  (package
+    (name "minced")
+    (version "0.2.0")
+    (source (origin
+              (method url-fetch)
+              (uri (string-append
+                    "https://github.com/ctSkennerton/minced/archive/"
+                    version ".tar.gz"))
+              (file-name (string-append name "-" version ".tar.gz"))
+              (sha256
+               (base32
+                "0wxmlsapxfpxfd3ps9636h7i2xy6la8i42mwh0j2lsky63h63jp1"))))
+    (build-system gnu-build-system)
+    (arguments
+     `(#:test-target "test"
+       #:phases
+       (modify-phases %standard-phases
+         (delete 'configure)
+         (add-before 'check 'fix-test
+           (lambda _
+             ;; Fix test for latest version.
+             (substitute* "t/Aquifex_aeolicus_VF5.expected"
+               (("minced:0.1.6") "minced:0.2.0"))
+             #t))
+         (replace 'install ; No install target.
+           (lambda* (#:key inputs outputs #:allow-other-keys)
+             (let* ((out (assoc-ref outputs "out"))
+                    (bin (string-append out "/bin"))
+                    (wrapper (string-append bin "minced")))
+               ;; Minced comes with a wrapper script that tries to figure out where
+               ;; it is located before running the JAR. Since these paths are known
+               ;; to us, we build our own wrapper to avoid coreutils dependency.
+               (install-file "minced.jar" bin)
+               (with-output-to-file wrapper
+                 (lambda _
+                   (display
+                    (string-append
+                     "#!" (which bash) "\n\n"
+                     (which java) " -jar " bin "/minced.jar \"$@\"\n"))))
+               (chmod "minced" #o555)))))))
+    (native-inputs
+     `(("jdk" ,icedtea "jdk")))
+    (inputs
+     `(("jre" ,icedtea)))
+    (home-page "https://github.com/ctSkennerton/minced")
+    (synopsis "Mining CRISPRs in Environmental Datasets")
+    (description
+     "MinCED is a program to find Clustered Regularly Interspaced Short
+Palindromic Repeats (CRISPRs) in DNA sequences. It can be used for unassembled
+metagenomic reads, but is mainly designed for full genomes and assembled
+metagenomic sequence.")
+    (license license:gpl3+)))
+
 (define-public miso
   (package
     (name "miso")
-- 
2.9.2