Patchwork GSoC NPM

Submitter Jan Nieuwenhuizen
Date Sept. 4, 2016, 2:11 p.m.
Message ID <877farzrdl.fsf@gnu.org>
Permalink /patch/15282/
State New

Comments

Jan Nieuwenhuizen - Sept. 4, 2016, 2:11 p.m.
Jelle Licht writes:

Hi Jelle!

Here's a new patch replacing the previous one, summary

   * add --binary option to importer, sets (arguments (#:binary? #t))
   * use `npm build' for non-binary packages as fallback (WAS: skip)
   * use `npm install -g' for non-binary packages; fixes e.g. lodash
   * fallback for packages dist tarball/binary-only: e.g.: http
   * handle packages without any tags in git, e.g.: cjson
   * handle packages with version mismatch, e.g.: xmldom

With these small additions to your work I'm able to automagically fetch
the full list of 318 (binary) packages that I need for my client's
project (which has 40 toplevel dependencies such as bunyan, express,
jison, jquery, nodemailer, pg, q, socket.io, underscore, xmldom).

> The short of it is that the dist tarball does not always contain the
> actual source code.  Examples of this include generated code, minified
> code etc.

Yes, I see that now.  David remarked that the dist tarball should be
considered to be a binary package.

> The devDependencies are, in these cases, the things we need to be able
> to actually build the package. Examples of this include gulp, grunt,
> and several testing frameworks.

Yes...and here is where it starts getting interesting.

I made several attempts to build packages from source, but except for
packages that imho should not be allowed to exist such as `array-equal',
that seemed next to impossible.  Maybe I was unlucky, or maybe I am
missing something?

As a first attempt, I tried to recursively import `q', a fairly basic
package from my possibly ignorant perspective: can you write anything
non-trivial in node without using q?.  When that resulted in over 6004
dependencies (using build systems grunt, gulp and node-gyp, listing 582
errors), I was pretty sure there was a problem with your importer.
Using the --binary option, q has no dependencies.  None.  Single
package.  Hmm.

The `babel' package, a prerequisite for the `gulp' build system
which is needed to build the `har-validator' library needed to run the
`node-gyp' build system, has a list of over 6000 dependencies.

Build systems building build systems...

> For simple packages, the difference between a npm tarball and a GH
> tarball/repo are non-existent. I made the choice to skip the npm
> tarball because I'd rather err on the side of caution, and not let
> people download and run these non-source packages by accident ;-).

Yes, that makes sense.  I found that the `http' package has this binary
form only so I added it as a fallback for now.

> I will have more time to see this through next week.

That's great, thanks.

Greetings,
Jan

See https://gitlab.com/janneke/guix.git -- branch npm-binary
David Thompson - Sept. 6, 2016, 3:48 p.m.
On Sun, Sep 4, 2016 at 10:11 AM, Jan Nieuwenhuizen <janneke@gnu.org> wrote:

>    * add --binary option to importer, sets (arguments (#:binary? #t))

This violates a core principle of Guix: reproducible builds.  I don't
support patches that encourage using pre-built binaries.

> As a first attempt, I tried to recursively import `q', a fairly basic
> package from my possibly ignorant perspective: can you write anything
> non-trivial in node without using q?.  When that resulted in over 6004
> dependencies (using build systems grunt, gulp and node-gyp, listing 582
> errors), I was pretty sure there was a problem with your importer.
> Using the --binary option, q has no dependencies.  None.  Single
> package.  Hmm.

That's because the thing on npm has already been built for you.  q
*does* have dependencies if you want to build it from source, which is
what we should all be striving for.

- Dave
Pjotr Prins - Sept. 6, 2016, 4:50 p.m.
On Tue, Sep 06, 2016 at 11:48:04AM -0400, Thompson, David wrote:
> On Sun, Sep 4, 2016 at 10:11 AM, Jan Nieuwenhuizen <janneke@gnu.org> wrote:
> 
> >    * add --binary option to importer, sets (arguments (#:binary? #t))
> 
> This violates a core principle of Guix: reproducible builds.  I don't
> support patches that encourage using pre-built binaries.

In principle I agree. We want to be able to read the code.

Still, I think Guix would benefit from a somewhat more relaxed stance
in this. Especially where it comes to cross-platform binary
deployments we could accelerate things now and then - and maybe
work on source deployment later. I am thinking of Erlang Beam and the
JVM mostly. If binaries are *trusted* we could do that. Point of note,
we distribute *trusted* binaries already. Who builds those?

I am becoming increasingly of the opinion that Guix can be a 'small'
core of rock solid software and we should provide mechanisms to wave
out in other maybe less controlled directions. Whether it is in source
or in binary form.

Pj.
Ludovic Courtès - Sept. 7, 2016, 12:25 p.m.
Howdy,

Pjotr Prins <pjotr.public12@thebird.nl> skribis:

> On Tue, Sep 06, 2016 at 11:48:04AM -0400, Thompson, David wrote:
>> On Sun, Sep 4, 2016 at 10:11 AM, Jan Nieuwenhuizen <janneke@gnu.org> wrote:
>> 
>> >    * add --binary option to importer, sets (arguments (#:binary? #t))
>> 
>> This violates a core principle of Guix: reproducible builds.  I don't
>> support patches that encourage using pre-built binaries.
>
> In principle I agree. We want to be able to read the code.
>
> Still, I think Guix would benefit from a somewhat more relaxed stance
> in this.

It’s part of Guix’s mission to build from source whenever that is
possible, which is the case here, AIUI.

We know from previous discussions that some compilers can no longer be
built from source; we already have exceptions for these.

Ludo’.
Jan Nieuwenhuizen - Sept. 7, 2016, 5:51 p.m.
Ludovic Courtès writes:

>> Still, I think Guix would benefit from a somewhat more relaxed stance
>> in this.
>
> It’s part of Guix’s mission to build from source whenever that is
> possible, which is the case here, AIUI.

Yes, I thought so too...I shared my findings to get more viewpoints on
this assertion and also on the validity of the source/binary metaphor
for npm packages.

I spent the weekend attempting to build `q' using the repository+version
= source package metaphor.  Given the facts that

  * an npm package must specify a name and a version
  * an npm package may optionally list a source repository
  * an npm package may optionally list devDependencies
  * using repository + the devDependencies you can build the installable
    npm package

I also thought it would be nice to build as many packages as possible
from their repository urls.  I encountered several small problems with
the importer (or with inconsistencies/breakages in npm packages) that I
fixed.

Quoting my earlier mail

   ... that resulted in over 6004 dependencies (using build systems
   grunt, gulp and node-gyp, listing 582 errors)

Finally, I became discouraged and Sunday night I added the --binary flag
to the importer, decided to taint the resulting package description
doing

  (arguments `(#:binary? #t))

to enable breaking this enormous dependency chain.  How many more
package dependencies will result from the 582 packages that have import
problems?

Oh, please note that the `binary' or installable version of this `q'
package consists of two javascript files: q.js and query.js which are
identical to the ones in the `source' package.  The git repository
additionally has tests and documentation (and history).

Another example is the `http' package: it does not list a repository url
and the installable package is plain readable javascript.  Does the
source/binary metaphor apply here?

The `fibers' package comes with precompiled binaries for popular
platforms.  It also includes all C sources to build these binaries.  It
could be possible that some npm package has only minified javascript
(i.e.: can really be considered binary) but I haven't seen such a
package yet.

WDYT, do we have enough information to decide if building from `source'
is the right metaphor?  Is it practically feasible and does feasibility have
any weight?  What's the next step I could take to help to bring `q' and
`http' (and the other 316 packages I need) into Guix?

Greetings,
Jan
Mike Gerwitz - Sept. 8, 2016, 2:45 a.m.
On Tue, Sep 06, 2016 at 18:50:48 +0200, Pjotr Prins wrote:
> On Tue, Sep 06, 2016 at 11:48:04AM -0400, Thompson, David wrote:
>> This violates a core principle of Guix: reproducible builds.  I don't
>> support patches that encourage using pre-built binaries.
>
> In principle I agree. We want to be able to read the code.
>
> Still, I think Guix would benefit from a somewhat more relaxed stance
> in this.

If a user is able to build from source, shouldn't Guix be able to?  And
if neither can, how can we guarantee that the provided binary is even
free and actually corresponds to the given source?

From a software freedom perspective, the source code _is_ the
program.  If that is unworkable, then so is the software itself.
Pjotr Prins - Sept. 8, 2016, 7:01 a.m.
On Wed, Sep 07, 2016 at 07:51:46PM +0200, Jan Nieuwenhuizen wrote:
> Ludovic Courtès writes:
> 
> >> Still, I think Guix would benefit from a somewhat more relaxed stance
> >> in this.
> >
> > It’s part of Guix’s mission to build from source whenever that is
> > possible, which is the case here, AIUI.

Mission is fine and I agree with that (in principle).

> WDYT, do we have enough information to decide if building from `source'
> is the right metaphor?  Is it practically feasible and does feasibility have
> any weight?  What's the next step I could take to help to bring `q' and
> `http' (and the other 316 packages I need) into Guix?

I think we are clear we do not want binaries in the main project
unless there is no way to do it from source.

Personally I think we should be easier on ourselves which implies that
we get multiple flavours of Guix. 

Another reason to make 'guix channels' work.

Pj.
Jelle Licht - Sept. 8, 2016, 8:29 a.m.
Just a quick note from me;

AFAIK, the http module is a built-in of node, so you can probably save
yourselves the efforts of packaging it ;-).

Furthermore, lots of development dependencies are not strictly
necessary; e.g. a minifier/uglifier is not required for most
functionality of a package, and ditto for linters and to a certain
extent test frameworks, at least for our initial set of node packages.
This initial set of packages can then (hopefully) be used to package the
rest of npm properly, including tests etc.

The biggest issue here is that an importer can not decide for you which
devDependency is actually needed to properly build a source archive,
and which just provides convenience functions. The importer should
become more useful when we have a solid set of npm packages in guix.
Before that, the importer will probably be useful to a lesser degree for
any packages besides the most trivial.
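
As a rough illustration of that split, here is a minimal Guile sketch. It assumes a hand-maintained list of tools that only lint, test or minify; the list `%convenience-tools' and the procedure `partition-dev-dependencies' are hypothetical, and nothing like them exists in the importer.

```scheme
(use-modules (srfi srfi-1))

;; Assumption: a hand-maintained list of tools that only lint, test or
;; minify.  The names are examples, not a vetted list.
(define %convenience-tools
  '("eslint" "jshint" "mocha" "uglify-js"))

(define (partition-dev-dependencies dev-dependencies)
  "Split DEV-DEPENDENCIES, a list of package names, into two values: the
names presumed necessary to build, and the names presumed to be
convenience tools."
  (partition (lambda (name)
               (not (member name %convenience-tools)))
             dev-dependencies))

;; Usage: bind both returned values.
(call-with-values
    (lambda ()
      (partition-dev-dependencies '("grunt" "mocha" "eslint" "babel")))
  (lambda (build convenience)
    (format #t "build: ~s~%convenience: ~s~%" build convenience)))
```

Such a list still has to be curated by a person for each tool, so it only moves the decision out of the importer rather than automating it.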

Regarding feasibility and its weight, I would say that a simple
transformation such as concatenating files should not be an issue,
whereas more involved transformations such as tree shaking,
uglification, or transpilation take away much of our freedom to modify
the software, at least in practice.

- Jelle

Pjotr Prins <pjotr.public12@thebird.nl> writes:

> On Wed, Sep 07, 2016 at 07:51:46PM +0200, Jan Nieuwenhuizen wrote:
>> Ludovic Courtès writes:
>> 
>> >> Still, I think Guix would benefit from a somewhat more relaxed stance
>> >> in this.
>> >
>> > It’s part of Guix’s mission to build from source whenever that is
>> > possible, which is the case here, AIUI.
>
> Mission is fine and I agree with that (in principle).
>
>> WDYT, do we have enough information to decide if building from `source'
>> is the right metaphor?  Is it practically feasible and does feasibility have
>> any weight?  What's the next step I could take to help to bring `q' and
>> `http' (and the other 316 packages I need) into Guix?
>
> I think we are clear we do not want binaries in the main project
> unless there is no way to do it from source.
>
> Personally I think we should be easier on ourselves which implies that
> we get multiple flavours of Guix. 
>
> Another reason to make 'guix channels' work.
>
> Pj.
Jan Nieuwenhuizen - Sept. 8, 2016, 8:45 a.m.
Mike Gerwitz writes:

> If a user is able to build from source

That's a question that I like to explore.

If a user builds an npm package from its source repository, I assume
that they install the devDependencies needed for that using npm?

The transitive closure of installing all devDependencies for the `q'
package by building them all from their source repositories, means
building > 6000 packages.

> , shouldn't Guix be able to?

> And if neither can, how can we guarantee that the provided binary is
> even free and actually corresponds to the given source?

I would also like to explore if the source/binary package metaphor is
a valid one for npm.

For the packages that I considered, I used the `diff' command to assert
that the javascript and C files in the installable npm package are
identical to the ones in the repository.

Greetings,
Jan
Mike Gerwitz - Sept. 8, 2016, 5:31 p.m.
On Thu, Sep 08, 2016 at 10:45:57 +0200, Jan Nieuwenhuizen wrote:
> If a user builds an npm package from its source repository, I assume
> that they install the devDependencies needed for that using npm?

Unfortunately that depends on the project.  Some projects use
devDependencies only for things like linters, test runners, assertion
systems, etc; others might need them for building.

> The transitive closure of installing all devDependencies for the `q'
> package by building them all from their source repositories, means
> building > 6000 packages.

Many of those packages are shared between others.  Given a sufficiently
large pool of npm packages, there'll be a great deal of intersections in
the graph.

I haven't been following this closely enough to speak intelligently
about the conversion, though.

> I would also like to explore if the source/binary package metaphor is
> a valid one for npm.

Sure it is.

> For the packages that I considered, I used the `diff' command to assert
> that the javascript and C files in the installable npm package are
> identical to the ones in the repository.

In some cases, this will be true.  Possibly in a majority.  Think of npm
as publishing the results of `make dist` (literally, that's what I
do).  That could do anything, it could do pretty much nothing.  If a
Perl/Python/PHP/Ruby/Scheme/<insert interpreted language here> script is
in the distribution tarball unmodified from the source, what
considerations do we give it when packaging for, say, Debian?

But we'd have to know that on a case-by-case basis.  If we want a
general solution to this problem, we wouldn't want to add a bunch of
exceptions.

If it's literally publishing the source code repository (which many
are), then there is no distinction.  But we'd have to know that to be
true.
Jan Nieuwenhuizen - Sept. 8, 2016, 7:54 p.m.
Mike Gerwitz writes:

> On Thu, Sep 08, 2016 at 10:45:57 +0200, Jan Nieuwenhuizen wrote:
>> If a user builds an npm package from its source repository, I assume
>> that they install the devDependencies needed for that using npm?
>
> Unfortunately that depends on the project.  Some projects use
> devDependencies only for things like linters, test runners, assertion
> systems, etc; others might need them for building.

The question I'm trying to answer is: how does `a user' who builds a
package from the repository install the needed dependencies.

I very much doubt that users install the essential dependencies all by
building those from the source repository.  How would they do that?

My working hypothesis is that it's impossible to do so for any
moderately interesting npm package.  And I would very much like someone
to show me (with working code) that instead it is possible.

>> The transitive closure of installing all devDependencies for the `q'
>> package by building them all from their source repositories, means
>> building > 6000 packages.
>
> Many of those packages are shared between others.

Not so.  The total sum of interrelated dependencies to build `q' is over
41,000.  The number of imported packages for `q' using Jelle's importer
with some small fixes by me is over 6,000 unique dependencies and over
500 that can currently not be resolved by the importer and error out.
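
The difference between the two figures (every declared dependency versus unique packages) can be shown on a toy graph. The Guile sketch below is only an illustration: the graph and its package names are made up, and it assumes the larger figure counts dependency declarations, which the numbers above do not spell out.

```scheme
(use-modules (srfi srfi-1))

;; Toy dependency graph; the package names are made up for illustration.
(define %graph
  '(("app"   . ("build" "lint"))
    ("build" . ("util" "glob"))
    ("lint"  . ("util" "glob"))
    ("util"  . ())
    ("glob"  . ("util"))))

(define (dependencies name)
  "Return the direct dependencies of NAME, or the empty list if unknown."
  (or (assoc-ref %graph name) '()))

(define (closure name)
  "Return the list of unique packages reachable from NAME, excluding NAME."
  (let loop ((todo (dependencies name))
             (seen '()))
    (cond ((null? todo) (reverse seen))
          ;; Skip packages already visited; this also terminates cycles.
          ((or (member (car todo) seen) (string=? (car todo) name))
           (loop (cdr todo) seen))
          (else
           (loop (append (cdr todo) (dependencies (car todo)))
                 (cons (car todo) seen))))))

(define (declared-dependencies name)
  "Count the dependency declarations (edges) of NAME and of its closure."
  (fold + 0 (map (lambda (n) (length (dependencies n)))
                 (cons name (closure name)))))

(format #t "unique: ~a, declared: ~a~%"
        (length (closure "app"))
        (declared-dependencies "app"))
```

For this toy graph the closure of `app' has 4 unique packages while 7 dependencies are declared in total, so sharing shrinks the count without removing any package from the set that has to be built.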

Please show me that building `q' this way is possible and what the
benefits are (in terms of software freedom) of spending our energy by
upholding the source/binary metaphor (even if for a majority of packages
there may not be a difference).

Greetings,
Jan
Mike Gerwitz - Sept. 9, 2016, 12:31 a.m.
On Thu, Sep 08, 2016 at 21:54:36 +0200, Jan Nieuwenhuizen wrote:
> The question I'm trying to answer is: how does `a user' who builds a
> package from the repository install the needed dependencies.

Sorry, I misinterpreted.

`npm install <pkg>' will by default install all devDependencies; the
`--production' flag suppresses that behavior.

Many packages define a command to be run when `npm test` is invoked,
which would in turn need the devDependencies to run the test suite.

> I very much doubt that users install the essential dependencies all by
> building those from the source repository.  How would they do that?

No, they don't.  I'm not sure if it's even possible with how npm works,
though I haven't done that sort of research.

But that'd be Guix' responsibility---just because npm doesn't offer a
way to do that doesn't mean that Guix can't, provided that there is an
automated way to track down each of the packages and determine how they
are built.  Some might use Make, some Grunt, nothing, etc.

> My working hypothesis is that it's impossible to do so for any
> moderately interesting npm package.  And I would very much like someone
> to show me (with working code) that instead it is possible.

I'm hoping such code is precisely what this project produces. :)

> what the benefits are (in terms of software freedom) of spending our
> energy by upholding the source/binary metaphor (even if for a majority
> of packages there may not be a difference).

As I mentioned, I don't see a difference between this situation and
packaging other software that has no distinction between source code and
"binary" distribution.  It's just a hell of a lot more complex and the
package manager used to manage these packages (npm) doesn't care about
these issues.

Corresponding source code must include everything needed to build the
software, and must be the preferred form of modifying that
software.  This assumption cannot be made with the state of the packages
in the npm repository.  Some of the files might not even be in the
source repository (e.g. generated).

I have great faith in Guix and its mission; it would be a shame to see
that tainted by something like this.  Normally someone will look over a
package manually before adding it; but mass-adding thousands of packages
in an automated manner is even _more_ of an argument for the importance
of not trusting binary distributions.


With all that said, I have no idea how it'll be done.  Someone could
just add any old file to the published npm package and never commit it
to any source repository.  I've done that accidentally.  I don't know
how you'd handle situations like that.  I don't know how you'd handle a
situation where a build script doesn't even work on a machine other than
the author's.  I don't know how you confirm that the software produced
from a build will actually be the same as the software normally
installed when you invoke `npm install <pkg>`.

If a package doesn't build from source, contain all the necessary files,
etc, it's not practical to exercise your freedoms, and so including it
would be bad.  But if one dependency out of thousands has that problem,
then what?  If one of the dependencies happens to actually contain
non-free code, then what?

When I evaluate software offered to GNU, I have to consider each and
every dependency.  This usually isn't a difficult task---many libraries
are standard, have already been reviewed by a project like Debian, and
there's rarely more than a few dozen of them.  I then build it from
source.  If I have the packages on my system (Trisquel at present), I
know that they're free and can be built from source.  Otherwise, I must
compile the library myself, recursively as needed for all
dependencies (which is usually not an issue).  If any of those were a
problem at any point, then the whole of the project is a problem.  If
any of those dependencies are non-free, the whole of the project is
non-free unless it can be swapped out.  If one obscure library requires
several dark incantations and a few dead chickens, users can't
practically exercise their freedoms, and that would be a problem for the
package.

Now how the hell is this enforced with thousands of dependencies that
have not undergone any review, within a community that really couldn't
care less?  Even something as simple as the license: package.json has no
legal force; it's _metadata_.

I feel like this will have to be manually checked no matter how it is
done; any automated process would just be a tool to aid in a
transition and keeping a package up-to-date.  I don't really see any
other way.

So I think that I share in your concern with how such a thing would
possibly be done.  My point is that if it can't, it shouldn't be at all
(which is where we differ).
Ludovic Courtès - Sept. 9, 2016, 8:45 a.m.
Hi Mike,

Mike Gerwitz <mtg@gnu.org> skribis:

> Now how the hell is this enforced with thousands of dependencies that
> have not undergone any review, within a community that really couldn't
> care less?  Even something as simple as the license: package.json has no
> legal force; it's _metadata_.
>
> I feel like this will have to be manually checked no matter how it is
> done; any automated process would just be a tool to aid in a
> transition and keeping a package up-to-date.  I don't really see any
> other way.
>
> So I think that I share in your concern with how such a thing would
> possibly be done.  My point is that if it can't, it shouldn't be at all
> (which is where we differ).

Yes, that’s a serious concern.  Maybe all we can reasonably hope to
achieve is to provide a core subset of the free NPM packages in Guix
proper, built from source.

People may still end up using automatically-generated, unchecked
packages for the rest.  Nevertheless, that would be an improvement over
the status quo.

(PyPI, Hackage, CPAN, and CRAN seem to be less problematic in this
regard, maybe because they are “culturally closer” to the free software
movement.)

Ludo’.
Pjotr Prins - Sept. 9, 2016, 9:26 a.m.
On Fri, Sep 09, 2016 at 10:45:43AM +0200, Ludovic Courtès wrote:
 
> Yes, that’s a serious concern.  Maybe all we can reasonably hope to
> achieve is to provide a core subset of the free NPM packages in Guix
> proper, built from source.
> 
> People may still end up using automatically-generated, unchecked
> packages for the rest.  Nevertheless, that would be an improvement over
> the status quo.
> 
> (PyPI, Hackage, CPAN, and CRAN seem to be less problematic in this
> regard, maybe because they are “culturally closer” to the free software
> movement.)

Not quite true, though there are generally fewer dependencies to deal
with. I still install packages using those language systems -
especially with Ruby, R, D and Elixir. It does not matter. Once I want
robustness I make sure to package in Guix. npm is just the worst of
the lot because of the sheer size, stupidity and circular
dependencies.

We should really think a bit harder about the transitional phase.
Also, software development generally moves faster than we can
package it.

My take is that GNU Guix proper should be lean, mean and robust. That
way we can maintain and rely on stuff. 

For the more experimental packages and other 'solutions' we ought to
depend on channels - or distributed package sources. These need not
take the purist view.

Pj.

Patch

From c60e72504a8ba4bb6a90c07bef7844d461a12467 Mon Sep 17 00:00:00 2001
From: Jan Nieuwenhuizen <janneke@gnu.org>
Date: Fri, 2 Sep 2016 16:16:35 +0200
Subject: [PATCH] npm importer: support --binary and fixes for e.g.: cjson,
 http, xmldom.

* gnu/packages/npm.scm: New file.
* gnu/local.mk (GNU_SYSTEM_MODULES): Add it.
* guix/scripts/import/npm.scm: Add --binary option.
* guix/import/npm.scm (gh-fuzzy-tag-match): Add two fallbacks: missing /TAGS
and VERSION mismatch.
(strip-.git-if-needed project): New function.
(github-user-slash-repository, github-repository): Use it.
(source-uri): Fallback to use `binary' (dist . tarball).  Add optional binary?
parameter to prefer binary fallback.
(spdx-string->license): Add LGPL, fix LGPL-3.0.
(make-npm-sexp): Add optional binary? parameter to set #:binary? argument.
(npm->guix-package): Add optional binary? parameter to set #:binary? argument
to ignore devDependencies.
(recursive-import): Add optional binary? parameter.
* guix/build-system/node.scm (node-build): Add binary? and make-flags keys.
* guix/build/node-build-system.scm (build): Also check for `Gulpfile.js', fallback
to generic `npm build'.  Skip build if #:binary?.
(binary-install): Rename from install.
(npm-install): New function.
(install): Have #:binary? switch between binary-install, and npm-install.
(package-origin): Handle registry.npmjs.org url.
(npm->guix-package)[npm-binary?]: Discard devDependencies.
---
 gnu/local.mk                     |   1 +
 gnu/packages/npm.scm             |  34 +++++++++
 guix/build-system/node.scm       |   4 +
 guix/build/node-build-system.scm |  30 ++++++--
 guix/import/npm.scm              | 161 +++++++++++++++++++++++++++------------
 guix/scripts/import/npm.scm      |  13 +++-
 6 files changed, 186 insertions(+), 57 deletions(-)
 create mode 100644 gnu/packages/npm.scm

diff --git a/gnu/local.mk b/gnu/local.mk
index b9d2a11..4fa94c7 100644
--- a/gnu/local.mk
+++ b/gnu/local.mk
@@ -255,6 +255,7 @@  GNU_SYSTEM_MODULES =				\
   %D%/packages/nettle.scm			\
   %D%/packages/networking.scm			\
   %D%/packages/ninja.scm			\
+  %D%/packages/npm.scm				\
   %D%/packages/node.scm				\
   %D%/packages/noweb.scm			\
   %D%/packages/ntp.scm				\
diff --git a/gnu/packages/npm.scm b/gnu/packages/npm.scm
new file mode 100644
index 0000000..43b7774
--- /dev/null
+++ b/gnu/packages/npm.scm
@@ -0,0 +1,34 @@ 
+;;; GNU Guix --- Functional package management for GNU
+;;; Copyright © 2016 Jan Nieuwenhuizen <janneke@gnu.org>
+;;;
+;;; This file is part of GNU Guix.
+;;;
+;;; GNU Guix is free software; you can redistribute it and/or modify it
+;;; under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation; either version 3 of the License, or (at
+;;; your option) any later version.
+;;;
+;;; GNU Guix is distributed in the hope that it will be useful, but
+;;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (gnu packages npm)
+  #:use-module (guix licenses)
+  #:use-module (guix packages)
+  #:use-module (guix download)
+  #:use-module (guix build-system node)
+  #:use-module (gnu packages base)
+  #:use-module (gnu packages commencement)
+  #:use-module (gnu packages gcc)
+  #:use-module (gnu packages perl)
+  #:use-module (gnu packages python))
+
+(define npm-license-unknown public-domain)
+
+#!
+for i in array-equal async-q q cjson http fs-extra xmldom; do ./pre-inst-env guix import npm --recursive --binary $i >> gnu/packages/npm.scm; make; done
+!#
diff --git a/guix/build-system/node.scm b/guix/build-system/node.scm
index a7b71e6..99e0ef0 100644
--- a/guix/build-system/node.scm
+++ b/guix/build-system/node.scm
@@ -75,10 +75,12 @@  registry."
 
 (define* (node-build store name inputs
                      #:key
+                     (binary? #f)
                      (npm-flags ''())
                      (global? #f)
                      (test-target "test")
                      (tests? #f)
+                     (make-flags ''())
                      (phases '(@ (guix build node-build-system)
                                  %standard-phases))
                      (outputs '("out"))
@@ -103,6 +105,8 @@  registry."
                                 source))
                    #:system ,system
                    #:npm-flags ,npm-flags
+                   #:make-flags ,make-flags
+                   #:binary? ,binary?
                    #:global? ,global?
                    #:test-target ,test-target
                    #:tests? ,tests?
diff --git a/guix/build/node-build-system.scm b/guix/build/node-build-system.scm
index 35767d6..1077201 100644
--- a/guix/build/node-build-system.scm
+++ b/guix/build/node-build-system.scm
@@ -50,17 +50,23 @@ 
             (find-files "." "(min\\.js|min\\.js\\.map|min\\.map)$"))
   #t)
 
-(define* (build #:key outputs inputs #:allow-other-keys)
+(define* (build #:key outputs binary? (make-flags '()) (npm-flags '())
+                #:allow-other-keys)
   "Build a new node module using the appropriate build system."
   ;; XXX: Develop a more robust heuristic, allow override
-  (cond ((file-exists? "gulpfile.js")
+  (cond (binary? #t)
+        ((or (file-exists? "gulpfile.js")
+             (file-exists? "Gulpfile.js"))
          (zero? (system* "gulp")))
         ((file-exists? "gruntfile.js")
          (zero? (system* "grunt")))
+        ((file-exists? "binding.gyp")
+         (and (zero? (system* "node-gyp.js" "configure"))
+              (zero? (system* "node-gyp.js" "build"))))
         ((file-exists? "Makefile")
-         (zero? (system* "make")))
+         (zero? (apply system* "make" `(,@make-flags))))
         (else
-         #t)))
+         (zero? (apply system* "npm" "build" `(,@npm-flags))))))
 
 (define* (check #:key tests? #:allow-other-keys)
   "Run 'npm test' if TESTS?"
@@ -69,7 +75,7 @@ 
       (zero? (system* "npm" "test"))
       #t))
 
-(define* (install #:key outputs inputs global? #:allow-other-keys)
+(define* (binary-install #:key outputs inputs global? #:allow-other-keys)
   "Install the node module to the output store item. MODULENAME defines how
 under which name the module will be installed, GLOBAL? determines whether this
 is an npm global install."
@@ -86,6 +92,20 @@  is an npm global install."
       (symlink (string-append tgt-dir "/node_modules/" modulename "/bin") bin-dir))
     #t))
 
+(define* (npm-install #:key outputs inputs (npm-flags '()) #:allow-other-keys)
+  "Install the node module into the \"out\" output by running 'npm install -g'
+with that output as the prefix.  NPM-FLAGS is a list of extra arguments passed
+to npm."
+  (let* ((out (assoc-ref outputs "out"))
+         (home (string-append "/tmp/home")))
+    (setenv "HOME" home)
+    (zero? (apply system* "npm" "install" "-g" "--prefix" out `(,@npm-flags)))))
+
+(define* (install #:key outputs inputs binary? global? (npm-flags '())
+                  #:allow-other-keys)
+  (if binary?
+      (binary-install #:outputs outputs #:inputs inputs #:global? global?)
+      (npm-install #:outputs outputs #:global? global? #:npm-flags npm-flags)))
 
 (define %standard-phases
   (modify-phases gnu:%standard-phases
diff --git a/guix/import/npm.scm b/guix/import/npm.scm
index b6c9120..5d6bd9e 100644
--- a/guix/import/npm.scm
+++ b/guix/import/npm.scm
@@ -1,6 +1,7 @@ 
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2015 David Thompson <davet@gnu.org>
 ;;; Copyright © 2016 Jelle Licht <jlicht@fsfe.org>
+;;; Copyright © 2016 Jan Nieuwenhuizen <janneke@gnu.org>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -47,6 +48,7 @@ 
   #:use-module (guix packages)
   #:use-module (gnu packages)
   #:use-module (guix build-system node)
+  #:use-module (guix build node-build-system)
   #:export (npm->guix-package
             recursive-import))
 
@@ -187,10 +189,10 @@  GITHUB-REPO"
                       "https://api.github.com/repos/"
                       (github-user-slash-repository github-repo)
                       "/tags"))
-         (json (json-fetch*
-                (if token
-                    (string-append api-url "?access_token=" token)
-                    api-url))))
+         (api-url (if token
+                      (string-append api-url "?access_token=" token)
+                      api-url))
+         (json (json-fetch* api-url)))
     (if (eq? json #f)
         (if token
             (error "Error downloading release information through the GitHub
@@ -208,28 +210,50 @@  api-url))
                     (member name fuzzy-tags)))
                 json)))
           (match proper-release
-            (()                       ;empty release list
-             #f)
+            (()                       ;fuzzy version mismatch
+             (if (pair? json)
+                 (begin
+                   ;;XXX: Just pick first release
+                   ;; e.g.: xmldom 0.1.16 vs 0.1.22
+                   (hash-ref (car json) "name"))
+                 ;;XXX: No tags: Just pick latest commit from master
+                 ;; e.g.: cjson
+                 ;; TODO: instead of master, snarf default_branch from the repo endpoint
+                 (let* ((branches-url (string-replace-substring api-url "/tags" "/branches"))
+                        (branches (json-fetch* branches-url))
+                        (first-or-master
+                         (or
+                          (find (lambda (x) (equal? (hash-ref x "name") "master"))
+                                branches)
+                          (car branches)))
+                        (commit (hash-ref first-or-master "commit"))
+                        (sha (hash-ref commit "sha")))
+                   sha)))
             ((release . rest)         ;one or more releases
              ;;XXX: Just pick the first release
              (let ((tag (hash-ref release "name")))
                tag)))))))
 
+(define (strip-.git-if-needed project)
+  ;; Some repository URLs, e.g. babel's, do not end in `.git'.
+  (if (string-suffix? ".git" project)
+      (string-drop-right project 4)
+      project))
 
 (define (github-user-slash-repository github-url)
   "Return a string e.g. arq5x/bedtools2 of the owner and the name of the
 repository separated by a forward slash, from a string URL of the form
 'https://github.com/arq5x/bedtools2.git'"
   (match (string-split (uri-path (string->uri github-url)) #\/)
     ((_ owner project . rest)
-     (string-append owner "/" (string-drop-right project 4)))))
+     (string-append owner "/" (strip-.git-if-needed project)))))
 
 (define (github-repository github-url)
   "Return a string e.g. bedtools2 of the name of the repository, from a string
 URL of the form 'https://github.com/arq5x/bedtools2.git'"
   (match (string-split (uri-path (string->uri github-url)) #\/)
     ((_ owner project . rest)
-     (string-drop-right project 4))))
+     (strip-.git-if-needed project))))
 
 (define (github-release-url github-url version)
   "Return the url for the tagged release VERSION on the github repo found at
@@ -263,10 +288,19 @@  GITHUB-URL."
   "Return true if PACKAGE is a node package."
   (string-prefix? "node-" (package-name package)))
 
-(define (source-uri npm-meta version)
+(define* (source-uri npm-meta version #:optional binary?)
   "Return the repository url for version VERSION of NPM-META"
-  (let* ((v    (assoc-ref* npm-meta "versions" version)))
-    (normalise-url (assoc-ref* v "repository" "url"))))
+  (let* ((v    (assoc-ref* npm-meta "versions" version))
+         (repo (assoc-ref v "repository"))
+         (dist (assoc-ref v "dist")))
+    (or
+     (and binary? dist
+          (assoc-ref dist "tarball"))
+     (and repo
+          (and=> (assoc-ref repo "url") normalise-url))
+     ;; fallback for `binary'-only packages, e.g.: http
+     (and dist
+          (assoc-ref dist "tarball")))))
 
 (define (guix-hash-url path)
   "Return the hash of PATH in nix-base32 format. PATH can be either a file or
@@ -319,11 +353,12 @@  package."
     ("IJG" 'ijg)
     ("Imlib2" 'imlib2)
     ("IPA" 'ipa)
+    ("LGPL" 'lgpl2.0)
     ("LGPL-2.0" 'lgpl2.0)
     ("LGPL-2.0+" 'lgpl2.0+)
     ("LGPL-2.1" 'lgpl2.1)
     ("LGPL-2.1+" 'lgpl2.1+)
-    ("LGPL-3.0" 'lgpl3.0)
+    ("LGPL-3.0" 'lgpl3)
     ("MPL-1.0" 'mpl1.0)
     ("MPL-1.1" 'mpl1.1)
     ("MPL-2.0" 'mpl2.0)
@@ -359,35 +394,50 @@  command."
 located at REPO-URL. Tries to locate a released tarball before falling back to
 a git checkout."
   (let ((uri (string->uri repo-url)))
-    (if (equal? (uri-host uri) "github.com")
-        (call-with-temporary-output-file
-         (lambda (temp port)
-           (let* ((gh-version (gh-fuzzy-tag-match repo-url version))
-                  (tb (github-release-url repo-url gh-version))
-                  (result (url-fetch tb temp))
-                  (hash (bytevector->nix-base32-string (port-sha256 port))))
-             (close-port port)
-             `(origin
-                (method url-fetch)
-                (uri ,tb)
-                (sha256
-                 (base32
-                  ,hash))))))
-        (call-with-temporary-directory
-         (lambda (temp-dir)
-           (let ((fuzzy-version (generic-fuzzy-tag-match repo-url version)))
-             (and (node-git-fetch repo-url fuzzy-version temp-dir)
-                  `(origin
-                     (method git-fetch)
-                     (uri (git-reference
-                           (url ,repo-url)
-                           (commit ,fuzzy-version)))
-                     (sha256
-                      (base32
-                       ,(guix-hash-url temp-dir)))))))))))
+    (cond
+     ((equal? (uri-host uri) "registry.npmjs.org")
+      (call-with-temporary-output-file
+       (lambda (temp port)
+         (let* ((result (url-fetch repo-url temp))
+                (hash (bytevector->nix-base32-string (port-sha256 port))))
+           (close-port port)
+           `(origin
+              (method url-fetch)
+              (uri ,repo-url)
+              (sha256
+               (base32
+                ,hash)))))))
+     ((equal? (uri-host uri) "github.com")
+      (call-with-temporary-output-file
+       (lambda (temp port)
+         (let* ((gh-version (gh-fuzzy-tag-match repo-url version))
+                (tb (github-release-url repo-url gh-version))
+                (result (url-fetch tb temp))
+                (hash (bytevector->nix-base32-string (port-sha256 port))))
+           (close-port port)
+           `(origin
+              (method url-fetch)
+              (uri ,tb)
+              (sha256
+               (base32
+                ,hash)))))))
+     (else
+      (call-with-temporary-directory
+       (lambda (temp-dir)
+         (let ((fuzzy-version (generic-fuzzy-tag-match repo-url version)))
+           (and (node-git-fetch repo-url fuzzy-version temp-dir)
+                `(origin
+                   (method git-fetch)
+                   (uri (git-reference
+                         (url ,repo-url)
+                         (commit ,fuzzy-version)))
+                   (sha256
+                    (base32
+                     ,(guix-hash-url temp-dir))))))))))))
 
 (define (make-npm-sexp name version home-page description
-                       dependencies dev-dependencies license source-url)
+                       dependencies dev-dependencies license source-url
+                       binary?)
   "Return the `package' s-expression for a Node package with the given NAME,
 VERSION, HOME-PAGE, DESCRIPTION, DEPENDENCIES, DEV-DEPENDENCIES, LICENSES and
 SOURCE-URL."
@@ -415,6 +465,9 @@  SOURCE-URL."
                            (,'unquote
                             ,(string->symbol name))))
                        dev-dependencies)))))
+       ,@(if (not binary?)
+             '()
+             '((arguments `(#:binary? #t))))
        (synopsis ,description) ; no synopsis field in package.json files
        (description ,description)
        (home-page ,home-page)
@@ -444,23 +497,32 @@  npm list of dependencies DEPENDENCIES."
       (spdx-string->license (assoc-ref license-entry "type")))
      ((string? license-legacy)
       (spdx-string->license license-legacy))
+     ((and (pair? license-legacy) (string? (car license-legacy)))
+      (if (= (length license-legacy) 1)
+          (spdx-string->license (car license-legacy))
+          `(list ,@(map spdx-string->license license-legacy))))
      ((and license-legacy (positive? (length license-legacy)))
       `(list ,@(map
                 (lambda (l) (spdx-string->license (assoc-ref l "type")))
                 license-legacy)))
      (else
+      (format (current-error-port) "extract-license: no license found: ~a\n" package-json)
       #f))))
 
-(define (npm->guix-package package-name)
+(define* (npm->guix-package package-name #:optional binary?)
   "Fetch the metadata for PACKAGE-NAME from registry.npmjs.com and return the
- `package' s-expression corresponding to that package, or  on failure."
+`package' s-expression corresponding to that package, or raise an error if the
+metadata cannot be downloaded.  If BINARY?, use the `binary' dist tarball as
+source url and ignore any devDependencies.
   (let ((package (npm-fetch package-name)))
     (if package
         (let* ((name (assoc-ref package "name"))
                (version (latest-source-release package))
                (curr (assoc-ref* package "versions" version))
                (raw-dependencies (assoc-ref curr "dependencies"))
-               (raw-dev-dependencies (assoc-ref curr "devDependencies"))
+               (raw-dev-dependencies (if binary?
+                                         #f
+                                         (assoc-ref curr "devDependencies")))
                (dependencies (extract-guix-dependencies raw-dependencies))
                (dev-dependencies (extract-guix-dependencies
                                   raw-dev-dependencies))
@@ -469,19 +531,20 @@  npm list of dependencies DEPENDENCIES."
                  (extract-npm-dependencies raw-dependencies)
                  (extract-npm-dependencies raw-dev-dependencies)))
                (description (assoc-ref package "description"))
-               (home-page (assoc-ref package "homepage"))
-               (license (extract-license curr))
-               (source-url (source-uri package version)))
+               (home-page (or (assoc-ref package "homepage") "http://npmjs.com"))
+               (license (or (extract-license curr) 'npm-license-unknown))
+               (source-url (source-uri package version binary?)))
           (values 
            (make-npm-sexp name version home-page description
-                          dependencies dev-dependencies license source-url)
+                          dependencies dev-dependencies license source-url
+                          binary?)
            npm-dependencies))
         (error "Could not download metadata:" package-name))))
 
-(define* (recursive-import package-name)
+(define* (recursive-import package-name #:optional binary?)
   "Recursively fetch the metadata for PACKAGE-NAME and its dependencies from
 registry.npmjs.com and return a list of 'package-name, package s-expression'
-tuples."
+tuples.  If BINARY?, use the `binary' tarball from the dist field."
   (define (seen? item seen)
     (or (vhash-assoc item seen)
         (not (null? (find-packages-by-name (node-package-name item))))))
@@ -501,7 +564,7 @@  tuples."
              (receive (package dependencies)
                  (catch #t
                    (lambda ()
-                     (npm->guix-package package-name))
+                     (npm->guix-package package-name binary?))
                    (lambda (key . parameters)
                      (format (current-error-port)
                              "Uncaught throw to '~a: ~a\n" key parameters)
diff --git a/guix/scripts/import/npm.scm b/guix/scripts/import/npm.scm
index 79abcf0..8e39381 100644
--- a/guix/scripts/import/npm.scm
+++ b/guix/scripts/import/npm.scm
@@ -1,5 +1,6 @@ 
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2015 David Thompson <davet@gnu.org>
+;;; Copyright © 2016 Jan Nieuwenhuizen <janneke@gnu.org>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -40,6 +41,8 @@ 
   (display (_ "Usage: guix import npm PACKAGE-NAME
    Import and convert the npm package for PACKAGE-NAME.\n"))
   (display (_ "
+     -b, --binary           use binary dist tarball for source url"))
+  (display (_ "
      -h, --help             display this help and exit"))
   (display (_ "
      -V, --version          display version information and exit"))
@@ -48,7 +51,10 @@ 
 
 (define %options
   ;; Specification of the command-line options.
-  (cons* (option '(#\h "help") #f #f
+  (cons* (option '(#\b "binary") #f #f
+                 (lambda (opt name arg result)
+                   (alist-cons 'binary? #t result)))
+         (option '(#\h "help") #f #f
                  (lambda args
                    (show-help)
                    (exit 0)))
@@ -73,6 +79,7 @@ 
                   (alist-cons 'argument arg result))
                 %default-options))
   (let* ((opts (parse-options))
+         (binary? (assoc-ref opts 'binary?))
          (args (filter-map (match-lambda
                              (('argument . value)
                               value)
@@ -88,9 +95,9 @@ 
                    `(define-public ,(string->symbol name)
                       ,pkg))
                   (_ #f))
-                (recursive-import package-name))
+                (recursive-import package-name binary?))
            ;; Single import
-           (let ((sexp (npm->guix-package package-name)))
+           (let ((sexp (npm->guix-package package-name binary?)))
              (unless sexp
                (leave (_ "failed to download meta-data for package '~a'~%")
                       package-name))
-- 
2.9.3