Wednesday, November 28, 2012

On Nix and GNU Guix


Quite recently, the GNU project announced Guix, a new package manager for the GNU system. Guix is described on their website as:

GNU Guix is a purely functional package manager, and associated free software distribution, for the GNU system. In addition to standard package management features, Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, and garbage collection.

The announcement has attracted quite a lot of attention and has been covered by news sites such as Linux Weekly News, Phoronix and Reddit. As my frequent readers will probably notice, this description looks very much like that of the Nix package manager.

GNU Guix


In fact, Guix is not an entirely new package manager -- it uses several crucial components of the Nix package manager and gains the described unique deployment properties, such as transactional upgrades, from them.

What Guix basically provides is a new language front-end. The Nix package manager has its own domain-specific language (DSL), called the Nix expression language, to specify how packages can be built from source code and its required dependencies. I have shown many examples of Nix expressions in earlier blog posts. The Nix expression language is an external DSL, meaning that it has a custom syntax and parser to process the language.

Guix provides a different front-end using GNU Guile -- a Scheme programming language interpreter, which is blessed as the official extension language of the GNU project and embedded in a number of free software programs, such as TeXmacs and Lilypond. Guix provides an internal DSL (or embedded DSL), meaning that it uses a general purpose host language (in this case Scheme) and its features to implement a DSL.

Furthermore, the Guix repository contains a small set of package specifications (comparable to Nixpkgs) that can be used to deploy a small subset of a system. The developers have the intention to allow a full GNU system to be deployed from these specifications at some point in the future.

A comparison of package specifications


So how do the ways in which packages are specified in Nix and Guix differ from each other? Using the Nix package manager, a package such as GNU cpio is specified as follows:
{stdenv, fetchurl}:

stdenv.mkDerivation {
  name = "cpio-2.11";

  src = fetchurl {
    url = mirror://gnu/cpio/cpio-2.11.tar.bz2;
    sha256 = "1gavgpzqwgkpagjxw72xgxz52y1ifgz0ckqh8g7cckz7jvyhp0mv";
  };

  patches = [ ./cpio-gets-undeclared.patch ];

  meta = {
    homepage = http://www.gnu.org/software/cpio/;
    longDescription = ''
      GNU cpio copies ...
    '';
    license = "GPLv3+";
  };
}

The above code fragment defines a function in the Nix expression language taking 2 arguments: stdenv is a component providing a collection of standard UNIX utilities and build tools, such as: cat, ls, gcc and make. fetchurl is used to download a file from an external source.

In the remainder of the expression, we invoke the stdenv.mkDerivation function, which is the Nix way of describing a package build operation. As function arguments, we provide a package name, the source code (src, which is bound to a function invocation that fetches the tarball from a GNU mirror), a patch that fixes a certain issue and some meta information. Build instructions can be provided as well. If they are omitted (which is the case in our example), the standard GNU Autotools build procedure is executed, i.e.: ./configure; make; make install.
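
For illustration, the following sketch shows what the same expression might look like if the build steps were written out explicitly as embedded shell code. The configurePhase, buildPhase and installPhase attributes are understood by stdenv; the way they are filled in here is an assumption for the sake of the example and not how cpio is actually packaged in Nixpkgs:

```nix
{stdenv, fetchurl}:

stdenv.mkDerivation {
  name = "cpio-2.11";

  src = fetchurl {
    url = "mirror://gnu/cpio/cpio-2.11.tar.bz2";
    sha256 = "1gavgpzqwgkpagjxw72xgxz52y1ifgz0ckqh8g7cckz7jvyhp0mv";
  };

  patches = [ ./cpio-gets-undeclared.patch ];

  # Illustrative only: each phase is a string containing shell code.
  # $out refers to the package's output path in the Nix store.
  configurePhase = ''
    ./configure --prefix=$out
  '';

  buildPhase = ''
    make
  '';

  installPhase = ''
    make install
  '';
}
```

Since these phases roughly mirror what stdenv already does by default, spelling them out only makes sense for packages that deviate from the standard Autotools procedure.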

The expression shown earlier merely defines a function specifying how to build a package, but does not provide the exact versions or variants of the dependencies that we should use to build it. Therefore, we must also compose the package by calling the function with the required function arguments. In Nix, we do this in a composition expression containing an attribute set in which each attribute name refers to a package and each value is a function invocation that imports the package expression and provides its dependencies, which are defined in the same attribute set:
rec {
  stdenv = ...

  fetchurl = import ../pkgs/build-support/fetchurl {
    ...
  };

  cpio = import ../pkgs/tools/archivers/cpio {
    inherit stdenv fetchurl;
  };

  ...
}
In the above expression, the cpio package expression (shown earlier) is imported and called with its required function arguments that provide a particular stdenv and fetchurl instance. By running the following command-line instruction and by providing the above expression as a parameter, cpio can be built. Its result is stored in isolation in the Nix store:
$ nix-build all-packages.nix -A cpio
/nix/store/pl12qa4q1z...-cpio-2.11

In Guix, the GNU cpio package is specified as follows:

(define-module (distro packages cpio)
  #:use-module (distro)
  #:use-module (guix packages)
  #:use-module (guix download)
  #:use-module (guix build-system gnu))

(define-public cpio
  (package
    (name "cpio")
    (version "2.11")
    (source
     (origin
      (method url-fetch)
      (uri (string-append "mirror://gnu/cpio/cpio-"
                          version ".tar.bz2"))
      (sha256
       (base32
        "1gavgpzqwgkpagjxw72xgxz52y1ifgz0ckqh8g7cckz7jvyhp0mv"))))
    (build-system gnu-build-system)
    (arguments
     `(#:patches (list (assoc-ref %build-inputs
                                  "patch/gets"))))
    (inputs
     `(("patch/gets" ,(search-patch "cpio-gets-undeclared.patch"))))
    (home-page "https://www.gnu.org/software/cpio/")
    (synopsis
     "A program to create or extract from cpio archives")
    (description
     "GNU Cpio copies ...")
    (license "GPLv3+")))

As can be seen, the above code fragment defines a package in the Scheme programming language. It defines a module (representing a single package) that depends on a collection of modules providing its build-time dependencies, such as all the other packages that are defined in the Guix repository, a module responsible for downloading files from external locations and a module providing build instructions.

In the remainder of the code fragment, a public variable is defined that is bound to a package object capturing the properties of the package (in this case: cpio). As can be observed, the information captured in this definition is quite similar to the Nix expression, such as the package name, the external location from which the source code should be obtained, and the patch that fixes an issue. The build-system field says that the standard GNU Autotools build procedure (./configure; make; make install) should be executed.

To my knowledge, the composition of the package is also done in the same specification (because it refers to modules defining package compositions, instead of being a function whose arguments should be set elsewhere), as opposed to Nix, in which we typically split the build function and its composition.

GNU cpio can be built using Guix by running:
$ guix-build cpio
/nix/store/pl12qa4q1z...-cpio-2.11

The above command connects to the nix-worker process (a daemon that is part of Nix and capable of arranging multi-user builds), generates a Nix derivation (a low-level build specification that the worker uses to perform builds) and finally builds the derivation. The result is a Nix component containing cpio, which is (like with the ordinary Nix package manager) stored in isolation in the Nix store and achieves the same purely functional deployment properties.

Possible advantages of Guix over Nix


So you may wonder why Guix has been developed and what (potential) benefits it gives over Nix. The presentation given by Ludovic Courtès at the GNU Hackers Meeting lists the following advantages:

  • because it rocks!
  • because it's GNU!
  • it has a compiler, Unicode, gettext, libraries, etc.
  • it supports embedded DSLs via macros
  • can be used both for composition and build scripts

To be quite honest, I see some potentially interesting advantages in these, but they are not entirely clear to me. The first two points are subjective and should not be taken too seriously, I guess. I assume that it rocks because it's cool to show that something can be done, and that it's GNU (probably) because it uses GNU Guile, which has been used as an extension language for a number of GNU packages, or because Guix has been blessed as an official GNU project.

The third point lists some potential advantages that are related to a number of potentially interesting features of the host language (GNU Guile) that can be (re)used, and which in an external DSL have to be developed from scratch, taking significantly more effort. This observation corresponds to one of the advantages that, according to others, internal DSLs have over external DSLs: less time has to be invested in developing a language, and host language features can be reused.

It is also a bit unclear to me why the fourth point (supporting embedded DSLs) is an advantage. Yes, I've seen macros that implement stuff, such as the standard GNU Autotools build procedure, but I'm not sure in what respect this is an advantage over the Nix expression language.

The fifth point refers to the fact that Scheme can be used for both writing package specifications and their build instructions, whereas in Nix, we typically use embedded strings containing shell code performing build steps. I'm not entirely sure what Guix does differently (apart from using macros) and in what respect it offers benefits compared to Nix. Are strings statically checked? Automatically escaped? I haven't seen the details or I may have missed something.

My thoughts


First of all, I'd like to point out that I don't disapprove of Guix. Nix is free software and freedom 1 says:

The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1). Access to the source code is a precondition for this.
So for me, it's perfectly fine that the GNU project scratches its own itches. I also think Guix is interesting for the following reasons:

  • It's interesting to compare an internal vs external DSL approach in deployment. Although Ludovic has listed several potential benefits, I still don't see that these are proven yet by Guix. Some questions that came into my mind are:
    • Does compiling Guix specifications give any benefits, let's say in performance or in a different aspect?
    • Why do we need modules, such as gettext, for package specifications?
    • Can Guix expressions be better debugged than Nix expressions? (I know that the Nix expression language is a lazy purely functional language and that errors are very hard to debug).
    • I know that by using an internal DSL the language is extensible, but for what purposes is this useful and what can you do with it that Nix currently cannot?

      On the other hand, I think the Nix expression language is also extensible in the sense that you can call any process from a derivation function (implementing an operation) and encapsulate a derivation into a function with a nice interface. For example, I have used this for Disnix to integrate deployment planning algorithms. Maybe there are different kinds of extensions or more efficient integration patterns possible with Guix? So far, I haven't seen anything concrete yet.
  • I also find the integration aspect with Nix interesting. I have seen many language/environment specific package managers, e.g. for Perl, Python and Eclipse, and they all solve deployment issues in their own way. Furthermore, they do not offer all the features we care about, such as transactional upgrades and reproducible builds. By making it possible to integrate language specific package managers with Nix, we can remove this burden/annoyance.
  • If GNU packages can be easily packaged in a functional way, it will also make the lives of Nix packagers easier, as we don't have to implement any hacks/workarounds anymore and we can (semi-)automatically convert between Nix and Guix expressions.
  • It's also good to have a critical look at the foundations of Nix/Nixpkgs, such as the bootstrap. In Guix, this entire process is reimplemented, which may yield useful techniques/lessons that we can apply in Nix as well.
  • Having Nix and the purely functional deployment model in the media is always good, as we want to break conventional thoughts/ideas.
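
To make the remark about the extensibility of the Nix expression language a bit more concrete, the following is a minimal sketch of a function that encapsulates a derivation invoking an arbitrary external process. The function and its parameters (tool, command and input) are hypothetical and only serve as an illustration; this is not how Disnix implements it:

```nix
{stdenv}:

# Hypothetical interface: the caller only says which tool to run on which input
{name, tool, command, input}:

stdenv.mkDerivation {
  inherit name;

  # buildCommand replaces the standard build phases with a single shell snippet.
  # The tool is itself a Nix package, so it becomes a build-time dependency.
  buildCommand = ''
    ${tool}/bin/${command} ${input} > $out
  '';
}
```

A caller can then use this function like any other package function, without having to know how the underlying process is invoked.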

Apart from these potential benefits and the fact that Guix is far from finished, I currently see no reason to recommend Guix over Nix, nor any reason to use it myself. To me, it looks like a nice experiment, but I still have to be convinced that it adds value, apart from integration with language-specific package managers. The advantages of using an internal DSL approach still have to be proven.

On the other hand, I also think that external DSLs (which the Nix expression language is) have benefits over internal DSLs:

  • A more concise syntax, which is shorter and easier to comprehend. However, I have to admit that I'm a bit biased on this, because I only have a little hands-on experience with the Scheme programming language and quite some experience with the Nix expression language.
  • Better static consistency checking. External DSLs can produce more understandable error messages, whereas "abuse" of a host language may create all kinds of complex structures that make error reports much more complicated and difficult to grasp.

    Furthermore, in an embedded DSL, you can also use the host language to do some unintended operations. For example, we could use Scheme to imperatively modify a variable that is used by another package, affecting reproducibility (although the build of a derivation itself is pure, thanks to Nix). A packager has to manually take care that no side-effects are specified, while the Nix expression language prevents these side-effects from being programmed.

    Again, I have not done any comparisons between Nix and Guix, but I have worked several years in a research group (which besides me, includes Eelco Dolstra, the author of Nix) in which external DSLs are investigated, such as WebDSL, a domain-specific language for web applications.

    My former colleague Zef Hemel (along with a few others) wrote a series of blog posts covering issues with internal DSLs. One of them, 'When rails fails', which covers some problems with internal DSLs in the Ruby programming language, has attracted quite some (controversial) attention. They also wrote a paper titled 'Static consistency checking of web applications with WebDSL' reporting on this matter.
  • External DSLs often have smaller dependencies. If an internal DSL (embedded in a host language) is used in a tool, it often needs the entire host language runtime, which for some programming languages is quite big, while we only need a small subset of it. One of the questions we have encountered frequently in the Nix community is why we didn't use Haskell (a lazy purely functional language) to specify package configurations in Nix. One of the reasons is that the Haskell runtime is quite big and has many (optional) dependencies.

I have also observed a few other things in Guix:

  • The Nix expression language is a lazy purely functional language, whereas Guix uses eager evaluation, although the derivation files that are produced and built are still processed lazily by the Nix worker. In Nix, laziness offers various benefits, such as that only the desired packages and their required dependencies are built, while everything we don't need is not, improving efficiency. In Guix, other means have to be used to achieve this goal and I'm not sure if Guix has a better approach.
  • Guix's license is GPLv3, including the package descriptions, whereas Nix's license is LGPLv2.1 and Nixpkgs is MIT licensed. Guix has a stronger copyleft than Nix. I don't want to have a debate about these licenses and the copyleft here, but for more information I'd like to redirect readers to an earlier blog post about free and open-source software, so that they can form their own opinion on this.
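
The effect of lazy evaluation mentioned in the first point can be illustrated with a small sketch. The attribute names below are made up for this example: one attribute would fail if it were ever evaluated, but because we only ask for the other one, nothing goes wrong:

```nix
let
  pkgs = {
    # Stand-in for a package that nobody asks for; never evaluated
    unused = throw "this attribute is never evaluated";

    # Stand-in for the package we actually want
    wanted = "cpio-2.11";
  };
in
  pkgs.wanted
```

Evaluating this expression (for example with nix-instantiate --eval) yields "cpio-2.11". The throw is never triggered, because nothing demands the value of the unused attribute. The same principle is what allows a huge composition expression to be used while only the requested package and its dependencies are evaluated.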

Concluding remarks


In this blog post I have covered GNU Guix, a package manager recently announced by the GNU project, which uses Nix under the hood to achieve its desired non-functional properties, such as transactional upgrades and reproducible builds. Guix differs from Nix in that it offers an internal DSL using Scheme (through GNU Guile), instead of the Nix expression language, an external DSL.

The main reason why I wrote this blog post is that GNU Guix has appeared on many news sites, which often copy details from each other. They do not always check facts, but just copy what others are saying. These news messages may sometimes suggest that GNU Guix is something entirely new and providing revolutionary new features to make deployment reliable and reproducible.

Moreover, they may suggest that these features are exclusive to GNU Guix, as they only mention Nix briefly (or sometimes not at all). These suggestions are not true -- these properties were already in Nix and Nix has been designed with these goals in mind. Currently, GNU Guix is merely a front-end to Nix and inherits these properties because of that. At some point in the future, this may give people the impression that "Nix is something like GNU Guix", whereas it should be exactly the opposite. Furthermore, I'm a researcher and I have to look at stuff critically.

Finally, I'd like to stress that there is no schism in the Nix project and that I have nothing against the Guix effort, although I'm speaking on behalf of myself and not the entire Nix community. Furthermore, Nix is mentioned on the GNU Guix website and covered in more detail in the GNU Guix presentation and README.

Although I find it an interesting experiment, I don't see any benefits yet in using Guile over the Nix expression language. I have raised many questions in this blog post and its usefulness still has to be proven, in my opinion.

4 comments:

  1. It would really be nice if the Guix project made a point of giving credit to Nix.

  2. Hi Jason,

    I know that the Guix maintainer has no intention to give people the false impression that the purely functional deployment aspects are exclusive to Guix or to create a hostile "fork".

    Although the main website only briefly mentions: "Guix is based on the Nix package manager.", the README and the presentation make the relationship between Guix and Nix very clear. Furthermore, the Guix maintainer is also a regular Nix contributor and heavily involved with some crucial aspects of the Nix packages repository.

    It's just how external news sites "sell" the story. They sometimes may give their audience a false impression, because they only briefly mention (or copy) what others were saying, without checking the facts themselves.

  3. In my opinion using an internal DSL often makes it easier to learn because if you are lucky, you are already familiar with the host language, otherwise there is already a ton of learning resources available. Also, the usefulness of the knowledge stretches further than that application.

    I do not have much experience with Scheme either, but the Guix DSL feels more intuitive and less awkward to me. (I have played around a bit with Nix expressions before). It doesn't feel worth my time to learn about all the quirks and edge cases for a DSL that is only used for a package manager, but I would gladly do that for Guile.

    There is also the issue of effort duplication, both in designing and implementing the host language. In my humble opinion, the Guile language feels a lot more well designed than the Nix DSL. That is not strange because a lot more time, effort and knowledge has been put into the design.
