Nix and Guix
Nix/Guix is currently archived by Software Heritage:
Developed by Tweag thanks to a grant from the NLnet Foundation
This page documents how Software Heritage archives source packages from the GNU Guix and Nix distributions.
Both distributions provide functional package managers with similar properties (e.g. transactional, declarative up to the operating system, reproducible, …). Packages are defined in each distribution's own DSL. Because these definitions are not easily parsable and no listing API existed, the community set up a regular extraction of the origin listings as JSON manifests.
Software Heritage’s swh.lister.nixguix.lister.NixGuix lister queries each of those manifests. As they contain various types of origins, Software Heritage uses a different loader to ingest each of them, depending on what the origin URL targets:
simple file: swh.loader.core.loader.ContentLoader ingests origins with type content.
tarball: swh.loader.core.loader.TarballDirectoryLoader ingests origins with type tarball-directory.
SVN repository: swh.loader.svn.loader.SvnLoader ingests origins with type svn.
SVN repository at a specific revision: swh.loader.svn.directory.SvnExportLoader ingests origins with type svn-export.
Git repository: swh.loader.git.loader.GitLoader ingests origins with type git.
Git repository at a specific git commit: swh.loader.git.directory.GitCheckoutLoader ingests origins with type git-checkout.
Mercurial repository: swh.loader.mercurial.loader.HgLoader ingests origins with type hg.
Mercurial repository at a specific changeset: swh.loader.mercurial.directory.HgCheckoutLoader ingests origins with type hg-checkout.
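The correspondence between visit types and loaders listed above can be summarized as a lookup table. This is only an illustrative sketch: the dictionary and the loader_for helper are hypothetical names introduced here, and this is not how Software Heritage actually dispatches visits to loaders.

```python
# Visit type -> dotted path of the loader class, as listed above.
# LOADER_BY_VISIT_TYPE and loader_for are hypothetical names used for
# illustration only.
LOADER_BY_VISIT_TYPE = {
    "content": "swh.loader.core.loader.ContentLoader",
    "tarball-directory": "swh.loader.core.loader.TarballDirectoryLoader",
    "svn": "swh.loader.svn.loader.SvnLoader",
    "svn-export": "swh.loader.svn.directory.SvnExportLoader",
    "git": "swh.loader.git.loader.GitLoader",
    "git-checkout": "swh.loader.git.directory.GitCheckoutLoader",
    "hg": "swh.loader.mercurial.loader.HgLoader",
    "hg-checkout": "swh.loader.mercurial.directory.HgCheckoutLoader",
}


def loader_for(visit_type: str) -> str:
    """Return the dotted path of the loader handling ``visit_type``."""
    try:
        return LOADER_BY_VISIT_TYPE[visit_type]
    except KeyError:
        raise ValueError(f"unsupported visit type: {visit_type!r}") from None
```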
Origin URLs match each main url provided in the manifest.
In some cases, such as content or tarball URLs, the manifest may also provide mirror URLs. They are used as a fallback to retrieve the artifact when the main URL is no longer available.
Neither extrinsic nor intrinsic metadata is collected on the lister’s side.
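The fallback behavior can be sketched as follows. This is a minimal sketch, not the loaders' actual code: fetch_with_fallback is a hypothetical helper, and the fetch argument stands for any download function that raises OSError on failure.

```python
from typing import Callable, Iterable, Tuple


def fetch_with_fallback(
    main_url: str,
    mirror_urls: Iterable[str],
    fetch: Callable[[str], bytes],
) -> Tuple[str, bytes]:
    """Try the main URL first, then each mirror in order.

    Returns the URL that worked together with the retrieved bytes.
    Hypothetical helper: ``fetch`` is assumed to raise OSError on failure.
    """
    errors = []
    for url in [main_url, *mirror_urls]:
        try:
            return url, fetch(url)
        except OSError as exc:
            # Remember why this URL failed, then move on to the next one.
            errors.append(f"{url}: {exc}")
    raise RuntimeError("no URL could be retrieved: " + "; ".join(errors))
```

Whichever URL ends up serving the artifact, the origin URL stays the main URL from the manifest.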
For some origin visit types (content, tarball-directory, svn-export, hg-checkout, git-checkout), extra intrinsic information such as the artifact checksums (standard, e.g. sha256, or nar, a specific intrinsic identifier used by GNU Guix and Nix, see swh.loader.core.nar.Nar) is transmitted to the loaders.
During ingestion, if the checksum(s) do not match, the artifact is rejected and the visit is marked as failed. Otherwise, the artifact is ingested.
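The check for standard checksums can be illustrated with Python's hashlib. This is a minimal sketch under stated assumptions: verify_checksums is a hypothetical helper, not the loaders' API, and it does not cover nar checksums, which are computed over the NAR serialization of the artifact rather than over its raw bytes.

```python
import hashlib
from typing import Dict


def verify_checksums(data: bytes, expected: Dict[str, str]) -> None:
    """Raise ValueError if any expected checksum does not match ``data``.

    ``expected`` maps a hashlib algorithm name (e.g. "sha256") to the
    expected hex digest. Hypothetical helper, standard checksums only.
    """
    for algo, expected_hex in expected.items():
        actual_hex = hashlib.new(algo, data).hexdigest()
        if actual_hex != expected_hex.lower():
            # A mismatch means the artifact is rejected.
            raise ValueError(
                f"{algo} mismatch: expected {expected_hex}, got {actual_hex}"
            )
```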
The snapshot resulting from the visit targets either a content, for the loading of a file (visit type content), or a directory, for a tarball (visit type tarball-directory) or a VCS repository at a specific commit (git-checkout, svn-export, hg-checkout). Ingesting a full VCS repository (git, svn, hg) produces the usual standard snapshot.
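As a quick summary of the rule above, the kind of object a snapshot targets can be derived from the visit type. The function name and the "standard" label are hypothetical and used here for illustration only.

```python
# Visit types whose snapshot targets a single directory.
DIRECTORY_VISIT_TYPES = {
    "tarball-directory",
    "git-checkout",
    "svn-export",
    "hg-checkout",
}
# Visit types ingesting a full VCS repository.
VCS_VISIT_TYPES = {"git", "svn", "hg"}


def snapshot_target_kind(visit_type: str) -> str:
    """Return what the snapshot of a visit of ``visit_type`` targets."""
    if visit_type == "content":
        return "content"
    if visit_type in DIRECTORY_VISIT_TYPES:
        return "directory"
    if visit_type in VCS_VISIT_TYPES:
        # Illustrative label for the usual snapshot of a VCS repository.
        return "standard"
    raise ValueError(f"unsupported visit type: {visit_type!r}")
```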
Note also that a new entry is recorded in the ExtID table to map the SWHID of the ingested content (for a file) or directory (for the other kinds) to its original checksum.
Sample:
extid_type | extid_version | extid | target_type | target
checksum-sha256 | 1 | x00001a5b5be28bde9bc8c353afe546d8fe84e49b269a70393c1616957b0e1cce | directory | xbe186100480d766ebdf0cfaeac0c90198f4b42e7
nar-sha256 | 1 | x00002584a56a9bce85793515604298f8b3b1e9497e00fc6361a0c2e731c063f3 | directory | x1e1ace5b0ef56e188d3cf99059070cc5448d7454
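To relate a row of the sample to an SWHID, the target column can be rendered in the swh:1:&lt;type&gt;:&lt;hex&gt; form. This sketch assumes that the leading x in the sample values is a hexadecimal prefix left over from the database output and not part of the identifier. to_swhid is a hypothetical helper, not part of the Software Heritage API.

```python
# SWHID object-type codes for the two target types mentioned above.
SWHID_TYPE_CODES = {"content": "cnt", "directory": "dir"}


def to_swhid(target_type: str, target: str) -> str:
    """Render an ExtID target as a core SWHID string.

    Assumption: a leading "x" or "\\x" in ``target`` is only a hex prefix.
    """
    hex_id = target
    for prefix in ("\\x", "x"):
        if hex_id.startswith(prefix):
            hex_id = hex_id[len(prefix):]
            break
    # SWHIDs carry a 20-byte intrinsic identifier, i.e. 40 hex characters.
    if len(bytes.fromhex(hex_id)) != 20:
        raise ValueError(f"unexpected identifier length: {target!r}")
    return f"swh:1:{SWHID_TYPE_CODES[target_type]}:{hex_id}"


print(to_swhid("directory", "xbe186100480d766ebdf0cfaeac0c90198f4b42e7"))
```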
Resources: