Software Heritage - Development Documentation

Components

Here is a brief overview of the most relevant software components in the Software Heritage stack. Each component name is linked to the development documentation of the corresponding Python module.

swh.archiver
orchestrator in charge of guaranteeing that object storage content is pristine and available in a sufficient number of copies
swh.core
low-level utilities and helpers used by almost all other modules in the stack
swh.deposit
push-based deposit of software artifacts to the archive
swh.docs
developer documentation (used to generate the documentation you are reading)
swh.indexer
tools and workers used to crawl the content of the archive and extract derived information from any artifact stored in it
swh.journal
persistent logger of changes to the archive, with publish-subscribe support
swh.lister
collection of listers for all sorts of source code hosting and distribution places (forges, distributions, package managers, etc.)
swh.loader-core
low-level loading utilities and helpers used by all other loaders
swh.loader-debian
loader for Debian source packages
swh.loader-dir
loader for source directories (e.g., expanded tarballs)
swh.loader-git
loader for Git repositories
swh.loader-mercurial
loader for Mercurial repositories
swh.loader-svn
loader for Subversion repositories
swh.loader-tar
loader for source tarballs (including Tar, ZIP and other archive formats)
swh.model
implementation of the data model used to archive source code artifacts
swh.objstorage
content-addressable object storage
swh.scheduler
task manager for asynchronous/delayed tasks, used for recurrent (e.g., listing a forge, loading new content from a Git repository) and one-off activities (e.g., loading a specific version of a source package)
swh.storage
abstraction layer over the archive, providing access to all stored source code artifacts as well as their metadata
swh.vault
implementation of the vault service, which allows retrieving parts of the archive as self-contained bundles (e.g., individual releases, entire repository snapshots, etc.)
swh.web
Web application(s) to browse the archive, for both interactive (HTML UI) and mechanized (REST API) use
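
To make the term "content-addressable object storage" used for swh.objstorage more concrete, here is a minimal sketch of the general idea: an object's identifier is derived from a hash of its bytes, so identical content is stored once and can be checked for corruption later (the kind of "pristine" check attributed to swh.archiver). This is not the swh.objstorage API; the class name, method names, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


class ToyObjStorage:
    """Hypothetical in-memory content-addressable store (illustration only)."""

    def __init__(self):
        self._objects = {}  # maps hex digest -> raw bytes

    def add(self, content: bytes) -> str:
        # The identifier is computed from the content itself (assumed: SHA-256).
        obj_id = hashlib.sha256(content).hexdigest()
        # Adding the same bytes twice is idempotent: same key, same value.
        self._objects[obj_id] = content
        return obj_id

    def get(self, obj_id: str) -> bytes:
        # Raises KeyError if the object is unknown.
        return self._objects[obj_id]

    def check(self, obj_id: str) -> bool:
        # An object is intact if re-hashing its bytes yields its identifier.
        return hashlib.sha256(self._objects[obj_id]).hexdigest() == obj_id

    def __len__(self) -> int:
        return len(self._objects)


store = ToyObjStorage()
first = store.add(b"hello, archive")
second = store.add(b"hello, archive")  # duplicate content, no new object
```

Because the key is a function of the content, deduplication and integrity checking come for free; a real deployment would add persistence and replication on top.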

Dependencies

The dependency relationships among the various modules are depicted below.

_images/py-deps-swh.svg

Dependencies among top-level Python modules.
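
As a rough sketch of how dependencies among top-level modules could be derived, the snippet below scans Python source for imports of the form swh.<module> using the standard ast module. It is not the tool used to produce the figure; the function name and the sample source are hypothetical and serve only to illustrate the technique.

```python
import ast


def swh_dependencies(source: str) -> set:
    """Return the top-level swh.* modules imported by a piece of Python source."""
    deps = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from X import Y"; relative imports are skipped.
            names = [node.module]
        else:
            continue
        for name in names:
            parts = name.split(".")
            # Keep only "swh.<module>", dropping deeper submodules.
            # Note: "from swh import x" is not handled in this sketch.
            if parts[0] == "swh" and len(parts) >= 2:
                deps.add(".".join(parts[:2]))
    return deps


# Hypothetical module source, for illustration only; it is parsed, not executed.
sample = """
import os
import swh.core.config
from swh.model import hashutil
from . import local_helper
"""

print(sorted(swh_dependencies(sample)))  # prints ['swh.core', 'swh.model']
```

Running such a function over every file of a module and taking the union of the results yields the outgoing edges for that module in a dependency graph.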