How to deploy a mirror
This section describes how to deploy a mirror using the software stack provided by Software Heritage.
A mirror deployment consists of running several components of the Software Heritage stack:
An instance of the storage (Software Heritage - Storage);
A backend database (PostgreSQL or Cassandra) for the storage;
An instance of the object storage (Software Heritage - Object storage);
A large storage system (ZFS or cloud storage) as the objstorage backend;
An instance of the frontend (Software Heritage - Web applications);
[Optional] An instance of the search engine backend (Software Heritage - Search service);
[Optional] An Elasticsearch instance as the swh-search backend;
[Optional] The vault service and its support tooling (RabbitMQ, Software Heritage - Job scheduler, Software Heritage - Vault, …);
The replayer services:
swh.storage.replay service (part of the Software Heritage - Storage package);
swh.objstorage.replayer.replay service (from the Software Heritage - Object storage replayer package).
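As a planning aid, the component list above can be written down as a small inventory that separates the mandatory parts from the optional ones. This is only an illustrative sketch: the `Component` class, the short keys, the feature names `search` and `vault`, and the `select_components` helper are hypothetical names chosen for this example and are not part of any Software Heritage package.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Component:
    key: str                       # hypothetical short label, used only in this sketch
    description: str               # wording taken from the component list
    feature: Optional[str] = None  # None means the component is always required


MIRROR_COMPONENTS = [
    Component("storage", "Software Heritage - Storage"),
    Component("storage-db", "PostgreSQL or Cassandra backend for the storage"),
    Component("objstorage", "Software Heritage - Object storage"),
    Component("objstorage-backend", "large storage system (ZFS or cloud storage)"),
    Component("web", "Software Heritage - Web applications"),
    Component("search", "Software Heritage - Search service", feature="search"),
    Component("elasticsearch", "Elasticsearch instance for swh-search", feature="search"),
    Component("vault", "vault service and its support tooling", feature="vault"),
    Component("storage-replayer", "storage replayer service"),
    Component("objstorage-replayer", "object storage replayer service"),
]


def select_components(features: Iterable[str] = ()) -> List[str]:
    """Return the keys of the components needed for the enabled optional features."""
    enabled = set(features)
    return [
        c.key
        for c in MIRROR_COMPONENTS
        if c.feature is None or c.feature in enabled
    ]
```

Calling `select_components()` returns only the mandatory components, while `select_components(["search", "vault"])` returns the full list.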
Each service is exposed as an HTTP-based RPC API served by a gunicorn WSGI server.
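To make the "HTTP-based RPC behind a WSGI server" idea concrete, here is a minimal sketch of a WSGI callable exposing a single RPC-style endpoint. It is not the RPC layer used by the Software Heritage services; the `rpc_app` name, the `/echo` path, and the JSON payload format are assumptions made purely for illustration.

```python
import json


def rpc_app(environ, start_response):
    """Minimal WSGI callable with one hypothetical RPC endpoint, POST /echo."""
    if environ["REQUEST_METHOD"] == "POST" and environ.get("PATH_INFO") == "/echo":
        # Read the request body and decode it as JSON.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = environ["wsgi.input"].read(length) or b"{}"
        payload = json.loads(raw)
        # Send the decoded arguments back as the RPC result.
        body = json.dumps({"result": payload}).encode("utf-8")
        status = "200 OK"
    else:
        body = json.dumps({"error": "not found"}).encode("utf-8")
        status = "404 Not Found"

    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

Assuming the code is saved as `example_rpc.py` (a hypothetical file name), gunicorn can serve it with `gunicorn --bind 127.0.0.1:5002 example_rpc:rpc_app`. The real services are started the same way in principle, each with its own WSGI entry point and configuration.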
This is a lot of services to configure and orchestrate. To help you get started with the configuration of a mirror, a Docker Swarm based deployment is provided as a working example of the mirror stack.
It is strongly recommended to start from this example in a test environment before planning a production-like deployment.