Frequently Asked Questions
Generally, only Python and basic Git knowledge are required to contribute. Other than that, it really depends on what technical areas you want to work on.
For student internships, the internships page details specific prerequisites needed to pick up a topic.
Feel free to contact us via our development channels to inquire about the specific skills needed to work on any topic that interests you.
Python 3.7 or newer is required. See the developer setup documentation for more details.
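As a quick sanity check before following the setup documentation, the version requirement can be verified from Python itself. This is only a minimal sketch: the helper name `meets_minimum_version` is hypothetical and is not part of the Software Heritage code base.

```python
import sys

# Minimum interpreter version stated above.
MINIMUM_VERSION = (3, 7)


def meets_minimum_version(version_info=None):
    """Return True if the given (or running) interpreter is recent enough."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); tuples compare element by element.
    return tuple(version_info[:2]) >= MINIMUM_VERSION


if __name__ == "__main__":
    if meets_minimum_version():
        print("Python version is recent enough.")
    else:
        print("Python 3.7 or newer is required.")
```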
We recommend reading the top links listed on the documentation home page, in order: getting started, contributing, and architecture overview, as well as the data model.
For hacking on the Software Heritage code base you should start from the Developer setup tutorial.
Either way, feel free to contact our developers through any of the development channels; we would love to work with you.
The Run your own Software Heritage tutorial shows how to run a local instance of the Software Heritage software infrastructure, using Docker.
You can set up a job on your local machine, for example by scheduling a listing task. Doing so on a small forge will allow you to load some repositories.
Alternatively, you can trigger a loading directly from the CLI.
See the “Managing tasks” chapter in the Docker environment documentation.
We cannot right now. Either stay anonymous, or use the user “test” (password “test”) or the user “ambassador” (password “ambassador”).
Please report it on our bug tracking system. First create an account, then create a bug report using the “Create task” button. You should get some feedback within a week (at least someone triaging your issue). If not, get in touch with us to make sure we did not miss it.
Yes, on your first diff, you will have to sign such a document. As long as it is not signed, your diff content won’t be visible.
Mostly, run tox (or pytest) to run the unit test suite. When you propose a patch in our forge, the continuous integration factory will trigger a build (using tox as well).
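For readers new to pytest, the sketch below shows the general shape of a unit test that `pytest` would collect. The function `normalize_origin_url` and the file name are hypothetical, made up for illustration; they are not part of any Software Heritage module.

```python
# test_example.py (hypothetical file name; pytest collects files named test_*.py)


def normalize_origin_url(url):
    """Hypothetical helper: strip surrounding whitespace and trailing slashes."""
    return url.strip().rstrip("/")


def test_normalize_origin_url_strips_trailing_slash():
    # pytest discovers functions prefixed with "test_" and reports
    # a failure when an assert statement does not hold.
    assert normalize_origin_url("https://example.org/repo/") == "https://example.org/repo"


def test_normalize_origin_url_strips_whitespace():
    assert normalize_origin_url("  https://example.org/repo  ") == "https://example.org/repo"
```

Running `pytest` in the directory that contains such a file executes both tests.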
It’s left to the developer’s discretion. Most people hack on their feature, then propose a diff from a git branch or directly from the master branch. There is no mandated workflow; the only requirement is that, for a feature to be packaged and deployed, it must first land in the master branch.
Any new feature should include documentation in the form of comments and/or docstrings. Ideally, it should also be documented in plain English in the repository’s docs/ folder if relevant to a single package, or in the main swh-docs repository if it is a transversal feature.
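As an illustration of the comments-and-docstrings level of documentation, here is a minimal sketch. The function `count_visits` and its parameters are hypothetical, and the docstring layout (summary line, Args, Returns) is one common convention assumed here, not one taken from the project's guidelines.

```python
def count_visits(visits, status="full"):
    """Count the visits that ended with a given status.

    Args:
        visits: iterable of dicts, each expected to carry a "status" key.
        status: the status value to count (defaults to "full").

    Returns:
        The number of visits whose "status" equals ``status``.
    """
    # A generator expression avoids building an intermediate list.
    return sum(1 for visit in visits if visit.get("status") == status)
```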
Release is mostly done:
- first in docker (somewhat as part of the development process)
- secondly packaged and deployed on staging (mostly)
- thirdly the same package is deployed on production
When a functionality is ready (tests OK, landed in master, docker run OK), the module is tagged and the tag is pushed. This triggers a packaging build process. When the package is ready, depending on the module, sysadmins deploy the package with the help of Puppet.
Deployment of the swh-web module is mostly automatic. Other modules are not yet automated, because internal state migrations (databases) often enter the release cycle and, due to the data volume, may need human intervention.