Source code for swh.lister.puppet

# Copyright (C) 2022  The Software Heritage developers
# See the AUTHORS file at the top-level directory of this distribution
# License: GNU General Public License version 3, or any later version
# See top-level LICENSE file for more information

Puppet lister

The Puppet lister lists origins from `Puppet Forge`_.
Puppet Forge is a package manager for Puppet modules.

As of September 2022, `Puppet Forge`_ lists 6917 package names.

Origins retrieving strategy

To get a list of all package names, we call the `http api endpoint`_, which exposes a
`getModules`_ operation.
It returns a paginated list of results and a ``next`` URL pointing to the following page.
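
The pagination loop described above can be sketched as follows. This is a minimal
illustration, not the lister's actual code: ``iter_pages`` and the injected ``fetch``
callable are hypothetical names, and the assumption that the ``next`` URL sits under a
``pagination`` key (and is null or absent on the last page) should be checked against
the API documentation.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Page = Dict[str, Any]


def iter_pages(
    fetch: Callable[[str], Page], first_url: str
) -> Iterator[List[Dict[str, Any]]]:
    """Yield the ``results`` list of each page, following ``next`` links.

    ``fetch`` is a hypothetical callable that performs the HTTP GET and
    returns the decoded JSON body. It is injected so the loop can be
    exercised without network access.
    """
    url: Optional[str] = first_url
    while url:
        page = fetch(url)
        yield page.get("results", [])
        # Assumption: the ``next`` url is nested under ``pagination`` and is
        # null or missing once the last page has been reached.
        url = (page.get("pagination") or {}).get("next")
```

Passing the HTTP call in as a parameter keeps the loop independent of any particular
HTTP client and makes it easy to test with canned pages.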

The API follows the `OpenAPI 3.0 specification`.

The lister is incremental: it uses the ``with_release_since`` API argument, whose value
is an ISO date set to the last time the lister was executed, stored as
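
As a rough sketch of the incremental behaviour, the query parameters for a run could be
derived from the previous listing date as shown below. ``listing_params`` is a
hypothetical helper, and the exact date format the API expects for
``with_release_since`` is assumed here to be what ``datetime.isoformat()`` produces.

```python
from datetime import datetime
from typing import Dict, Optional


def listing_params(last_listing_date: Optional[datetime]) -> Dict[str, str]:
    """Build the query parameters for one listing run (hypothetical helper)."""
    params: Dict[str, str] = {}
    if last_listing_date is not None:
        # Incremental run: only ask for modules that had a release since the
        # previous execution. Assumption: an ISO 8601 string is accepted.
        params["with_release_since"] = last_listing_date.isoformat()
    # On the very first run there is no stored date, so everything is listed.
    return params
```

On a first run the helper returns no filter, so the full list is fetched. Later runs
only request modules released after the stored date.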

Page listing

Each page returns a list of ``results``, which is the raw data from the API response.
A page holds 100 results, since 100 is the maximum page size (limit) allowed by the API.

Origins from page

The lister yields up to one hundred origin URLs per page, one per result.

The origin URL is the HTML page corresponding to a package name on the forge, following
this pattern::


For each origin, ``last_update`` is set from the module's ``updated_at`` value.
As the API also returns all existing versions of a package, we build an ``artifacts``
list in ``extra_loader_arguments`` with the archive tarball corresponding to each
existing version.
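
The mapping from per-version records to artifact entries can be sketched like this. It
is only an illustration: ``build_artifacts`` is a hypothetical helper, the input keys
(``url``, ``version``, ``created_at``, ``checksums``) are assumed names rather than the
API's real field names, and deriving ``filename`` from the last path component of the
URL is likewise an assumption.

```python
from typing import Any, Dict, List


def build_artifacts(releases: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Map per-version release records to artifact entries.

    The input keys used here are hypothetical; the real API response may
    name them differently.
    """
    artifacts = []
    for release in releases:
        url = release["url"]
        artifacts.append(
            {
                "url": url,
                "version": release["version"],
                # Assumption: the tarball name is the last path component.
                "filename": url.rsplit("/", 1)[-1],
                "last_update": release["created_at"],
                # Copy so later changes to the input do not leak through.
                "checksums": dict(release.get("checksums", {})),
            }
        )
    return artifacts
```

The result would then be passed along as
``extra_loader_arguments = {"artifacts": build_artifacts(releases)}``.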

Example for ``file_concat`` module located at

    {
        "artifacts": [
            {
                "url": "",  # noqa: B950
                "version": "1.0.1",
                "filename": "electrical-file_concat-1.0.1.tar.gz",
                "last_update": "2015-04-17T01:03:46-07:00",
                "checksums": {
                    "md5": "74901a89544134478c2dfde5efbb7f14",
                    "sha256": "15e973613ea038d8a4f60bafe2d678f88f53f3624c02df3157c0043f4a400de6",  # noqa: B950
                },
            },
            {
                "url": "",  # noqa: B950
                "version": "1.0.0",
                "filename": "electrical-file_concat-1.0.0.tar.gz",
                "last_update": "2015-04-09T12:03:13-07:00",
                "checksums": {
                    "length": 13289,
                },
            },
        ],
    }

Running tests

Activate the virtualenv and run from within the swh-lister directory::

   pytest -s -vv --log-cli-level=DEBUG swh/lister/puppet/tests

Testing with Docker

Change directory to swh/docker, then launch the Docker environment::

   docker compose up -d

Then schedule a Puppet listing task::

   docker compose exec swh-scheduler swh scheduler task add -p oneshot list-puppet

You can follow the lister's execution by displaying the logs of the swh-lister service::

   docker compose logs -f swh-lister

.. _Puppet Forge:
.. _http api endpoint:
.. _getModules:


def register():
    from .lister import PuppetLister

    return {
        "lister": PuppetLister,
        "task_modules": ["%s.tasks" % __name__],
    }