swh.lister.cpan package#
Submodules#
Module contents#
Cpan lister#
The Cpan lister lists origins from cpan.org, the Comprehensive Perl Archive Network, which provides search features via metacpan.org.
As of September 2022, cpan.org lists 43675 package names.
Origins retrieving strategy#
To get the list of all package names and their associated release artifacts, we first call
an HTTP API endpoint that returns results together with a _scroll_id.
That identifier is then used to scroll through the remaining pages via the search endpoint.
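The scrolling strategy above can be sketched as a small generator. This is a minimal illustration, not the lister's actual code: the helper name `scroll_pages`, the two callables `fetch_first` and `fetch_next`, and the Elasticsearch-style response shape (`{"_scroll_id": ..., "hits": {"hits": [...]}}`) are all assumptions made for the example. The HTTP calls are injected as callables so that the sketch stays independent of any particular endpoint URL.

```python
from typing import Any, Callable, Dict, Iterator, List

Response = Dict[str, Any]


def scroll_pages(
    fetch_first: Callable[[], Response],
    fetch_next: Callable[[str], Response],
) -> Iterator[List[Dict[str, Any]]]:
    """Yield pages of raw results until the API returns an empty page.

    fetch_first: hypothetical callable performing the initial search request.
    fetch_next: hypothetical callable taking a _scroll_id and returning the
        next page of the scroll.
    """
    response = fetch_first()
    while True:
        # Assumed response shape: results are nested under hits.hits.
        hits = response.get("hits", {}).get("hits", [])
        if not hits:
            break  # an empty page marks the end of the scroll
        yield hits
        scroll_id = response.get("_scroll_id")
        if scroll_id is None:
            break  # nothing to scroll with, stop here
        response = fetch_next(scroll_id)
```

In a real client, `fetch_first` and `fetch_next` would wrap the HTTP requests. Keeping them outside the generator also makes the pagination logic easy to exercise with canned responses.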
Page listing#
Each page returns a list of results, which are the raw data from the API response.
Origins from page#
The origin URL is the HTML page corresponding to a package name on metacpan.org, following this pattern:
"https://metacpan.org/dist/{pkgname}"
Running tests#
Activate the virtualenv and run the following from within the swh-lister directory:
pytest -s -vv --log-cli-level=DEBUG swh/lister/cpan/tests
Testing with Docker#
Change directory to swh/docker, then launch the docker environment:
docker compose up -d
Then schedule a Cpan listing task:
docker compose exec swh-scheduler swh scheduler task add -p oneshot list-cpan
You can follow the lister execution by displaying the logs of the swh-lister service:
docker compose logs -f swh-lister