swh.lister.pubdev package#


Module contents:

Pub.dev lister#

The Pubdev lister lists origins from pub.dev, the Dart and Flutter package registry.

The registry provides an HTTP API from which the lister retrieves package names.

As of August 2022, pub.dev lists 33535 package names.

Origins retrieving strategy#

To get a list of all package names, we call the https://pub.dev/api/package-names endpoint. There is no other means of discovery (no archive index, no database dump, no DVCS repository).
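As a rough illustration of the call described above, the following sketch fetches the endpoint with the Python standard library and extracts the names. It assumes the response body is a JSON object with a `packages` list; the function names `parse_package_names` and `fetch_package_names` are hypothetical and are not taken from the lister's code.

```python
import json
from urllib.request import urlopen

PACKAGE_NAMES_URL = "https://pub.dev/api/package-names"


def parse_package_names(payload: dict) -> list:
    """Extract package names from a decoded response body.

    Assumption: the body is a JSON object holding a "packages" list.
    A missing key yields an empty list.
    """
    return list(payload.get("packages", []))


def fetch_package_names(url: str = PACKAGE_NAMES_URL) -> list:
    """Fetch the endpoint and return the package names (network call)."""
    with urlopen(url, timeout=30) as response:
        return parse_package_names(json.load(response))
```

Keeping the parsing step separate from the network call makes it possible to exercise the parsing on a recorded response without contacting pub.dev.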

Origins from page#

The lister yields all origin URLs from a single page.
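A minimal sketch of this step, assuming each origin URL has the form `https://pub.dev/packages/{pkgname}`. The constant `ORIGIN_URL_PATTERN` and the function `origins_from_page` are illustrative names, not necessarily those used by the lister.

```python
from typing import Iterable, Iterator

# Assumed origin URL layout; check the lister source for the real pattern.
ORIGIN_URL_PATTERN = "https://pub.dev/packages/{pkgname}"


def origins_from_page(package_names: Iterable[str]) -> Iterator[str]:
    """Yield one origin URL per package name found in the page."""
    for pkgname in package_names:
        yield ORIGIN_URL_PATTERN.format(pkgname=pkgname)
```

For example, `list(origins_from_page(["http"]))` would give `["https://pub.dev/packages/http"]` under the assumed pattern.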

Getting last update date for each package#

Before sending a listed pubdev origin to the scheduler, we query the https://pub.dev/api/packages/{pkgname} endpoint to get the last update date for the package (the date of its latest release). This enables Software Heritage to create a new loading task for a package only if it has new releases since the last visit.
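The idea can be sketched as follows. This assumes the package endpoint returns a JSON object whose `latest` entry carries an ISO 8601 `published` timestamp. The helpers `parse_last_update` and `has_new_release` are hypothetical and only illustrate the comparison; the actual scheduling decision is made elsewhere in Software Heritage.

```python
from datetime import datetime
from typing import Optional


def parse_last_update(package_info: dict) -> datetime:
    """Return the publication date of the latest release.

    Assumption: package_info["latest"]["published"] is an ISO 8601 string
    such as "2022-08-01T12:00:00.000000Z".
    """
    published = package_info["latest"]["published"]
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(published.replace("Z", "+00:00"))


def has_new_release(last_update: datetime, last_visit: Optional[datetime]) -> bool:
    """True if the package was never visited or was updated since the last visit."""
    return last_visit is None or last_update > last_visit
```

Both datetimes must be timezone-aware for the comparison to work; the sketch produces an aware value from the `Z` suffix.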

Running tests#

Activate the virtualenv and run the following from within the swh-lister directory:

pytest -s -vv --log-cli-level=DEBUG swh/lister/pubdev/tests

Testing with Docker#

Change directory to swh/docker, then launch the Docker environment:

docker compose up -d

Then schedule a pubdev listing task:

docker compose exec swh-scheduler swh scheduler task add -p oneshot list-pubdev

You can follow the lister's execution by displaying the logs of the swh-lister service:

docker compose logs -f swh-lister