swh.lister.cpan.lister module

swh.lister.cpan.lister.get_field_value(entry, field_name)

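Splits field_name on "." and uses the resulting keys as a path into the nested entry dictionary, starting from its "_source" key. Returns None if a value does not exist. As the last example shows, a list found at the end of the path is unwrapped to its element.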

>>> entry = {"_source": {"foo": 1, "bar": {"baz": 2, "qux": [3]}}}
>>> get_field_value(entry, "foo")
1
>>> get_field_value(entry, "bar")
{'baz': 2, 'qux': [3]}
>>> get_field_value(entry, "bar.baz")
2
>>> get_field_value(entry, "bar.qux")
3
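
The dotted-path lookup described above can be sketched as follows. This is a minimal illustration reconstructed from the docstring and its examples, not the module's actual source: the name lookup_dotted is hypothetical, and both the handling of non-dictionary intermediate values and the unwrapping of lists to their first element are assumptions.

```python
from typing import Any, Dict, Optional


def lookup_dotted(entry: Dict[str, Any], field_name: str) -> Optional[Any]:
    """Hypothetical stand-in for get_field_value, for illustration only."""
    # The docstring examples keep all fields under the "_source" key.
    value: Any = entry["_source"]
    for key in field_name.split("."):
        if not isinstance(value, dict):
            # Assumption: a path that runs through a non-dict yields None.
            return None
        value = value.get(key)
    # Assumption drawn from the last docstring example: a list at the end
    # of the path is reduced to its first element.
    if isinstance(value, list):
        value = value[0] if value else None
    return value


entry = {"_source": {"foo": 1, "bar": {"baz": 2, "qux": [3]}}}
print(lookup_dotted(entry, "bar.baz"))      # 2
print(lookup_dotted(entry, "bar.qux"))      # 3
print(lookup_dotted(entry, "bar.missing"))  # None
```
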
swh.lister.cpan.lister.get_module_version(module_name: str, module_version: str | float | int, release_name: str) → str
class swh.lister.cpan.lister.CpanLister(scheduler: SchedulerInterface, url: str = 'https://fastapi.metacpan.org/v1', instance: str = 'cpan', credentials: Dict[str, Dict[str, List[Dict[str, str]]]] | None = None, max_origins_per_page: int | None = None, max_pages: int | None = None, enable_origins: bool = True)

Bases: StatelessLister[Set[str]]

The CPAN lister lists origins from CPAN, the Comprehensive Perl Archive Network.

LISTER_NAME: str = 'cpan'
VISIT_TYPE = 'cpan'
INSTANCE = 'cpan'
API_BASE_URL = 'https://fastapi.metacpan.org/v1'
REQUIRED_DOC_FIELDS = ['download_url', 'checksum_sha256', 'distribution', 'version']
OPTIONAL_DOC_FIELDS = ['date', 'author', 'stat.size', 'name', 'metadata.author']
ORIGIN_URL_PATTERN = 'https://metacpan.org/dist/{module_name}'
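
As an illustration of what REQUIRED_DOC_FIELDS might be used for, the sketch below reports which required fields are absent from a release document. How the lister actually applies these lists is not stated here, so the helper missing_required, the flat document layout, and the sample values are all hypothetical.

```python
from typing import Any, Dict, List

# Value copied from the class attribute above.
REQUIRED_DOC_FIELDS = ["download_url", "checksum_sha256", "distribution", "version"]


def missing_required(doc: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: list the required fields absent from doc."""
    return [name for name in REQUIRED_DOC_FIELDS if doc.get(name) is None]


# Made-up release document, for illustration only.
doc = {
    "download_url": "https://example.org/Some-Dist-1.0.tar.gz",
    "distribution": "Some-Dist",
    "version": "1.0",
}
print(missing_required(doc))  # ['checksum_sha256']
```
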
process_release_page(page: List[Dict[str, Any]])
get_pages() → Iterator[Set[str]]

Returns an iterator over pages, where each page is a set of module names.

get_origins_from_page(module_names: Set[str]) → Iterator[ListedOrigin]

Iterates over the module names in a page and yields ListedOrigin instances.
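
The way get_pages and get_origins_from_page fit together can be sketched without the Software Heritage stack. Everything below is a stand-in: SketchOrigin is a hypothetical substitute for ListedOrigin carrying only two fields, the two functions take made-up inputs instead of querying the MetaCPAN API, and the distribution names are invented. Only the URL pattern and the 'cpan' visit type come from the class attributes above.

```python
from dataclasses import dataclass
from typing import Iterator, List, Set

# Value copied from CpanLister.ORIGIN_URL_PATTERN.
ORIGIN_URL_PATTERN = "https://metacpan.org/dist/{module_name}"


@dataclass(frozen=True)
class SketchOrigin:
    """Hypothetical stand-in for ListedOrigin (url and visit type only)."""
    url: str
    visit_type: str


def get_pages_sketch(batches: List[List[str]]) -> Iterator[Set[str]]:
    # Each page is a set of module names, matching Iterator[Set[str]].
    for batch in batches:
        yield set(batch)


def get_origins_from_page_sketch(module_names: Set[str]) -> Iterator[SketchOrigin]:
    # Sorting is only for a deterministic order in this sketch.
    for name in sorted(module_names):
        yield SketchOrigin(
            url=ORIGIN_URL_PATTERN.format(module_name=name),
            visit_type="cpan",
        )


batches = [["Dist-A", "Dist-B"], ["Dist-C"]]  # invented names
origins = [
    origin
    for page in get_pages_sketch(batches)
    for origin in get_origins_from_page_sketch(page)
]
for origin in origins:
    print(origin.url)
```

Using a set per page means a module name that appears several times within one page produces a single origin.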