swh.lister.hex.lister module

swh.lister.hex.lister.get_tar_url(pkg_name: str, release_version: str)

class swh.lister.hex.lister.HexListerState(page_updated_at: str = '0001-01-01T00:00:00.000000Z')

Bases: object

The HexLister instance state. This is used for incremental listing.

page_updated_at: str = '0001-01-01T00:00:00.000000Z'

updated_at value of the last seen package in the page.

class swh.lister.hex.lister.HexLister(scheduler: SchedulerInterface, url: str = 'https://hex.pm/api/', instance: str = 'hex', page_size: int = 100, credentials: Dict[str, Dict[str, List[Dict[str, str]]]] | None = None, max_origins_per_page: int | None = None, max_pages: int | None = None, enable_origins: bool = True)

Bases: Lister[HexListerState, List[Dict[str, Any]]]

List origins from Hex.pm.

LISTER_NAME: str = 'hex'
VISIT_TYPE = 'hex'
HEX_API_URL = 'https://hex.pm/api/'
PACKAGES_PATH = 'packages/'

state_from_dict(d: Dict[str, Any]) → HexListerState

Convert the state stored in the scheduler backend (as a dict) to the concrete StateType for this lister.

state_to_dict(state: HexListerState) → Dict[str, Any]

Convert the StateType for this lister to its serialization as a dict for storage in the scheduler.

Values must be JSON-compatible as that’s what the backend database expects.
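
The round trip between the state object and its dict form can be illustrated with a minimal sketch. SketchState is a hypothetical stand-in that mirrors the single documented field and default of HexListerState; modelling it as a dataclass and the two module-level functions are assumptions made for illustration, not the lister's actual code.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass
class SketchState:
    """Hypothetical stand-in for HexListerState (same field, same default)."""

    page_updated_at: str = "0001-01-01T00:00:00.000000Z"


def state_to_dict(state: SketchState) -> Dict[str, Any]:
    # asdict() yields plain str values here, which are JSON-compatible.
    return asdict(state)


def state_from_dict(d: Dict[str, Any]) -> SketchState:
    # An empty dict falls back to the default, i.e. a full first listing.
    return SketchState(**d)


state = SketchState(page_updated_at="2023-01-15T12:30:00.000000Z")
# Simulate storage in a JSON-backed scheduler database.
stored = json.loads(json.dumps(state_to_dict(state)))
restored = state_from_dict(stored)
```

Keeping the timestamp as a string rather than a datetime object is what lets the state pass through JSON without a custom encoder.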

get_pages() → Iterator[List[Dict[str, Any]]]

Retrieve a list of pages of listed results. This is the main loop of the lister.

Returns:

an iterator of raw pages fetched from the platform currently being listed.
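
As an illustration of such a main loop, the sketch below pages through results using an updated_at cursor. It does not call the Hex.pm API: fetch_page is a hypothetical callable, and both the stop condition (a page shorter than page_size) and the cursor handling are assumptions about how an incremental lister of this kind might work.

```python
from typing import Any, Callable, Dict, Iterator, List

Page = List[Dict[str, Any]]


def iter_pages(
    fetch_page: Callable[[str], Page],
    start_updated_at: str,
    page_size: int = 100,
) -> Iterator[Page]:
    """Yield raw pages until an empty or short page signals the end.

    fetch_page is assumed to return packages updated after the given
    timestamp, sorted by updated_at in ascending order.
    """
    cursor = start_updated_at
    while True:
        page = fetch_page(cursor)
        if not page:
            break
        yield page
        if len(page) < page_size:
            break
        # Move the cursor to the last package seen in this page.
        cursor = page[-1]["updated_at"]


# Made-up data standing in for the remote index.
PACKAGES = [
    {"name": "a", "updated_at": "2023-01-01T00:00:00.000000Z"},
    {"name": "b", "updated_at": "2023-01-02T00:00:00.000000Z"},
    {"name": "c", "updated_at": "2023-01-03T00:00:00.000000Z"},
]


def fake_fetch(updated_after: str) -> Page:
    # Fixed-width UTC timestamps compare correctly as plain strings.
    newer = [p for p in PACKAGES if p["updated_at"] > updated_after]
    return newer[:2]


pages = list(iter_pages(fake_fetch, "0001-01-01T00:00:00.000000Z", page_size=2))
```

With a page size of 2, the three fake packages come back as one full page followed by one short page, which ends the loop.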

get_origins_from_page(page: List[Dict[str, Any]]) → Iterator[ListedOrigin]

Convert a page of Hex.pm packages into an iterator of ListedOrigin objects.
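
A rough sketch of this conversion follows. SketchOrigin is a hypothetical, reduced stand-in for ListedOrigin, and the html_url and updated_at keys are assumptions about the shape of a package entry in a page; the real method may read different fields and attach further metadata.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, List


@dataclass
class SketchOrigin:
    """Hypothetical, reduced stand-in for ListedOrigin."""

    url: str
    visit_type: str
    last_update: str


def origins_from_page(page: List[Dict[str, Any]]) -> Iterator[SketchOrigin]:
    for pkg in page:
        yield SketchOrigin(
            url=pkg["html_url"],  # assumed key holding the package page URL
            visit_type="hex",  # matches the documented VISIT_TYPE
            last_update=pkg["updated_at"],  # assumed key, kept as a string here
        )


page = [
    {
        "name": "phoenix",
        "html_url": "https://hex.pm/packages/phoenix",
        "updated_at": "2023-01-15T12:30:00.000000Z",
    }
]
origins = list(origins_from_page(page))
```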

commit_page(page: List[Dict[str, Any]]) → None

Custom hook called after the current page has been committed in the scheduler backend.

This method can be used to update the state after a page of origins has been successfully recorded in the scheduler backend. If the new state should be recorded at the point the lister completes, the updated attribute must be set.
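
One way such a hook might advance an incremental cursor is sketched below. SketchState and SketchLister are hypothetical stand-ins, and reading the updated_at key of the last package in the page is an assumption based on the description of page_updated_at, not the lister's confirmed behaviour.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class SketchState:
    """Hypothetical stand-in for HexListerState."""

    page_updated_at: str = "0001-01-01T00:00:00.000000Z"


class SketchLister:
    """Hypothetical lister holding only the state and the updated flag."""

    def __init__(self) -> None:
        self.state = SketchState()
        self.updated = False

    def commit_page(self, page: List[Dict[str, Any]]) -> None:
        if not page:
            return
        last_seen = page[-1]["updated_at"]
        # Fixed-width UTC timestamps compare correctly as plain strings.
        if last_seen > self.state.page_updated_at:
            self.state.page_updated_at = last_seen
            # Mark the state so that it is recorded when the lister completes.
            self.updated = True


lister = SketchLister()
lister.commit_page(
    [
        {"name": "a", "updated_at": "2023-01-01T00:00:00.000000Z"},
        {"name": "b", "updated_at": "2023-01-02T00:00:00.000000Z"},
    ]
)
```

Updating the cursor only after a page has been committed means a crash between pages cannot move the cursor past origins that were never recorded.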

finalize() → None

Custom hook to finalize the lister state before returning from the main loop.

This method must set updated if the lister has done some work.

If relevant, this method can use get_state_from_scheduler() to merge the current lister state with the one from the scheduler backend, reducing the risk of race conditions if we’re running concurrent listings.

This method is called in a finally block, which means it will also run when the lister fails.
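
The consequence of running finalize in a finally block can be seen in the following sketch. SketchLister, its run method, and the fail_on parameter are hypothetical and exist only to simulate a failure partway through a listing; they are not part of the swh.lister API.

```python
from typing import List, Optional


class SketchLister:
    """Hypothetical lister showing finalize() being called from finally."""

    def __init__(self, pages: List[List[str]], fail_on: Optional[int] = None) -> None:
        self.pages = pages
        self.fail_on = fail_on  # index of the page on which to simulate a failure
        self.listed = 0
        self.updated = False

    def run(self) -> None:
        try:
            for index, page in enumerate(self.pages):
                if index == self.fail_on:
                    raise RuntimeError("simulated fetch failure")
                self.listed += len(page)
        finally:
            # Runs on success and on failure alike.
            self.finalize()

    def finalize(self) -> None:
        # Flag the state for saving only if some work was done.
        self.updated = self.listed > 0


lister = SketchLister([["a", "b"], ["c"]], fail_on=1)
try:
    lister.run()
except RuntimeError:
    failed = True
else:
    failed = False
```

Here the second page fails, yet finalize still runs and flags the work done on the first page, so progress made before the failure is not lost.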