swh.lister.hex.lister module
- class swh.lister.hex.lister.HexListerState(page_updated_at: str = '0001-01-01T00:00:00.000000Z')
Bases: object
The HexLister instance state. This is used for incremental listing.
- class swh.lister.hex.lister.HexLister(scheduler: SchedulerInterface, url: str = 'https://hex.pm/api/', instance: str = 'hex', page_size: int = 100, credentials: Dict[str, Dict[str, List[Dict[str, str]]]] | None = None, max_origins_per_page: int | None = None, max_pages: int | None = None, enable_origins: bool = True)
Bases: Lister[HexListerState, List[Dict[str, Any]]]
List origins from Hex.pm.
- VISIT_TYPE = 'hex'
- HEX_API_URL = 'https://hex.pm/api/'
- PACKAGES_PATH = 'packages/'
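The two URL attributes above are a base URL and a relative path. As a minimal sketch, assuming the packages endpoint is simply the base URL joined with the relative path (how the lister actually combines them is not stated here), the composition could look like this:

```python
from urllib.parse import urljoin

# Values copied from the class attributes documented above.
HEX_API_URL = "https://hex.pm/api/"
PACKAGES_PATH = "packages/"

# Assumption: the endpoint is the base URL joined with the relative path.
# The trailing slash on the base URL matters: without it, urljoin would
# replace the last path segment instead of appending to it.
packages_url = urljoin(HEX_API_URL, PACKAGES_PATH)
print(packages_url)  # https://hex.pm/api/packages/
```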
- state_from_dict(d: Dict[str, Any]) → HexListerState
Convert the state stored in the scheduler backend (as a dict) to the concrete StateType for this lister.
- state_to_dict(state: HexListerState) → Dict[str, Any]
Convert the StateType for this lister to its serialization as a dict for storage in the scheduler.
Values must be JSON-compatible as that’s what the backend database expects.
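The round trip between the state object and its dict form can be illustrated with a small stand-in. This is a sketch only: `StateSketch` is a hypothetical dataclass that mirrors the documented `HexListerState` constructor, and the two helper functions show one plausible way to satisfy the JSON-compatibility requirement, not necessarily how the lister does it.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass
class StateSketch:
    """Hypothetical stand-in for HexListerState (same field and default)."""

    page_updated_at: str = "0001-01-01T00:00:00.000000Z"


def state_to_dict(state: StateSketch) -> Dict[str, Any]:
    # A dataclass holding only strings serializes to a JSON-compatible dict.
    return asdict(state)


def state_from_dict(d: Dict[str, Any]) -> StateSketch:
    # An empty dict (no stored state yet) falls back to the default timestamp.
    return StateSketch(**d)


original = StateSketch(page_updated_at="2023-05-01T12:00:00.000000Z")
# Simulate storage in a JSON-backed database and retrieval from it.
stored = json.loads(json.dumps(state_to_dict(original)))
restored = state_from_dict(stored)
```

Keeping the timestamp as a string rather than a `datetime` means the dict needs no custom JSON encoder.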
- get_pages() → Iterator[List[Dict[str, Any]]]
Retrieve a list of pages of listed results. This is the main loop of the lister.
- Returns: an iterator of raw pages fetched from the platform currently being listed.
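A paginated main loop of this kind is usually written as a generator. The sketch below is not the lister's code: `iter_pages` and `fake_fetch` are hypothetical names, the fetch function stands in for an HTTP request, and the stop condition (an empty or short page ends the listing) is an assumption.

```python
from typing import Any, Callable, Dict, Iterator, List

Page = List[Dict[str, Any]]


def iter_pages(fetch_page: Callable[[int], Page], page_size: int = 100) -> Iterator[Page]:
    """Yield raw pages until the backend runs out of results."""
    page_number = 1
    while True:
        page = fetch_page(page_number)
        if not page:
            break  # nothing left to list
        yield page
        if len(page) < page_size:
            break  # assumption: a short page is the last one
        page_number += 1


# Hypothetical in-memory backend holding five packages.
FAKE_PACKAGES = [{"name": f"pkg{i}"} for i in range(5)]


def fake_fetch(page_number: int) -> Page:
    start = (page_number - 1) * 2
    return FAKE_PACKAGES[start:start + 2]


pages = list(iter_pages(fake_fetch, page_size=2))
```

Because the function is a generator, each page is fetched only when the caller asks for it, so a consumer that stops early never triggers the remaining requests.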
- get_origins_from_page(page: List[Dict[str, Any]]) → Iterator[ListedOrigin]
Convert a page of HexLister repositories into an iterator of ListedOrigins.
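The conversion step can be sketched with a stand-in origin type. Everything here is illustrative: `OriginSketch` is a hypothetical simplification of `ListedOrigin`, and the entry keys `html_url` and `updated_at` are assumptions about the shape of a page entry.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Iterator, List


@dataclass
class OriginSketch:
    """Hypothetical, reduced stand-in for ListedOrigin."""

    visit_type: str
    url: str
    last_update: datetime


def origins_from_page(page: List[Dict[str, Any]]) -> Iterator[OriginSketch]:
    for entry in page:
        # Replace the trailing "Z" so fromisoformat() also accepts the
        # timestamp on Python versions older than 3.11.
        updated_at = datetime.fromisoformat(entry["updated_at"].replace("Z", "+00:00"))
        yield OriginSketch(
            visit_type="hex",  # matches the documented VISIT_TYPE
            url=entry["html_url"],  # assumed key name
            last_update=updated_at,
        )


sample_page = [
    {"html_url": "https://hex.pm/packages/example", "updated_at": "2023-05-01T12:00:00.000000Z"}
]
origins = list(origins_from_page(sample_page))
```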
- commit_page(page: List[Dict[str, Any]]) → None
Custom hook called after the current page has been committed in the scheduler backend.
This method can be used to update the state after a page of origins has been successfully recorded in the scheduler backend. If the new state should be recorded at the point the lister completes, the updated attribute must be set.
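For an incremental lister, a natural use of this hook is to advance the stored timestamp to the newest entry of the page just committed. The sketch below assumes exactly that; `CommitSketch` is a hypothetical class, the `updated_at` key is an assumed entry field, and the comparison relies on all timestamps sharing one fixed-width UTC format so that string order equals chronological order.

```python
from typing import Any, Dict, List


class CommitSketch:
    """Hypothetical lister fragment tracking incremental state."""

    def __init__(self) -> None:
        self.page_updated_at = "0001-01-01T00:00:00.000000Z"
        self.updated = False

    def commit_page(self, page: List[Dict[str, Any]]) -> None:
        if not page:
            return
        # Fixed-width ISO 8601 strings in UTC sort chronologically.
        newest = max(entry["updated_at"] for entry in page)
        if newest > self.page_updated_at:
            self.page_updated_at = newest
            self.updated = True  # ask for the state to be persisted


lister = CommitSketch()
lister.commit_page(
    [
        {"updated_at": "2023-05-01T12:00:00.000000Z"},
        {"updated_at": "2023-04-01T08:30:00.000000Z"},
    ]
)
```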
- finalize() → None
Custom hook to finalize the lister state before returning from the main loop.
This method must set updated if the lister has done some work. If relevant, this method can use get_state_from_scheduler() to merge the current lister state with the one from the scheduler backend, reducing the risk of race conditions if we’re running concurrent listings.
This method is called in a finally block, which means it will also run when the lister fails.
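The merge-and-finalize behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the lister's implementation: `FinalizeSketch` is hypothetical, `scheduler_timestamp` stands in for whatever `get_state_from_scheduler` would return, and the merge rule (keep the newer of the two timestamps) is one plausible policy.

```python
from typing import Iterable, Optional

EPOCH = "0001-01-01T00:00:00.000000Z"


class FinalizeSketch:
    def __init__(self, scheduler_timestamp: str = EPOCH) -> None:
        self.scheduler_timestamp = scheduler_timestamp  # stand-in for backend state
        self.page_updated_at = EPOCH
        self.updated = False
        self.finalized = False

    def finalize(self) -> None:
        if self.page_updated_at > self.scheduler_timestamp:
            self.updated = True  # this run made progress worth recording
        else:
            # A concurrent run got further: keep its newer timestamp.
            self.page_updated_at = self.scheduler_timestamp
        self.finalized = True

    def run(self, timestamps: Iterable[Optional[str]]) -> None:
        try:
            for timestamp in timestamps:
                if timestamp is None:
                    raise RuntimeError("simulated listing failure")
                self.page_updated_at = max(self.page_updated_at, timestamp)
        finally:
            self.finalize()  # runs on success and on failure alike


lister = FinalizeSketch(scheduler_timestamp="2023-01-01T00:00:00.000000Z")
try:
    lister.run(["2023-06-01T00:00:00.000000Z", None])
except RuntimeError:
    pass  # the failure still leaves a finalized, merged state behind
```

Since the hook also runs after a failure, it should only rely on state that is valid at every point of the main loop.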