swh.lister.golang.lister module

class swh.lister.golang.lister.GolangStateType(last_seen: datetime.datetime | None = None)

Bases: object

last_seen: datetime | None = None

Timestamp of the most recent package version that has been saved. It is used as the starting point for an incremental listing.

class swh.lister.golang.lister.GolangLister(scheduler: SchedulerInterface, url: str = 'https://index.golang.org/index', instance: str = 'golang', incremental: bool = False, credentials: Dict[str, Dict[str, List[Dict[str, str]]]] | None = None, max_origins_per_page: int | None = None, max_pages: int | None = None, enable_origins: bool = True)

Bases: Lister[GolangStateType, List[Dict[str, Any]]]

List all Golang modules and send the associated origins to the scheduler.

The lister queries the Golang module index, which is documented at https://index.golang.org.

GOLANG_MODULES_INDEX_URL = 'https://index.golang.org/index'
GOLANG_MODULES_INDEX_LIMIT = 2000
LISTER_NAME: str = 'golang'
state_from_dict(d: Dict[str, Any]) -> GolangStateType

Convert the state stored in the scheduler backend (as a dict) to the concrete StateType for this lister.

state_to_dict(state: GolangStateType) -> Dict[str, Any]

Convert the StateType for this lister to its serialization as a dict for storage in the scheduler.

Values must be JSON-compatible, as that is what the backend database expects.
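The round trip between the two conversion methods can be illustrated with a minimal sketch. It is not the lister's actual implementation: SketchState, sketch_state_to_dict and sketch_state_from_dict are hypothetical names, and storing the timestamp as an ISO 8601 string is an assumption chosen only because a datetime is not JSON-serializable by itself.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, Optional


@dataclass
class SketchState:
    """Hypothetical stand-in for GolangStateType."""

    last_seen: Optional[datetime] = None


def sketch_state_to_dict(state: SketchState) -> Dict[str, Any]:
    # Assumption: the timestamp is stored as an ISO 8601 string (or None),
    # so that the resulting dict can be dumped to JSON.
    return {"last_seen": state.last_seen.isoformat() if state.last_seen else None}


def sketch_state_from_dict(d: Dict[str, Any]) -> SketchState:
    # A missing or empty value means that no listing has been recorded yet.
    raw = d.get("last_seen")
    return SketchState(last_seen=datetime.fromisoformat(raw) if raw else None)


state = SketchState(last_seen=datetime(2023, 1, 2, 3, 4, 5, tzinfo=timezone.utc))
as_dict = sketch_state_to_dict(state)
json_text = json.dumps(as_dict)  # succeeds because every value is JSON-compatible
restored = sketch_state_from_dict(json.loads(json_text))
```

After the round trip, restored compares equal to state, and an empty dict maps back to a state whose last_seen is None.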

finalize()

Custom hook to finalize the lister state before returning from the main loop.

This method must set the updated attribute if the lister has done some work.

If relevant, this method can use get_state_from_scheduler() to merge the current lister state with the one from the scheduler backend, reducing the risk of race conditions when listings run concurrently.

This method is called in a finally block, which means it will also run when the lister fails.
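One way to picture the merge described for finalize() is the sketch below. It is an assumption about how such a merge could work, not the lister's code: merge_last_seen is a hypothetical helper that keeps the later of the local and scheduler-side timestamps and reports whether the local run moved the state forward.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple


def merge_last_seen(
    local: Optional[datetime], remote: Optional[datetime]
) -> Tuple[Optional[datetime], bool]:
    """Return (merged timestamp, updated flag).

    Hypothetical helper: `local` is the timestamp reached by this run and
    `remote` the one currently stored in the scheduler backend.
    """
    if local is None:
        # This run saw nothing, so the stored value is kept untouched.
        return remote, False
    if remote is None or local > remote:
        # This run went further than what is stored: record it.
        return local, True
    # A concurrent run already stored a later timestamp: do not go backwards.
    return remote, False


earlier = datetime(2023, 1, 1, tzinfo=timezone.utc)
later = datetime(2023, 6, 1, tzinfo=timezone.utc)

merged_forward = merge_last_seen(later, earlier)
merged_stale = merge_last_seen(earlier, later)
merged_empty = merge_last_seen(None, earlier)
```

Keeping the maximum of the two timestamps means that a slower concurrent listing cannot overwrite the progress recorded by a faster one.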

api_request(url: str) -> List[str]
get_single_page(since: datetime | None = None) -> Tuple[List[Dict[str, Any]], datetime | None]

Return a page from the API and the timestamp of its last entry. Since all entries are sorted in chronological order, the timestamp is useful both for pagination and, later, for incremental runs.
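The shape of a page and its last timestamp can be sketched without any network access. The example assumes that the index returns one JSON object per line with Path, Version and Timestamp keys, as described in the index documentation. parse_index_page and the sample lines are hypothetical, and the parsing relies on timestamps with exactly six fractional digits; real data may need a more lenient ISO 8601 parser.

```python
import json
from datetime import datetime
from typing import Any, Dict, List, Optional, Tuple


def parse_index_page(
    lines: List[str],
) -> Tuple[List[Dict[str, Any]], Optional[datetime]]:
    """Turn raw index lines into (entries, timestamp of the last entry)."""
    page: List[Dict[str, Any]] = []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        entry = json.loads(line)
        # Assumption: timestamps end with "Z"; rewrite it as an explicit
        # UTC offset so that datetime.fromisoformat accepts the value.
        entry["Timestamp"] = datetime.fromisoformat(
            entry["Timestamp"].replace("Z", "+00:00")
        )
        page.append(entry)
    # Entries are in chronological order, so the last one is the most recent.
    last_seen = page[-1]["Timestamp"] if page else None
    return page, last_seen


# Made-up sample data in the assumed format.
sample_lines = [
    '{"Path":"example.com/a","Version":"v1.0.0","Timestamp":"2023-01-01T00:00:00.000001Z"}',
    '{"Path":"example.com/b","Version":"v0.2.0","Timestamp":"2023-01-01T00:00:05.500000Z"}',
]
page, last_seen = parse_index_page(sample_lines)
```

The returned last_seen is what a caller would pass back as since to request the following page.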

get_pages() -> Iterator[List[Dict[str, Any]]]

Retrieve a list of pages of listed results. This is the main loop of the lister.

Returns:

an iterator of raw pages fetched from the platform currently being listed.
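The main loop can be pictured as timestamp-driven pagination: each page's last timestamp becomes the since value of the next request. The sketch below is an assumption about that pattern rather than the lister's code. iter_pages, fake_fetch and FAKE_INDEX are hypothetical, and the fake index returns entries strictly newer than since, which may differ from the behaviour of the real service.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple

Page = List[Dict[str, Any]]
Fetch = Callable[[Optional[datetime]], Tuple[Page, Optional[datetime]]]


def iter_pages(fetch: Fetch, since: Optional[datetime] = None) -> Iterator[Page]:
    """Yield pages until the source is exhausted."""
    while True:
        page, last_seen = fetch(since)
        if not page:
            return  # nothing left to list
        yield page
        if last_seen is None or last_seen == since:
            return  # no progress was made; stop instead of looping forever
        since = last_seen


START = datetime(2023, 1, 1, tzinfo=timezone.utc)
# Made-up in-memory index of five entries, one second apart.
FAKE_INDEX = [
    {"Path": f"example.com/mod{i}", "Timestamp": START + timedelta(seconds=i)}
    for i in range(5)
]


def fake_fetch(
    since: Optional[datetime], limit: int = 2
) -> Tuple[Page, Optional[datetime]]:
    # Assumption: only entries strictly newer than `since` are returned.
    rows = [e for e in FAKE_INDEX if since is None or e["Timestamp"] > since]
    rows = rows[:limit]
    return rows, (rows[-1]["Timestamp"] if rows else None)


pages = list(iter_pages(fake_fetch))
page_sizes = [len(p) for p in pages]
```

Passing a since value to iter_pages resumes from that point, which is the idea behind an incremental run.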

get_origins_from_page(page: List[Dict[str, Any]]) -> Iterator[ListedOrigin]

Iterate on all Golang projects and yield ListedOrigin instances.
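Turning a page into origins can be sketched as follows. Everything here is illustrative: SketchOrigin is a hypothetical stand-in for ListedOrigin, the "https://pkg.go.dev/" URL prefix and the "golang" visit type are assumptions, and skipping repeated module paths within a page is a design choice of this sketch, since the index lists one entry per module version.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, List, Set


@dataclass(frozen=True)
class SketchOrigin:
    """Hypothetical, reduced stand-in for ListedOrigin."""

    url: str
    visit_type: str


def sketch_origins_from_page(
    page: List[Dict[str, Any]],
    base_url: str = "https://pkg.go.dev/",  # assumed URL prefix
) -> Iterator[SketchOrigin]:
    seen: Set[str] = set()
    for entry in page:
        path = entry["Path"]
        if path in seen:
            continue  # several versions of one module map to a single origin
        seen.add(path)
        yield SketchOrigin(url=base_url + path, visit_type="golang")


# Made-up page: two versions of one module and one version of another.
sample_page = [
    {"Path": "example.com/a", "Version": "v1.0.0"},
    {"Path": "example.com/a", "Version": "v1.1.0"},
    {"Path": "example.com/b", "Version": "v0.1.0"},
]
origins = list(sketch_origins_from_page(sample_page))
```

Because the function is a generator, origins are produced lazily and a caller can stop consuming them at any point.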