swh.lister.fedora.lister module

swh.lister.fedora.lister.FedoraPageType

Each page is a list of packages from a given Fedora (release, edition) pair.

alias of Type[Repo]

swh.lister.fedora.lister.get_editions(release: int) → List[str]

Get list of editions for a given release.

swh.lister.fedora.lister.get_last_modified(pkg: Package) → datetime

Get timezone aware last modified time in UTC from RPM package metadata.
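As a minimal sketch of what "timezone aware ... in UTC" means in practice, the snippet below converts a POSIX timestamp into an aware datetime. It assumes the RPM metadata exposes the time as seconds since the epoch. The helper name `last_modified_from_epoch` is hypothetical and is not part of the lister.

```python
from datetime import datetime, timezone


def last_modified_from_epoch(build_time: int) -> datetime:
    """Hypothetical helper: turn epoch seconds into a timezone-aware UTC datetime."""
    # Passing tz=timezone.utc yields an aware datetime instead of a naive local one.
    return datetime.fromtimestamp(build_time, tz=timezone.utc)


epoch_start = last_modified_from_epoch(0)
one_day_later = last_modified_from_epoch(86400)
```

An aware datetime carries its offset, so values from different mirrors can be compared without guessing the local timezone.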

swh.lister.fedora.lister.get_checksums(pkg: Package) → Dict[str, str]

Get the checksums associated with an RPM archive.
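The return type Dict[str, str] suggests a mapping from algorithm name to hex digest. How the lister obtains the digests is not shown here. The sketch below only illustrates that shape by computing digests locally with `hashlib`, and the helper name `checksums_for` is hypothetical.

```python
import hashlib
from typing import Dict, Iterable


def checksums_for(data: bytes, algorithms: Iterable[str] = ("sha256",)) -> Dict[str, str]:
    """Hypothetical helper: map each algorithm name to the hex digest of `data`."""
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}


empty_archive_checksums = checksums_for(b"")
```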

class swh.lister.fedora.lister.FedoraListerState(package_versions: Dict[str, Set[str]] = <factory>)

Bases: object

State of the Fedora lister.

package_versions: Dict[str, Set[str]]

Dictionary mapping a package name to all the versions found during the last listing.

class swh.lister.fedora.lister.FedoraLister(scheduler: SchedulerInterface, instance: str = 'fedora', url: str = 'https://archives.fedoraproject.org/pub/archive/fedora/linux/releases/', releases: List[int] = [34, 35, 36], max_origins_per_page: Optional[int] = None, max_pages: Optional[int] = None, enable_origins: bool = True)

Bases: Lister[FedoraListerState, Type[Repo]]

List source packages for given Fedora releases.

The lister will create a snapshot for each package name from all its available versions.

If a package snapshot is different from the last listing operation, it will be sent to the scheduler that will create a loading task to archive newly found source code.
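The comparison against the last listing can be sketched as a per-package comparison of version sets. This is an illustration under assumptions, not the lister's actual code: the function name `changed_packages` is hypothetical, and "different" is taken to mean that the set of versions for a package name is not equal to the one recorded previously.

```python
from typing import Dict, Set


def changed_packages(
    previous: Dict[str, Set[str]], current: Dict[str, Set[str]]
) -> Set[str]:
    """Hypothetical helper: names whose version set differs from the last listing."""
    return {
        name
        for name, versions in current.items()
        # A package never seen before compares against an empty set.
        if previous.get(name, set()) != versions
    }


last_listing = {"bash": {"5.1-1"}, "zsh": {"5.8-1"}}
this_listing = {"bash": {"5.1-1", "5.2-1"}, "zsh": {"5.8-1"}, "fish": {"3.5-1"}}
to_send = changed_packages(last_listing, this_listing)
```

Only "bash" (a new version) and "fish" (a new package) would be sent in this example; "zsh" is unchanged and is skipped.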

Parameters:
  • scheduler – instance of SchedulerInterface

  • url – Fedora package archives mirror URL

  • releases – list of Fedora releases to process

LISTER_NAME: str = 'fedora'

listed_origins: Dict[FedoraOrigin, ListedOrigin]

Holds information about all listed origins.

origins_to_send: Set[FedoraOrigin]

Holds the origins updated since the last listing.

package_versions: Dict[PkgName, Set[PkgVersion]]

Contains the lister state after a call to run.

state_from_dict(d: Dict[str, Any]) → FedoraListerState

Convert the state stored in the scheduler backend (as a dict), to the concrete StateType for this lister.

state_to_dict(state: FedoraListerState) → Dict[str, Any]

Convert the StateType for this lister to its serialization as dict for storage in the scheduler.

Values must be JSON-compatible as that’s what the backend database expects.
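Because the state holds sets, which the standard `json` module cannot serialize, a round trip has to convert them to lists and back. The sketch below shows one way to do that. The class `SketchState` and the functions `to_dict` and `from_dict` are stand-ins written for this example, not the lister's own implementation.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class SketchState:
    """Stand-in for FedoraListerState, used only in this example."""
    package_versions: Dict[str, Set[str]] = field(default_factory=dict)


def to_dict(state: SketchState) -> Dict[str, Any]:
    # Sets are not JSON-compatible, so store each one as a sorted list.
    return {
        "package_versions": {
            name: sorted(versions)
            for name, versions in state.package_versions.items()
        }
    }


def from_dict(d: Dict[str, Any]) -> SketchState:
    # Rebuild the sets; a missing key yields an empty state.
    return SketchState(
        package_versions={
            name: set(versions)
            for name, versions in d.get("package_versions", {}).items()
        }
    )


state = SketchState({"bash": {"5.2-1", "5.1-1"}})
serialized = json.dumps(to_dict(state))
restored = from_dict(json.loads(serialized))
```

Sorting the lists keeps the serialized form stable between runs, which makes stored states easier to compare.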

page_request(release: int, edition: str) → Type[Repo]

Return parsed packages for a given Fedora release and edition.

get_pages() → Iterator[Type[Repo]]

Return an iterator on parsed Fedora packages, one page per (release, edition) pair.
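The "one page per (release, edition) pair" iteration can be sketched as two nested loops. The edition names returned by `sketch_editions` are placeholder values chosen for this example, and both function names are hypothetical. In the real module the editions come from get_editions and each pair is fetched through page_request.

```python
from typing import Iterator, List, Tuple


def sketch_editions(release: int) -> List[str]:
    """Hypothetical stand-in for get_editions; the values are placeholders."""
    return ["Everything", "Server"]


def iter_page_keys(releases: List[int]) -> Iterator[Tuple[int, str]]:
    """Yield one (release, edition) key per page, in listing order."""
    for release in releases:
        for edition in sketch_editions(release):
            yield release, edition


page_keys = list(iter_page_keys([34, 35]))
```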

origin_url_for_package(package_name: str) → str

Return the origin URL for the given package.

get_origins_from_page(page: Type[Repo]) → Iterator[ListedOrigin]

Convert a page of Fedora package sources into an iterator of ListedOrigin.

finalize()

Custom hook to finalize the lister state before returning from the main loop.

This method must set the updated attribute if the lister has done some work.

If relevant, this method can use get_state_from_scheduler() to merge the current lister state with the one from the scheduler backend, reducing the risk of race conditions if we’re running concurrent listings.

This method is called in a finally block, which means it will also run when the lister fails.
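The consequence of running in a finally block can be illustrated with a toy main loop. `SketchLister` is written for this example and does not reproduce the base Lister class. It only shows that finalize still runs, and can still record partial work, when page iteration raises.

```python
class SketchLister:
    """Toy lister used only to illustrate the try/finally behaviour."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.seen = 0
        self.updated = False
        self.finalized = False

    def get_pages(self):
        yield ["a", "b"]
        if self.fail:
            raise RuntimeError("simulated listing failure")
        yield ["c"]

    def finalize(self) -> None:
        self.finalized = True
        # Mark the state as updated only if some work was actually done.
        if self.seen:
            self.updated = True

    def run(self) -> None:
        try:
            for page in self.get_pages():
                self.seen += len(page)
        finally:
            # Runs on success and on failure alike.
            self.finalize()


ok = SketchLister()
ok.run()

failing = SketchLister(fail=True)
try:
    failing.run()
except RuntimeError:
    pass  # The error still propagates after finalize has run.
```

A finalize hook therefore has to tolerate partially populated state, since a failure can interrupt the listing at any page.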