swh.lister.rpm.lister module
- swh.lister.rpm.lister.RPMPageType
Each page is a list of packages for a given (release, component) pair from a Red Hat based distribution.
- class swh.lister.rpm.lister.RPMSourceData
Bases: TypedDict
Dictionary holding relevant data for listing RPM source packages.
See the content of the lister config directory for examples of RPM source data for well-known Red Hat based distributions.
- index_url_templates: List[str]
List of URL templates used to discover source package metadata. The following variables can be substituted in them: base_url, release and edition; see string.Template for more details about the format. The generated URLs must target directories containing a sub-directory named repodata, which holds the package metadata, in order to be processed successfully by the lister.
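The substitution described above can be sketched with the standard library alone. The template string, mirror URL, release and edition values below are hypothetical and chosen only for illustration; real values live in the lister config directory.

```python
from string import Template

# Hypothetical template and values, not taken from any real lister config.
index_url_template = Template("$base_url/$release/$edition/source/tree/")

index_url = index_url_template.substitute(
    base_url="https://mirror.example.org/pub/distro",
    release="39",
    edition="Everything",
)

# The generated URL is expected to point at a directory that contains
# a "repodata" sub-directory holding the package metadata.
repodata_url = index_url + "repodata/"

print(index_url)
print(repodata_url)
```

Note that `Template.substitute` raises `KeyError` when a variable used in the template is not supplied, which makes a misconfigured template fail early rather than produce a malformed URL.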
- class swh.lister.rpm.lister.RPMListerState(package_versions: Dict[str, Set[str]] = <factory>)
Bases: object
State of the RPM lister.
- class swh.lister.rpm.lister.RPMLister(scheduler: SchedulerInterface, url: str, instance: str, rpm_src_data: List[RPMSourceData], incremental: bool = False, max_origins_per_page: int | None = None, max_pages: int | None = None, enable_origins: bool = True, credentials: Dict[str, Dict[str, List[Dict[str, str]]]] | None = None)
Bases: Lister[RPMListerState, Tuple[str, str, Repo] | None]
List source packages for a Red Hat based Linux distribution.
The lister creates a snapshot for each package from all its available versions.
In incremental mode, only packages whose snapshot has changed since the last listing operation are sent to the scheduler, which then creates loading tasks to archive the newly found source code.
- Parameters:
scheduler – instance of SchedulerInterface
url – Red Hat based distribution info URL
instance – name of Red Hat based distribution
rpm_src_data – list of dictionaries holding data required to list RPM source packages, see examples in the config directory.
incremental – if True, only packages with new versions are sent to the scheduler when relisting
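As a rough illustration of what incremental mode implies, the sketch below compares the versions recorded during a previous listing with those seen in the current one. The function name `packages_to_send` and the sample package data are hypothetical; this is one plausible reading of the description above, not the lister's actual code.

```python
from typing import Dict, List, Set


def packages_to_send(
    previous: Dict[str, Set[str]],
    current: Dict[str, Set[str]],
    incremental: bool,
) -> List[str]:
    """Return the names of packages that would be sent to the scheduler.

    Assumption: a package counts as changed when its set of versions
    differs from the one recorded by the previous listing.
    """
    if not incremental:
        # Full listing: every package seen is sent.
        return sorted(current)
    return sorted(
        name for name, versions in current.items()
        if previous.get(name) != versions
    )


# Hypothetical listing data.
previous = {"bash": {"5.2-1"}, "zlib": {"1.3-1"}}
current = {"bash": {"5.2-1", "5.2-2"}, "zlib": {"1.3-1"}, "curl": {"8.0-1"}}

changed = packages_to_send(previous, current, incremental=True)
everything = packages_to_send(previous, current, incremental=False)
```

Here `changed` contains only `bash` (new version) and `curl` (new package), while `everything` also includes the unchanged `zlib`.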
- state_from_dict(d: Dict[str, Any]) → RPMListerState
Convert the state stored in the scheduler backend (as a dict) to the concrete StateType for this lister.
- state_to_dict(state: RPMListerState) → Dict[str, Any]
Convert the StateType for this lister to its serialization as dict for storage in the scheduler.
Values must be JSON-compatible as that’s what the backend database expects.
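Because the state holds sets of versions and JSON has no set type, some conversion is needed in both directions. The sketch below uses a stand-in dataclass that mirrors the `package_versions` field shown in the class signature; the two free functions and the sample data are illustrative assumptions, not the lister's actual implementation.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class RPMListerState:
    """Stand-in mirroring the documented field, for illustration only."""
    package_versions: Dict[str, Set[str]] = field(default_factory=dict)


def state_to_dict(state: RPMListerState) -> Dict[str, Any]:
    # Sets are not JSON-serializable, so store each one as a sorted list.
    return {
        "package_versions": {
            name: sorted(versions)
            for name, versions in state.package_versions.items()
        }
    }


def state_from_dict(d: Dict[str, Any]) -> RPMListerState:
    # Rebuild the sets from the lists stored in the backend.
    return RPMListerState(
        package_versions={
            name: set(versions)
            for name, versions in d.get("package_versions", {}).items()
        }
    )


# Hypothetical state used to check the round trip.
state = RPMListerState(package_versions={"bash": {"5.2-2", "5.2-1"}})
serialized = json.dumps(state_to_dict(state))
restored = state_from_dict(json.loads(serialized))
```

Sorting the versions is optional; it only makes the stored representation deterministic.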
- repo_request(index_url_template: Template, base_url: str, release: str, component: str) → Tuple[str, str, Repo] | None
Return parsed packages for a given distribution release and component.
- get_pages() → Iterator[Tuple[str, str, Repo] | None]
Return an iterator on parsed RPM packages, one page per (release, component) pair.
- origin_url_for_package(package_name: str) → str
Return the origin URL for the given package.
- get_origins_from_page(page: Tuple[str, str, Repo] | None) → Iterator[ListedOrigin]
Convert a page of RPM package sources into an iterator of ListedOrigin.
- finalize()
Custom hook to finalize the lister state before returning from the main loop.
This method must set updated if the lister has done some work.
If relevant, this method can use get_state_from_scheduler() to merge the current lister state with the one from the scheduler backend, reducing the risk of race conditions if we’re running concurrent listings.
This method is called in a finally block, which means it will also run when the lister fails.
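One way to picture the merge mentioned above is a per-package union of version sets, so that versions recorded by a concurrent listing are not lost. The helper `merge_package_versions` and the sample data are hypothetical; whether and how the RPM lister merges state in finalize() is not stated here.

```python
from typing import Dict, Set


def merge_package_versions(
    local: Dict[str, Set[str]],
    from_scheduler: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Union the versions known locally with those stored in the backend."""
    # Copy the backend state so neither input is mutated.
    merged = {name: set(versions) for name, versions in from_scheduler.items()}
    for name, versions in local.items():
        merged.setdefault(name, set()).update(versions)
    return merged


# Hypothetical states from two concurrent listings.
local = {"bash": {"5.2-2"}, "curl": {"8.0-1"}}
from_scheduler = {"bash": {"5.2-1"}, "zlib": {"1.3-1"}}

merged = merge_package_versions(local, from_scheduler)
```

With the data above, `merged` keeps both `bash` versions and retains the packages known to only one side.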