swh.lister.gitweb.lister module
- class swh.lister.gitweb.lister.GitwebLister(scheduler: SchedulerInterface, url: str | None = None, instance: str | None = None, base_git_url: str | None = None, credentials: Dict[str, Dict[str, List[Dict[str, str]]]] | None = None, max_origins_per_page: int | None = None, max_pages: int | None = None, enable_origins: bool = True)
Bases: StatelessLister[List[Dict[str, Any]]]
Lister class for Gitweb repositories.
This lister will retrieve the list of published git repositories by parsing the HTML page(s) of the index retrieved at url.
- Parameters:
url – Root URL of the Gitweb instance, i.e. the URL of the index of published git repositories on this instance. Defaults to https://instance if unset.
instance – Name of the Gitweb instance. Defaults to the network location of url if unset.
base_git_url – Base URL to clone a git project hosted on the Gitweb instance. It should only be used if the clone URLs cannot be found when scraping the project page or cannot be easily derived from the root URL of the instance.
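The defaulting rules for url and instance can be illustrated with a minimal sketch. The helper name resolve_url_and_instance is hypothetical and is not part of swh.lister. It only mirrors the documented behaviour using the standard library. Raising ValueError when both arguments are missing is an assumption, and the host git.example.org is a placeholder.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def resolve_url_and_instance(
    url: Optional[str] = None, instance: Optional[str] = None
) -> Tuple[str, str]:
    """Hypothetical helper mirroring the documented defaults (not swh.lister code)."""
    if url is None and instance is None:
        # Assumption: with neither value given, nothing can be derived.
        raise ValueError("at least one of url or instance must be provided")
    if url is None:
        # Documented default: url falls back to https://instance.
        url = f"https://{instance}"
    if instance is None:
        # Documented default: instance falls back to the url's network location.
        instance = urlparse(url).netloc
    return url, instance


# Only the instance name is known: the root URL is derived from it.
print(resolve_url_and_instance(instance="git.example.org"))
# Only the root URL is known: the instance name is its network location.
print(resolve_url_and_instance(url="https://git.example.org/projects/"))
```

In both calls the resolved instance name is git.example.org. Any path component in url is ignored when the name is derived, since only the network location is used.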