swh.lister.arch.lister module
- class swh.lister.arch.lister.ArchLister(scheduler: SchedulerInterface, url: str = 'https://archlinux.org', instance: str = 'arch', credentials: Dict[str, Dict[str, List[Dict[str, str]]]] | None = None, max_origins_per_page: int | None = None, max_pages: int | None = None, enable_origins: bool = True, flavours: Dict[str, Any] = {'arm': {'archs': ['armv7h', 'aarch64'], 'base_api_url': '', 'base_archive_url': '', 'base_info_url': 'https://archlinuxarm.org', 'base_mirror_url': 'https://uk.mirror.archlinuxarm.org', 'repos': ['core', 'extra', 'community']}, 'official': {'archs': ['x86_64'], 'base_api_url': 'https://archlinux.org', 'base_archive_url': 'https://archive.archlinux.org', 'base_info_url': 'https://archlinux.org', 'base_mirror_url': '', 'repos': ['core', 'extra', 'community']}})
Bases: StatelessLister[List[Dict[str, Any]]]
List Arch Linux origins from the ‘core’, ‘extra’, and ‘community’ repositories.
For ‘official’ Arch Linux, it downloads core.tar.gz, extra.tar.gz and community.tar.gz from https://archive.archlinux.org/repos/last/, extracts them to a temporary directory, and then walks through each ‘desc’ file.
Each ‘desc’ file describes the latest released version of a package and helps build the origin URL from which artifact metadata is scraped.
For ‘arm’ Arch Linux, it follows the same discovery process, parsing ‘desc’ files. The main difference is that existing versions of an arm package cannot be retrieved, because https://archlinuxarm.org has neither an ‘archive’ website nor an API.
- VISIT_TYPE = 'arch'
- INSTANCE = 'arch'
- BASE_URL = 'https://archlinux.org'
- ARCH_PACKAGE_URL_PATTERN = '{base_url}/packages/{repo}/{arch}/{pkgname}'
- ARCH_PACKAGE_VERSIONS_URL_PATTERN = '{base_url}/packages/{pkgname[0]}/{pkgname}'
- ARCH_PACKAGE_DOWNLOAD_URL_PATTERN = '{base_url}/packages/{pkgname[0]}/{pkgname}/{filename}'
- ARCH_API_URL_PATTERN = '{base_url}/packages/{repo}/{arch}/{pkgname}/json'
- ARM_PACKAGE_URL_PATTERN = '{base_url}/packages/{arch}/{pkgname}'
- ARM_PACKAGE_DOWNLOAD_URL_PATTERN = '{base_url}/{arch}/{repo}/{filename}'
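As a rough illustration of how these pattern attributes expand, the sketch below fills three of them with `str.format`. The pattern strings are copied from the attributes above. Which base URL goes with which pattern is an assumption inferred from the example values shown elsewhere on this page, and the variable names are purely illustrative.

```python
# Pattern strings copied from the ArchLister class attributes.
ARCH_PACKAGE_VERSIONS_URL_PATTERN = "{base_url}/packages/{pkgname[0]}/{pkgname}"
ARCH_PACKAGE_DOWNLOAD_URL_PATTERN = (
    "{base_url}/packages/{pkgname[0]}/{pkgname}/{filename}"
)
ARCH_API_URL_PATTERN = "{base_url}/packages/{repo}/{arch}/{pkgname}/json"

# Base URLs from the default 'official' flavour.
# Assumption: archive patterns use base_archive_url, the API pattern uses base_api_url.
archive_url = "https://archive.archlinux.org"
api_base_url = "https://archlinux.org"

# '{pkgname[0]}' indexes into the string, so 'dialog' yields the 'd' shard.
versions_url = ARCH_PACKAGE_VERSIONS_URL_PATTERN.format(
    base_url=archive_url, pkgname="dialog"
)
download_url = ARCH_PACKAGE_DOWNLOAD_URL_PATTERN.format(
    base_url=archive_url,
    pkgname="dialog",
    filename="dialog-1:1.3_20190211-1-x86_64.pkg.tar.xz",
)
api_url = ARCH_API_URL_PATTERN.format(
    base_url=api_base_url, repo="core", arch="x86_64", pkgname="dialog"
)

print(versions_url)
print(download_url)
print(api_url)
```

The `{pkgname[0]}` field relies on the indexing support in Python's format-string syntax, which is why no separate "first letter" argument is needed.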
- scrap_package_versions(name: str, repo: str, base_url: str) → List[Dict[str, Any]]
Given a package ‘name’ and ‘repo’, make an HTTP request to the origin URL and parse the response to get artifact data for each package version. This method is suitable only for ‘official’ Arch Linux, not ‘arm’.
- Parameters:
name – Package name
repo – The repository the package belongs to (one of self.repos)
- Returns:
A list of dicts, one per package version.
Example:
[
    {
        "url": "https://archive.archlinux.org/packages/d/dialog/dialog-1:1.3_20190211-1-x86_64.pkg.tar.xz",  # noqa: B950
        "arch": "x86_64",
        "repo": "core",
        "name": "dialog",
        "version": "1:1.3_20190211-1",
        "filename": "dialog-1:1.3_20190211-1-x86_64.pkg.tar.xz",
        "last_modified": "2019-02-13T08:36:00",
    },
]
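The ‘name’, ‘version’ and ‘arch’ values in the example above can all be read off the ‘filename’ value. The following is a minimal sketch of that relationship, not the lister's actual code: `split_package_filename` is a hypothetical helper, and it assumes the usual Arch naming scheme `<name>-<pkgver>-<pkgrel>-<arch>.pkg.tar.<ext>`, in which only the package name may contain hyphens.

```python
from typing import Dict


def split_package_filename(filename: str) -> Dict[str, str]:
    """Hypothetical helper: split an Arch package filename into its parts."""
    # Drop the '.pkg.tar.<ext>' suffix (ext is e.g. 'xz' or 'zst').
    stem, sep, _ext = filename.partition(".pkg.tar.")
    if not sep:
        raise ValueError(f"not an Arch package filename: {filename!r}")
    # Split from the right so that hyphens inside the package name survive.
    name, pkgver, pkgrel, arch = stem.rsplit("-", 3)
    return {"name": name, "version": f"{pkgver}-{pkgrel}", "arch": arch}


parts = split_package_filename("dialog-1:1.3_20190211-1-x86_64.pkg.tar.xz")
print(parts)
```

Splitting from the right keeps multi-word package names intact, since the last three hyphen-separated fields are always the version, release and architecture under the assumed naming scheme.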
- get_repo_archive(url: str, destination_path: Path) → Path
Given a URL and a destination path, retrieve and extract the .tar.gz archive, which contains a ‘desc’ file for each package. Each .tar.gz archive corresponds to an Arch Linux repo (‘core’, ‘extra’, ‘community’).
- Parameters:
url – URL of the .tar.gz archive to download
destination_path – the path on disk where the archive is extracted
- Returns:
The directory Path the archive was extracted to.
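The download-and-extract step described here can be sketched with the standard library alone. This is an illustrative approximation under stated assumptions, not the method's real implementation: `fetch_and_extract` is a hypothetical name, the archive is saved next to the extraction directory, and the extraction directory is named after the archive (for example ‘core’ for core.tar.gz).

```python
import shutil
import tarfile
import urllib.request
from pathlib import Path


def fetch_and_extract(url: str, destination_path: Path) -> Path:
    """Hypothetical sketch: download a .tar.gz and extract it under destination_path."""
    destination_path.mkdir(parents=True, exist_ok=True)
    archive_name = url.rsplit("/", 1)[-1]  # e.g. 'core.tar.gz'
    archive_path = destination_path / archive_name

    # Stream the response to disk instead of loading it into memory.
    with urllib.request.urlopen(url) as response, open(archive_path, "wb") as out:
        shutil.copyfileobj(response, out)

    # Assumption: extract into a directory named after the repo ('core', 'extra', ...).
    extract_to = destination_path / archive_name.split(".")[0]
    with tarfile.open(archive_path, "r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # Safer extraction on Python versions that support extraction filters.
            tar.extractall(path=extract_to, filter="data")
        else:
            tar.extractall(path=extract_to)
    return extract_to
```

Each extracted package directory then holds one ‘desc’ file, so a caller could walk the result with something like `extract_to.glob("*/desc")`.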
- parse_desc_file(path: Path, repo: str, base_url: str, dl_url_fmt: str) → Dict[str, Any]
Extract package information from a ‘desc’ file. There are subtle differences between parsing ‘official’ and ‘arm’ ‘desc’ files.
- Parameters:
path – A path to a ‘desc’ file on disk
repo – The repo the package belongs to
- Returns:
A dict of metadata
Example:
{
    'api_url': 'https://archlinux.org/packages/core/x86_64/dialog/json',
    'arch': 'x86_64',
    'base': 'dialog',
    'builddate': '1650081535',
    'csize': '203028',
    'desc': 'A tool to display dialog boxes from shell scripts',
    'filename': 'dialog-1:1.3_20220414-1-x86_64.pkg.tar.zst',
    'isize': '483988',
    'license': 'LGPL2.1',
    'md5sum': '06407c0cb11c50d7bf83d600f2e8107c',
    'name': 'dialog',
    'packager': 'Evangelos Foutras <foutrelis@archlinux.org>',
    'pgpsig': 'pgpsig content xxx',
    'project_url': 'https://invisible-island.net/dialog/',
    'provides': 'libdialog.so=15-64',
    'repo': 'core',
    'sha256sum': 'ef8c8971f591de7db0f455970ef5d81d5aced1ddf139f963f16f6730b1851fa7',
    'url': 'https://archive.archlinux.org/packages/.all/dialog-1:1.3_20220414-1-x86_64.pkg.tar.zst',  # noqa: B950
    'version': '1:1.3_20220414-1',
}
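For orientation, a ‘desc’ file in a pacman repository database is a sequence of `%KEY%` header lines, each followed by one or more value lines and a blank line. The sketch below parses that layout from a string. It is a simplified stand-in rather than the method's implementation: `parse_desc_text` and `SAMPLE_DESC` are hypothetical, values are kept as lists, and the renaming of raw keys to the keys shown in the example above (such as ‘project_url’ or ‘api_url’) is not reproduced.

```python
from typing import Dict, List, Optional

# Hypothetical, shortened 'desc' content for illustration.
SAMPLE_DESC = """\
%FILENAME%
dialog-1:1.3_20220414-1-x86_64.pkg.tar.zst

%NAME%
dialog

%VERSION%
1:1.3_20220414-1

%PROVIDES%
libdialog.so=15-64
"""


def parse_desc_text(text: str) -> Dict[str, List[str]]:
    """Map each lowercased %KEY% header to the list of value lines below it."""
    fields: Dict[str, List[str]] = {}
    current_key: Optional[str] = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if len(line) > 2 and line.startswith("%") and line.endswith("%"):
            # A header line such as '%NAME%' opens a new field.
            current_key = line[1:-1].lower()
            fields[current_key] = []
        elif line and current_key is not None:
            # Non-blank lines belong to the most recent header.
            fields[current_key].append(line)
    return fields


fields = parse_desc_text(SAMPLE_DESC)
print(fields["name"], fields["version"])
```

Keeping values as lists avoids losing information for fields that can span several lines; a caller that expects single values can take the first element.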