swh.loader.core.utils module
- swh.loader.core.utils.clean_dangling_folders(dirpath: str, pattern_check: str, log=None) → None
- Clean up potential dangling temporary working folders rooted at dirpath. Those
folders must match a dedicated pattern and must not belong to a live pid.
- Parameters:
dirpath – Path to check for dangling files
pattern_check – A dedicated pattern to check against first-level directory names (e.g. swh.loader.mercurial., swh.loader.svn.)
log (Logger) – Optional logger
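The cleanup rule described above can be illustrated with a minimal sketch. It is not the library's implementation: it assumes that each working folder name ends with `.<pid>`, and the names `pid_is_alive`, `clean_dangling_sketch` and the `is_alive` parameter are hypothetical. The liveness check relies on POSIX signal semantics.

```python
import os
import shutil
from typing import Callable, List


def pid_is_alive(pid: int) -> bool:
    """Best-effort liveness check (POSIX): signal 0 only probes the process."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def clean_dangling_sketch(
    dirpath: str,
    pattern_check: str,
    is_alive: Callable[[int], bool] = pid_is_alive,
) -> List[str]:
    """Remove first-level folders matching pattern_check whose pid is dead.

    Assumption (not stated in the documentation): the folder name ends
    with ".<pid>". Returns the names of the removed folders.
    """
    removed: List[str] = []
    if not os.path.isdir(dirpath):
        return removed
    for name in sorted(os.listdir(dirpath)):
        path = os.path.join(dirpath, name)
        if pattern_check not in name or not os.path.isdir(path):
            continue
        try:
            pid = int(name.rsplit(".", 1)[-1])
        except ValueError:
            continue  # no numeric pid suffix: leave the folder alone
        if is_alive(pid):
            continue  # still owned by a running process
        shutil.rmtree(path)
        removed.append(name)
    return removed
```

Passing the liveness check as a parameter keeps the sketch testable without depending on which pids happen to exist on the machine.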
- swh.loader.core.utils.clone_with_timeout(src: str, dest: str, clone_func: Callable[[], None], timeout: float) → None
Clone a repository with timeout.
- Parameters:
src – clone source
dest – clone destination
clone_func – callable that does the actual cloning
timeout – timeout in seconds
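One way to bound the duration of a clone is to run the callable in a child process and stop that process once the deadline passes. The sketch below shows the idea only; it is not the library's code. The name `run_with_timeout` is hypothetical, the built-in `TimeoutError` and `RuntimeError` stand in for whatever exceptions the real function raises, and `clone_func` is assumed to already know its source and destination. It uses the `fork` start method, so it is POSIX-only.

```python
import multiprocessing
from typing import Callable


def run_with_timeout(func: Callable[[], None], timeout: float) -> None:
    """Run func in a child process; stop it if it exceeds timeout seconds."""
    # "fork" avoids pickling func, which keeps the sketch short (POSIX only).
    ctx = multiprocessing.get_context("fork")
    process = ctx.Process(target=func)
    process.start()
    process.join(timeout)
    if process.is_alive():
        process.terminate()  # ask politely first (SIGTERM)
        process.join(5)
        if process.is_alive():
            process.kill()  # then force termination (SIGKILL)
            process.join()
        raise TimeoutError(f"clone did not finish within {timeout} seconds")
    if process.exitcode != 0:
        raise RuntimeError(f"clone failed with exit code {process.exitcode}")
```

A separate process is used rather than a thread because a thread cannot be interrupted from outside once the clone hangs.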
- swh.loader.core.utils.parse_visit_date(visit_date: datetime | str | None) → datetime | None
Convert visit date from either None, a string or a datetime to either None or datetime.
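The conversion can be pictured as a three-way dispatch on the input type. The following is a sketch under assumptions, not the library's implementation: it only accepts ISO 8601 strings (via `datetime.fromisoformat`) and treats naive timestamps as UTC, neither of which is confirmed by the documentation above. The name `parse_visit_date_sketch` is hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional, Union


def parse_visit_date_sketch(
    visit_date: Union[datetime, str, None],
) -> Optional[datetime]:
    """Normalize None, a datetime, or an ISO 8601 string."""
    if visit_date is None:
        return None
    if isinstance(visit_date, datetime):
        return visit_date
    if isinstance(visit_date, str):
        parsed = datetime.fromisoformat(visit_date)
        # Assumption: naive timestamps are interpreted as UTC.
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed
    raise ValueError(f"invalid visit date {visit_date!r}")
```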
- swh.loader.core.utils.compute_hashes(filepath: str, hash_names: List[str] = ['sha256']) → Dict[str, str]
Compute a dict of checksums from the file at filepath.
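As an illustration of the returned shape (hash name mapped to hex digest), here is a minimal sketch built on the standard `hashlib` module. It is an assumption that this matches the library's behavior; the real function may rely on other helpers. The name `compute_hashes_sketch` and the `chunk_size` parameter are hypothetical.

```python
import hashlib
from typing import Dict, Optional, Sequence


def compute_hashes_sketch(
    filepath: str,
    hash_names: Optional[Sequence[str]] = None,
    chunk_size: int = 65536,
) -> Dict[str, str]:
    """Return {hash name: hex digest}, reading the file only once."""
    names = list(hash_names) if hash_names is not None else ["sha256"]
    hashers = {name: hashlib.new(name) for name in names}
    with open(filepath, "rb") as f:
        # Feed every hasher chunk by chunk so large files are not
        # loaded into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

The sketch uses `None` as the default for `hash_names` instead of a list literal to avoid sharing a mutable default between calls.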
- swh.loader.core.utils.compute_nar_hashes(filepath: Path, hash_names: List[str] = ['sha256'], is_tarball=True, top_level=True) → Dict[str, str]
Compute nar checksums dict out of a filepath (tarball or plain file).
If it’s a tarball, this uncompresses the tarball in a temporary directory to compute the nar hashes (and then cleans it up).
- Parameters:
filepath – The tarball (if is_tarball is True) or a filepath
hash_names – The list of checksums to compute
is_tarball – Whether filepath represents a tarball or not
top_level – Whether we want to compute the top-level directory (of the tarball) hashes. This is only useful when is_tarball is True.
- Returns:
The dict of checksums values whose keys are present in hash_names.
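A nar hash is a checksum of the NAR (Nix archive) serialization of a path rather than of the raw bytes, which is why it differs from a plain file checksum. The sketch below covers only the simplest case, a single regular file, as an illustration of the format; it is not the library's code and does not handle directories, symlinks or tarballs. The helper names are hypothetical.

```python
import hashlib
import struct


def nar_string(data: bytes) -> bytes:
    """Encode as 64-bit little-endian length, bytes, zero padding to 8 bytes."""
    padding = (8 - len(data) % 8) % 8
    return struct.pack("<Q", len(data)) + data + b"\x00" * padding


def nar_serialize_file(content: bytes, executable: bool = False) -> bytes:
    """Serialize one regular file in NAR format."""
    parts = [b"nix-archive-1", b"(", b"type", b"regular"]
    if executable:
        parts += [b"executable", b""]
    parts += [b"contents", content, b")"]
    return b"".join(nar_string(part) for part in parts)


def nar_sha256_of_file(content: bytes) -> str:
    """sha256 over the NAR serialization, not over the raw content."""
    return hashlib.sha256(nar_serialize_file(content)).hexdigest()
```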
- swh.loader.core.utils.get_url_body(url: str, **extra_params) → bytes
Basic HTTP client to retrieve information on software package, typically JSON metadata from a REST API.
- Parameters:
url (str) – An HTTP URL
- Raises:
NotFound in case of query failures (for some reasons – 404, …)
- Returns:
The associated response’s information
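The contract above (return the body as bytes, raise NotFound on failure) can be sketched with the standard library. This is an illustration only: the `NotFound` class defined here is a hypothetical stand-in for the exception named above, `extra_params` is not modelled, and the real function may use a different HTTP client.

```python
import urllib.error
import urllib.request


class NotFound(ValueError):
    """Hypothetical stand-in for the NotFound exception named above."""


def get_url_body_sketch(url: str, timeout: float = 30.0) -> bytes:
    """Fetch url and return the raw response body."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except urllib.error.URLError as exc:
        # HTTPError (404, 500, ...) is a subclass of URLError, so a single
        # handler covers both HTTP failures and connection errors.
        raise NotFound(f"Failed to query {url!r}: {exc}") from exc
```

When the endpoint returns JSON metadata, the caller would typically decode the result with `json.loads(body)`.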
- swh.loader.core.utils.download(url: str, dest: str, hashes: Dict = {}, filename: str | None = None, auth: Tuple[str, str] | None = None, extra_request_headers: Dict[str, str] | None = None) → Tuple[str, Dict]
Download a remote file from url, and compute swh hashes on it.
- Parameters:
url – Artifact uri to fetch and hash
dest – Directory to write the archive to
hashes – Dict of expected hashes (key is the hash algo) for the artifact to download (those hashes are expected to be hex strings). The supported algorithms are defined in the swh.model.hashutil.ALGORITHMS set.
auth – Optional tuple of login/password (for HTTP authentication services, e.g. deposit)
- Raises:
ValueError in case of any error when fetching/computing (length, checksums mismatched...)
- Returns:
Tuple of local (filepath, hashes of filepath)
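The download-then-verify flow can be sketched as follows. This is a simplified illustration, not the library's implementation: it ignores `auth` and `extra_request_headers`, does not check the length, always computes sha1 and sha256 (an assumption about which hashes are returned), and uses `urllib` and `hashlib` where the real function may use other helpers. The name `download_sketch` is hypothetical.

```python
import hashlib
import os
import urllib.error
import urllib.request
from typing import Dict, Optional, Tuple
from urllib.parse import urlparse


def download_sketch(
    url: str,
    dest: str,
    hashes: Optional[Dict[str, str]] = None,
    filename: Optional[str] = None,
) -> Tuple[str, Dict[str, str]]:
    """Fetch url into dest, hash it while streaming, verify expected hashes."""
    expected = dict(hashes or {})
    # Assumption: sha1 and sha256 are always computed, plus any expected algo.
    algos = set(expected) | {"sha1", "sha256"}
    name = filename or os.path.basename(urlparse(url).path)
    if not name:
        raise ValueError(f"cannot infer a filename from {url!r}")
    filepath = os.path.join(dest, name)
    hashers = {algo: hashlib.new(algo) for algo in algos}
    try:
        with urllib.request.urlopen(url) as response, open(filepath, "wb") as out:
            for chunk in iter(lambda: response.read(65536), b""):
                out.write(chunk)
                for hasher in hashers.values():
                    hasher.update(chunk)
    except urllib.error.URLError as exc:
        raise ValueError(f"failed to fetch {url!r}: {exc}") from exc
    actual = {algo: hasher.hexdigest() for algo, hasher in hashers.items()}
    for algo, want in expected.items():
        if actual[algo] != want:
            raise ValueError(
                f"{algo} mismatch for {url!r}: expected {want}, got {actual[algo]}"
            )
    return filepath, actual
```

Hashing while streaming avoids reading the artifact a second time after it is written to disk.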