swh.loader.core.utils module

swh.loader.core.utils.clean_dangling_folders(dirpath: str, pattern_check: str, log=None) -> None

Clean up potential dangling temporary working folders rooted at dirpath. Those folders must match a dedicated pattern and must not belong to a live pid.

  • dirpath – Path to check for dangling files

  • pattern_check – A dedicated pattern to check on first-level directories (e.g. swh.loader.mercurial., swh.loader.svn.)

  • log (Logger) – Optional logger
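The cleanup rule described above (match a pattern, skip folders owned by a live pid) can be sketched as follows. This is a minimal illustration and not the library's code: the folder naming convention `<pattern>-<pid>.<suffix>`, the helper `pid_is_alive`, and the POSIX-only `os.kill(pid, 0)` liveness check are all assumptions made for the example.

```python
import os
import shutil
from typing import List


def pid_is_alive(pid: int) -> bool:
    """Hypothetical helper: True if a process with this pid exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def clean_dangling_folders_sketch(dirpath: str, pattern_check: str) -> List[str]:
    """Remove first-level folders matching pattern_check whose pid is dead.

    Assumes (hypothetically) folder names shaped like '<pattern>-<pid>.<suffix>'.
    Returns the names of the folders that were removed.
    """
    removed: List[str] = []
    if not os.path.isdir(dirpath):
        return removed
    for name in sorted(os.listdir(dirpath)):
        path = os.path.join(dirpath, name)
        if not os.path.isdir(path) or pattern_check not in name or "-" not in name:
            continue
        # Extract the pid from the assumed naming convention.
        pid_text = name.rsplit("-", 1)[1].split(".", 1)[0]
        if not pid_text.isdigit():
            continue
        if pid_is_alive(int(pid_text)):
            continue  # still in use by a running process, leave it alone
        shutil.rmtree(path)
        removed.append(name)
    return removed
```

Folders that do not match the pattern, or whose name does not carry a parsable pid, are left untouched in this sketch.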

exception swh.loader.core.utils.CloneTimeout

Bases: Exception

exception swh.loader.core.utils.CloneFailure

Bases: Exception

swh.loader.core.utils.clone_with_timeout(src: str, dest: str, clone_func: Callable[[], None], timeout: float) -> None

Clone a repository with timeout.

  • src – clone source

  • dest – clone destination

  • clone_func – callable that does the actual cloning

  • timeout – timeout in seconds
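One way to put a time limit on a blocking callable such as clone_func is to run it in a child process and stop that process when the deadline passes. The sketch below illustrates that idea only. The exception classes are local stand-ins for CloneTimeout and CloneFailure, the use of the POSIX-only "fork" start method is an assumption, and mapping a non-zero exit code to CloneFailure is a guess about the intended contract.

```python
import multiprocessing
from typing import Callable


class CloneTimeout(Exception):
    """Local stand-in for swh.loader.core.utils.CloneTimeout."""


class CloneFailure(Exception):
    """Local stand-in for swh.loader.core.utils.CloneFailure."""


def clone_with_timeout_sketch(
    src: str, dest: str, clone_func: Callable[[], None], timeout: float
) -> None:
    """Run clone_func in a child process, giving up after `timeout` seconds."""
    # Assumption: a POSIX system where the "fork" start method is available.
    ctx = multiprocessing.get_context("fork")
    process = ctx.Process(target=clone_func)
    process.start()
    process.join(timeout)  # wait at most `timeout` seconds

    if process.is_alive():
        # Deadline passed: stop the child and report a timeout.
        process.terminate()
        process.join()
        raise CloneTimeout(f"clone of {src} to {dest} exceeded {timeout}s")

    if process.exitcode != 0:
        # The child ended by itself but signalled an error.
        raise CloneFailure(
            f"clone of {src} to {dest} exited with code {process.exitcode}"
        )
```

In this sketch src and dest are only used in error messages; the callable is expected to already know what to clone and where.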

swh.loader.core.utils.parse_visit_date(visit_date: datetime | str | None) -> datetime | None

Convert visit date from either None, a string or a datetime to either None or datetime.
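The three accepted input types can be handled with a simple type dispatch. The sketch below is an illustration under assumptions: it parses strings with datetime.fromisoformat, which may differ from the string formats the real function accepts, and the function name is hypothetical.

```python
from datetime import datetime
from typing import Optional, Union


def parse_visit_date_sketch(
    visit_date: Union[datetime, str, None]
) -> Optional[datetime]:
    """Normalize a visit date to either None or a datetime."""
    if visit_date is None:
        return None
    if isinstance(visit_date, datetime):
        return visit_date  # already in the target type
    if isinstance(visit_date, str):
        # Assumption: the string is in ISO 8601 format.
        return datetime.fromisoformat(visit_date)
    raise ValueError(f"invalid visit date {visit_date!r}")
```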

swh.loader.core.utils.compute_hashes(filepath: str, hash_names: List[str] = ['sha256']) -> Dict[str, str]

Compute checksums dict out of a filepath
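A checksums dict of this shape can be produced with the standard hashlib module. The sketch below is an assumption-laden illustration: it returns hex digests, reads the file in chunks, and uses a hypothetical function name. The real function may rely on other helpers.

```python
import hashlib
from typing import Dict, List, Optional


def compute_hashes_sketch(
    filepath: str, hash_names: Optional[List[str]] = None
) -> Dict[str, str]:
    """Return {algorithm name: hex digest} for the file at filepath."""
    names = hash_names if hash_names is not None else ["sha256"]
    hashers = {name: hashlib.new(name) for name in names}
    with open(filepath, "rb") as f:
        # Read in fixed-size chunks so large files are not loaded at once.
        for chunk in iter(lambda: f.read(65536), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```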

swh.loader.core.utils.compute_nar_hashes(filepath: Path, hash_names: List[str] = ['sha256'], is_tarball=True, top_level=True) -> Dict[str, str]

Compute nar checksums dict out of a filepath (tarball or plain file).

If it’s a tarball, this uncompresses the tarball in a temporary directory to compute the nar hashes (and then cleans it up).

  • filepath – The tarball (if is_tarball is True) or a filepath

  • hash_names – The list of checksums to compute

  • is_tarball – Whether filepath represents a tarball or not

  • top_level – Whether to compute the hashes of the top-level directory (of the tarball). This is only useful when is_tarball is True.


Returns: The dict of checksum values whose keys are present in hash_names.

swh.loader.core.utils.get_url_body(url: str, **extra_params) -> bytes

Basic HTTP client to retrieve information about a software package, typically JSON metadata from a REST API.


  • url (str) – An HTTP URL

Raises: NotFound if the query fails (for example, on a 404 response)

Returns: The body of the associated response
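The contract described above (return the response body as bytes, raise NotFound on a failed query) can be sketched with the standard urllib module. This is an illustration, not the library's implementation: the NotFound class is a local stand-in, and the timeout and headers parameters are hypothetical substitutes for whatever extra_params carries.

```python
import urllib.error
import urllib.request
from typing import Dict, Optional


class NotFound(ValueError):
    """Local stand-in for the NotFound exception mentioned above."""


def get_url_body_sketch(
    url: str, timeout: float = 60.0, headers: Optional[Dict[str, str]] = None
) -> bytes:
    """Fetch url and return the raw response body."""
    request = urllib.request.Request(url, headers=headers or {})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as exc:
        # urllib raises HTTPError for 4xx/5xx responses such as 404.
        raise NotFound(f"Failed to query {url!r}: HTTP {exc.code}") from exc
```

A caller fetching JSON metadata would typically decode the returned bytes with json.loads.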

swh.loader.core.utils.download(url: str, dest: str, hashes: Dict = {}, filename: str | None = None, auth: Tuple[str, str] | None = None, extra_request_headers: Dict[str, str] | None = None, timeout: int = 120) -> Tuple[str, Dict]

Download a remote file from url, and compute swh hashes on it.

  • url – Artifact uri to fetch and hash

  • dest – Directory to write the archive to

  • hashes – Dict of expected hashes (key is the hash algo) for the artifact to download (those hashes are expected to be hex strings). The supported algorithms are defined in the swh.model.hashutil.ALGORITHMS set.

  • auth – Optional tuple of login/password (for HTTP authentication services, e.g. deposit)

  • extra_request_headers – Optional dict holding extra HTTP headers to be sent with the request

  • timeout – Value in seconds so the connection does not hang indefinitely (read/connection timeout)

Raises: ValueError in case of any error when fetching or computing (length, checksums mismatched...)

Returns: Tuple of local (filepath, hashes of filepath)
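The write-then-verify part of this contract can be illustrated without any network access. The sketch below takes an iterable of byte chunks as a hypothetical stand-in for the HTTP response body, writes them under dest, hashes them on the fly, and raises ValueError when a digest differs from the expected hex string. The function name and the choice to hash only the algorithms named in hashes are assumptions.

```python
import hashlib
import os
from typing import Dict, Iterable, Tuple


def save_and_check_sketch(
    chunks: Iterable[bytes], dest: str, filename: str, hashes: Dict[str, str]
) -> Tuple[str, Dict[str, str]]:
    """Write chunks to dest/filename and verify the expected hex digests."""
    filepath = os.path.join(dest, filename)
    hashers = {algo: hashlib.new(algo) for algo in hashes}
    with open(filepath, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            # Hash while writing so the data is only traversed once.
            for hasher in hashers.values():
                hasher.update(chunk)

    actual = {algo: hasher.hexdigest() for algo, hasher in hashers.items()}
    for algo, expected_hex in hashes.items():
        if actual[algo] != expected_hex.lower():
            raise ValueError(
                f"{algo} mismatch for {filepath}: "
                f"expected {expected_hex}, got {actual[algo]}"
            )
    return filepath, actual
```

The return value mirrors the tuple shape described above: the local file path and a dict of computed hashes.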

swh.loader.core.utils.release_name(version: str, filename: str | None = None) -> str

swh.loader.core.utils.cached_method(f: Callable[[TSelf], TReturn]) -> Callable[[TSelf], TReturn]