swh.loader.core.utils module

swh.loader.core.utils.clean_dangling_folders(dirpath: str, pattern_check: str, log=None) → None
Clean up potential dangling temporary working folders rooted at dirpath. Those folders must match a dedicated pattern and must not belong to a live PID.

Parameters:
  • dirpath – Path to check for dangling folders

  • pattern_check – A dedicated pattern to check against first-level directories (e.g. swh.loader.mercurial., swh.loader.svn.)

  • log (Logger) – Optional logger
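
The cleanup rule described above (match a pattern, skip folders owned by a live PID) can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the folder naming layout `<pattern><random>-<pid>`, the helper names `pid_is_alive` and `clean_dangling_folders_sketch`, and the POSIX-only `os.kill(pid, 0)` liveness probe are all assumptions made for illustration.

```python
import os
import shutil
import subprocess
import sys
import tempfile
from typing import List


def pid_is_alive(pid: int) -> bool:
    """Hypothetical helper: probe a PID with signal 0 (POSIX only)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def clean_dangling_folders_sketch(dirpath: str, pattern_check: str) -> List[str]:
    """Remove first-level folders matching the pattern whose PID is dead.

    Assumed (not documented) naming layout: <pattern><random>-<pid>.
    Returns the names of the folders that were removed.
    """
    removed: List[str] = []
    if not os.path.isdir(dirpath):
        return removed
    for name in sorted(os.listdir(dirpath)):
        path = os.path.join(dirpath, name)
        if not os.path.isdir(path) or pattern_check not in name or "-" not in name:
            continue  # silently ignore anything that does not match
        pid_part = name.rsplit("-", 1)[1].split(".", 1)[0]
        if not pid_part.isdigit() or pid_is_alive(int(pid_part)):
            continue  # unknown layout, or the owner is still running
        shutil.rmtree(path, ignore_errors=True)
        removed.append(name)
    return removed


# Demo: one folder owned by a finished child process, one owned by this process.
with tempfile.TemporaryDirectory() as root:
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    child.wait()  # the child has exited, so its PID is no longer live
    dead_name = f"swh.loader.svn.tmpabc-{child.pid}"
    live_name = f"swh.loader.svn.tmpdef-{os.getpid()}"
    other_name = "unrelated"
    for folder in (dead_name, live_name, other_name):
        os.mkdir(os.path.join(root, folder))
    removed = clean_dangling_folders_sketch(root, "swh.loader.svn.")
    remaining = sorted(os.listdir(root))
```

In this sketch only the folder tied to the finished child process is deleted; the folder carrying the current PID and the unrelated folder are left alone.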

exception swh.loader.core.utils.CloneTimeout

Bases: Exception

exception swh.loader.core.utils.CloneFailure

Bases: Exception

swh.loader.core.utils.clone_with_timeout(src: str, dest: str, clone_func: Callable[[], None], timeout: float) → None

Clone a repository with timeout.

Parameters:
  • src – clone source

  • dest – clone destination

  • clone_func – callable that does the actual cloning

  • timeout – timeout in seconds
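
One plausible way to enforce such a timeout is to run the callable in a child process and terminate it if it overruns. The sketch below is an illustration under stated assumptions, not the library's code: the exception classes are local stand-ins for `CloneTimeout` and `CloneFailure`, the name `clone_with_timeout_sketch` is hypothetical, the mapping of a non-zero exit code to a failure is assumed, and the `fork` start method restricts it to POSIX systems.

```python
import multiprocessing
import sys
import time
from typing import Callable


class CloneTimeout(Exception):
    """Local stand-in for swh.loader.core.utils.CloneTimeout."""


class CloneFailure(Exception):
    """Local stand-in for swh.loader.core.utils.CloneFailure."""


def clone_with_timeout_sketch(
    src: str, dest: str, clone_func: Callable[[], None], timeout: float
) -> None:
    """Run clone_func in a child process and give up after `timeout` seconds.

    src and dest are only used to build error messages here; the callable
    is expected to already know what to clone and where.
    """
    # Assumption: the "fork" start method (POSIX only), so that any
    # callable, including a closure, can be used as the process target.
    ctx = multiprocessing.get_context("fork")
    process = ctx.Process(target=clone_func)
    process.start()
    process.join(timeout)

    if process.is_alive():
        # The clone overran its budget: stop it and report a timeout.
        process.terminate()
        process.join()
        raise CloneTimeout(f"clone of {src} to {dest} exceeded {timeout}s")

    if process.exitcode != 0:
        # Assumed convention: a non-zero exit code means the clone failed.
        raise CloneFailure(
            f"clone of {src} to {dest} exited with code {process.exitcode}"
        )
```

A caller would typically wrap the real clone command in a zero-argument function (for example with `functools.partial`) and handle the two exceptions separately, since a timeout and a failed clone usually call for different retry policies.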

swh.loader.core.utils.parse_visit_date(visit_date: datetime | str | None) → datetime | None

Convert visit date from either None, a string or a datetime to either None or datetime.
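
The three input cases can be sketched as follows. This is a simplified illustration, not the library's implementation: the name `parse_visit_date_sketch` is hypothetical, and string parsing is assumed to be limited to ISO 8601 input via the standard library's `datetime.fromisoformat`; the real function may accept other string formats.

```python
from datetime import datetime, timezone
from typing import Optional, Union


def parse_visit_date_sketch(
    visit_date: Union[datetime, str, None]
) -> Optional[datetime]:
    """Normalize a visit date to either None or a datetime."""
    if visit_date is None:
        return None
    if isinstance(visit_date, datetime):
        return visit_date  # already in the target type
    if isinstance(visit_date, str):
        # Assumption: only ISO 8601 strings are handled in this sketch.
        return datetime.fromisoformat(visit_date)
    raise TypeError(f"unsupported visit date type: {type(visit_date).__name__}")


already_parsed = datetime(2021, 5, 1, 12, 0, tzinfo=timezone.utc)
from_string = parse_visit_date_sketch("2021-05-01T12:00:00+00:00")
```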

swh.loader.core.utils.compute_nar_hashes(filepath: Path, hash_names: List[str] = ['sha256'], is_tarball=True, top_level=True) → Dict[str, str]

Compute nar checksums dict out of a filepath (tarball or plain file).

If it’s a tarball, this uncompresses the tarball in a temporary directory to compute the nar hashes (and then cleans it up).

Parameters:
  • filepath – The tarball (if is_tarball is True) or a filepath

  • hash_names – The list of checksums to compute

  • is_tarball – Whether filepath represents a tarball or not

  • top_level – Whether to compute the hashes of the top-level directory of the tarball. This is only useful when is_tarball is True.

Returns:

A dict of checksum values, keyed by the names given in hash_names.
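
To make the notion of a nar checksum concrete, the sketch below serializes a single regular file in the Nix archive (NAR) layout and hashes the result. It covers only the plain-file case, reads the whole file into memory, and does not handle tarballs or directories. The helper names (`nar_str`, `nar_serialize_file`, `nar_hashes_of_file`) are hypothetical, and returning hexadecimal digests is an assumption; this is not the library's implementation.

```python
import hashlib
import stat
import tempfile
from pathlib import Path
from typing import Dict, Sequence


def nar_str(data: bytes) -> bytes:
    """Encode one NAR string: 64-bit little-endian length, then the data
    zero-padded to a multiple of 8 bytes."""
    padding = (8 - len(data) % 8) % 8
    return len(data).to_bytes(8, "little") + data + b"\x00" * padding


def nar_serialize_file(contents: bytes, executable: bool = False) -> bytes:
    """Serialize a single regular file as a NAR archive."""
    parts = [
        nar_str(b"nix-archive-1"),
        nar_str(b"("),
        nar_str(b"type"),
        nar_str(b"regular"),
    ]
    if executable:
        # The executable marker is followed by an empty string.
        parts += [nar_str(b"executable"), nar_str(b"")]
    parts += [nar_str(b"contents"), nar_str(contents), nar_str(b")")]
    return b"".join(parts)


def nar_hashes_of_file(
    filepath: Path, hash_names: Sequence[str] = ("sha256",)
) -> Dict[str, str]:
    """Hash the NAR serialization of a plain file with each named algorithm."""
    executable = bool(filepath.stat().st_mode & stat.S_IXUSR)
    nar = nar_serialize_file(filepath.read_bytes(), executable)
    return {name: hashlib.new(name, nar).hexdigest() for name in hash_names}


# Demo on a small temporary file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "hello.txt"
    sample.write_bytes(b"hello")
    hashes = nar_hashes_of_file(sample, ["sha256", "sha1"])
```

Because the hash is taken over the archive serialization instead of the raw bytes, it differs from a plain `sha256` of the file and changes when the executable bit changes.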