swh.loader.mercurial.loader module

This module contains a Software Heritage (SWH) loader that ingests repository data from Mercurial version 2 bundle files.

exception swh.loader.mercurial.loader.CloneTimeoutError[source]

Bases: Exception

class swh.loader.mercurial.loader.HgBundle20Loader(url, visit_date=None, directory=None, logging_class='swh.loader.mercurial.Bundle20Loader')[source]

Bases: swh.loader.core.loader.DVCSLoader

Mercurial loader able to deal with a remote or local repository.

CONFIG_BASE_FILENAME = 'loader/mercurial'
ADDITIONAL_CONFIG = {'bundle_filename': ('str', 'HG20_none_bundle'), 'cache1_size': ('int', 838860800), 'cache2_size': ('int', 838860800), 'clone_timeout_seconds': ('int', 7200), 'reduce_effort': ('bool', False), 'temp_directory': ('str', '/tmp')}
visit_type = 'hg'
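Each ADDITIONAL_CONFIG entry shown above pairs a type name with a default value. The sketch below copies that mapping and extracts the defaults to make the units easier to read. The helper name `defaults_from` is hypothetical and is not part of the loader.

```python
# Copied from the ADDITIONAL_CONFIG class attribute documented above.
ADDITIONAL_CONFIG = {
    "bundle_filename": ("str", "HG20_none_bundle"),
    "cache1_size": ("int", 838860800),
    "cache2_size": ("int", 838860800),
    "clone_timeout_seconds": ("int", 7200),
    "reduce_effort": ("bool", False),
    "temp_directory": ("str", "/tmp"),
}


def defaults_from(config):
    """Hypothetical helper: keep only the default value of each entry."""
    return {key: default for key, (_type_name, default) in config.items()}


defaults = defaults_from(ADDITIONAL_CONFIG)

# Convert the raw numbers into friendlier units.
cache_mib = defaults["cache1_size"] // (1024 * 1024)
timeout_hours = defaults["clone_timeout_seconds"] // 3600
```

Read this way, each cache defaults to 800 MiB and the clone timeout to 2 hours.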
pre_cleanup()[source]

Clean up potential dangling files from prior runs (e.g. OOM-killed tasks).
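A minimal sketch of this kind of cleanup, assuming that leftover working directories live under the configured temporary directory and share a recognizable name prefix. The function `remove_dangling_dirs` and the prefix `hg-loader-tmp-` are hypothetical; the loader's own naming scheme is not documented on this page.

```python
import os
import shutil
import tempfile


def remove_dangling_dirs(temp_directory, prefix):
    """Delete sub-directories of temp_directory whose name starts with prefix.

    Returns the sorted names of the directories that were removed.
    """
    # Snapshot the listing first so deletion does not disturb iteration.
    with os.scandir(temp_directory) as scan:
        entries = list(scan)
    removed = []
    for entry in entries:
        if entry.is_dir(follow_symlinks=False) and entry.name.startswith(prefix):
            shutil.rmtree(entry.path, ignore_errors=True)
            removed.append(entry.name)
    return sorted(removed)


# Demonstration inside a throwaway directory, so nothing real is touched.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "hg-loader-tmp-1"))  # left over from a killed run
    os.mkdir(os.path.join(root, "unrelated"))        # must survive the cleanup
    removed = remove_dangling_dirs(root, prefix="hg-loader-tmp-")
    survivors = sorted(os.listdir(root))
```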

cleanup()[source]

Clean up the temporary working directory.

get_heads(repo)[source]

Read the closed branches' heads (branches and bookmarks) and return a dict whose keys are branch names (bytes) and whose values are tuples of (pointer nature (bytes), Mercurial node id (bytes)). These need conversion to swh-ids, which get_revisions takes care of.
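To make the shape of that mapping concrete, here is a small sketch with made-up data. The branch names, the pointer-nature values (`tip`, `closed`) and the node ids are placeholders chosen for illustration, not values the loader is known to produce.

```python
from typing import Dict, Tuple

# Hypothetical return value with the documented shape:
# branch name (bytes) -> (pointer nature (bytes), Mercurial node id (bytes)).
heads: Dict[bytes, Tuple[bytes, bytes]] = {
    b"default": (b"tip", bytes.fromhex("aa" * 20)),
    b"stable": (b"closed", bytes.fromhex("bb" * 20)),
}

for branch_name, (pointer_nature, node_id) in heads.items():
    # Mercurial node ids are SHA-1 digests, hence 20 raw bytes each.
    print(branch_name.decode(), pointer_nature.decode(), node_id.hex())
```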

prepare_origin_visit(*args, **kwargs) → None[source]

First step executed by the loader to prepare origin and visit references. Set/update self.origin, and optionally self.origin_url, self.visit_date.

static clone_with_timeout(log, origin, destination, timeout)[source]
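The signature suggests that the clone is abandoned once `timeout` seconds have elapsed, with CloneTimeoutError signalling that case. Below is a minimal sketch of that pattern, assuming the work runs as an external command through the standard library's subprocess module. It is not the loader's actual implementation: `run_with_timeout` is hypothetical, and the exception class is a local stand-in for the one documented above.

```python
import subprocess
import sys


class CloneTimeoutError(Exception):
    """Local stand-in for swh.loader.mercurial.loader.CloneTimeoutError."""


def run_with_timeout(command, timeout):
    """Run command, raising CloneTimeoutError if it exceeds timeout seconds."""
    try:
        # subprocess.run kills the child process when the timeout expires.
        subprocess.run(command, check=True, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        raise CloneTimeoutError(f"{command!r} exceeded {timeout}s") from exc


# A deliberately slow command stands in for a long-running clone.
slow = [sys.executable, "-c", "import time; time.sleep(5)"]
try:
    run_with_timeout(slow, timeout=0.5)
    timed_out = False
except CloneTimeoutError:
    timed_out = True
```

Raising a dedicated exception lets the caller tell a timeout apart from other clone failures.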
prepare(*args, **kwargs)[source]

Prepare the necessary steps to load an actual remote or local repository.

To load a local repository, set the optional directory parameter to the path of an existing local folder.

To load a remote repository, pass the optional directory parameter as None.

Parameters
  • origin_url (str) – Origin url to load

  • visit_date (str/datetime) – Date of the visit

  • directory (str/None) – The local directory to load

has_contents()[source]

Checks whether we need to load contents

has_directories()[source]

Checks whether we need to load directories

has_revisions()[source]

Checks whether we need to load revisions

has_releases()[source]

Checks whether we need to load releases

fetch_data()[source]

Fetch the data from the data source.

get_contents() → Iterable[swh.model.model.BaseContent][source]

Get the contents that need to be loaded.

load_directories()[source]

This is where the work is done to convert manifest deltas from the repository bundle into SWH directories.

get_directories() → Iterable[swh.model.model.Directory][source]

Compute directories to load

get_revisions() → Iterable[swh.model.model.Revision][source]

Compute revisions to load

get_releases() → Iterable[swh.model.model.Release][source]

Get the releases that need to be loaded.

get_snapshot() → swh.model.model.Snapshot[source]

Get the snapshot that needs to be loaded.
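The has_* and get_* methods above read as paired hooks: a check followed by a producer. The sketch below shows how a driver might consume such pairs, assuming each get_* hook is called only when the matching has_* hook returns true. `ToyLoader` and `collect_objects` are hypothetical stand-ins for the real loader and for the base-class logic, which this page does not describe.

```python
class ToyLoader:
    """Hypothetical loader exposing the same hook names with canned data."""

    def has_contents(self):
        return True

    def get_contents(self):
        yield from (b"file-a", b"file-b")

    def has_directories(self):
        return True

    def get_directories(self):
        yield "root-dir"

    def has_revisions(self):
        return True

    def get_revisions(self):
        yield "rev-0"

    def has_releases(self):
        return False  # pretend there is nothing to load for this object type

    def get_releases(self):
        yield "never-requested"


def collect_objects(loader):
    """Call each get_* hook only when the matching has_* hook allows it."""
    collected = {}
    for kind in ("contents", "directories", "revisions", "releases"):
        if getattr(loader, f"has_{kind}")():
            collected[kind] = list(getattr(loader, f"get_{kind}")())
    return collected


objects = collect_objects(ToyLoader())
```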

get_fetch_history_result()[source]

Return the data to store in fetch_history.

load_status()[source]

Detailed loading status.

Defaults to logging an eventful load.

Returns: a dictionary that is eventually passed back as the task’s result to the scheduler, allowing tuning of the task recurrence mechanism.
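A sketch of the kind of dictionary described above, assuming a single `status` key whose value is either `eventful` or `uneventful`. The key name, the `uneventful` value, and the comparison against a previous snapshot are assumptions made for illustration; this page only states that the default reports an eventful load.

```python
def sketch_load_status(current_snapshot_id, previous_snapshot_id=None):
    """Hypothetical helper: report whether a visit changed anything."""
    changed = current_snapshot_id != previous_snapshot_id
    return {"status": "eventful" if changed else "uneventful"}


# First visit: there is no previous snapshot to compare against.
first_visit = sketch_load_status(b"\x01" * 20)

# Repeat visit that produced the same snapshot as before.
repeat_visit = sketch_load_status(b"\x01" * 20, previous_snapshot_id=b"\x01" * 20)
```

A scheduler receiving such a result could, for instance, space out visits to origins that keep reporting no change.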

class swh.loader.mercurial.loader.HgArchiveBundle20Loader(url, visit_date=None, archive_path=None)[source]

Bases: swh.loader.mercurial.loader.HgBundle20Loader

Mercurial loader for repositories wrapped within archives.
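A minimal sketch of the idea, assuming the archive is unpacked into a fresh temporary directory that could then serve as the local directory to load. It relies on the standard library's shutil.unpack_archive, which may differ from what the loader actually uses, and `unpack_to_temp` is a hypothetical helper.

```python
import os
import shutil
import tempfile


def unpack_to_temp(archive_path, temp_directory=None):
    """Unpack archive_path into a new temporary directory and return its path."""
    destination = tempfile.mkdtemp(dir=temp_directory)
    shutil.unpack_archive(archive_path, extract_dir=destination)
    return destination


# Build a tiny zip archive so the sketch can run end to end.
with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "repo")
    os.makedirs(os.path.join(source, ".hg"))
    with open(os.path.join(source, ".hg", "requires"), "w") as fh:
        fh.write("store\n")
    archive = shutil.make_archive(
        os.path.join(workdir, "repo"), "zip", root_dir=source
    )

    unpacked = unpack_to_temp(archive, temp_directory=workdir)
    has_hg_dir = os.path.isdir(os.path.join(unpacked, ".hg"))
```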

prepare(*args, **kwargs)[source]

Prepare the necessary steps to load an actual remote or local repository.

To load a local repository, set the optional directory parameter to the path of an existing local folder.

To load a remote repository, pass the optional directory parameter as None.

Parameters
  • origin_url (str) – Origin url to load

  • visit_date (str/datetime) – Date of the visit

  • directory (str/None) – The local directory to load

cleanup()[source]

Clean up the temporary working directory.