swh.deposit.loader package

Submodules

swh.deposit.loader.checker module

class swh.deposit.loader.checker.DepositChecker(client=None)[source]

Bases: object

Deposit checker implementation.

Trigger the deposit’s checks through the private API.

check(deposit_check_url)[source]

swh.deposit.loader.loader module

class swh.deposit.loader.loader.DepositLoader(client=None)[source]

Bases: swh.loader.tar.loader.LegacyLocalTarLoader

Deposit loader implementation.

This is a subclass of the TarLoader. Its main goal is to first retrieve the deposit’s tarball contents, as a single archive, together with the associated metadata, and then to provide that tarball to be loaded by the TarLoader.

This will:

  • retrieve the deposit’s archive locally
  • provide the archive to be loaded by the tar loader
  • clean up the temporary location used to retrieve the archive locally
  • update the deposit’s status accordingly
CONFIG_BASE_FILENAME = 'loader/deposit'
ADDITIONAL_CONFIG = {'extraction_dir': ('str', '/tmp/swh.deposit.loader/')}
load(*, archive_url, deposit_meta_url, deposit_update_url)[source]

Loading logic for the loader to follow:

    1. Call prepare_origin_visit() to prepare the origin and visit that the loaded data will be associated with
    2. Store the actual origin_visit to storage
    3. Call prepare() to set up any state needed for the load
    4. Call get_origin() to get the origin we work with and store
    5. while True:
        a. Call fetch_data() to fetch the data to store
        b. Call store_data() to store the data
    6. Call cleanup() to clean up any state put in place by the prepare() method.
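
The call order described above can be sketched as follows. This is a minimal illustration, not the loader's actual code: SketchLoader, store_origin_visit(), the batches argument, the boolean returned by fetch_data() to end the loop, and the try/finally around cleanup() are all assumptions made for the example.

```python
class SketchLoader:
    """Hypothetical stand-in that only records the order of the calls."""

    def __init__(self, batches):
        self.batches = list(batches)  # pretend data to fetch, one item per loop turn
        self.calls = []

    def prepare_origin_visit(self, **kwargs):
        self.calls.append("prepare_origin_visit")

    def store_origin_visit(self):
        # Hypothetical name for "store the actual origin_visit to storage".
        self.calls.append("store_origin_visit")

    def prepare(self, **kwargs):
        self.calls.append("prepare")

    def get_origin(self):
        self.calls.append("get_origin")

    def fetch_data(self):
        """Fetch one batch; return True while more batches remain (assumed convention)."""
        self.calls.append("fetch_data")
        self.current = self.batches.pop(0) if self.batches else None
        return bool(self.batches)

    def store_data(self):
        self.calls.append("store_data")

    def cleanup(self):
        self.calls.append("cleanup")

    def load(self, **kwargs):
        self.prepare_origin_visit(**kwargs)
        self.store_origin_visit()
        try:
            self.prepare(**kwargs)
            self.get_origin()
            while True:
                more_data = self.fetch_data()
                self.store_data()
                if not more_data:
                    break
        finally:
            # Assumption: cleanup runs even if a step above raises.
            self.cleanup()
        return self.calls


if __name__ == "__main__":
    loader = SketchLoader(["batch-1", "batch-2"])
    for name in loader.load(archive_url="hypothetical://archive"):
        print(name)
```

Running cleanup() in a finally clause is a design choice for the sketch: it keeps the temporary state from outliving a failed load.
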
prepare_origin_visit(*, deposit_meta_url, **kwargs)[source]

Prepare the origin visit information.

Parameters:
  • origin (dict) – Dict with keys {url, type}
  • visit_date (str) – Date of the visit. Defaults to None, in which case the current time during the loading process is used.
prepare(*, archive_url, deposit_meta_url, deposit_update_url)[source]

Prepare the loading by first retrieving the deposit’s raw archive content.

store_metadata()[source]

Store the origin_metadata during the loading process.

provider_id and tool_id are resolved in the prepare() method.

post_load(success=True)[source]

Updating the deposit’s status according to its loading status.

If not successful, we update its status to ‘failed’. Otherwise, we update its status to ‘done’ and pass along its associated revision.
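
The status rule above can be sketched as a small pure function. The function name, the payload shape, and the revision_id key are hypothetical; only the 'failed' and 'done' status values come from the description.

```python
def deposit_status_update(success, revision_id=None):
    """Build a hypothetical status payload for a deposit after loading.

    A failed load only reports 'failed'; a successful load reports 'done'
    and passes along the associated revision.
    """
    if not success:
        return {"status": "failed"}
    return {"status": "done", "revision_id": revision_id}


if __name__ == "__main__":
    print(deposit_status_update(False))
    print(deposit_status_update(True, revision_id="hypothetical-revision-id"))
```
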

cleanup()[source]

Clean up temporary directory where we retrieved the tarball.
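
One way such a temporary directory could be created and removed is sketched below with the standard library. The helper names and the directory prefix are assumptions for illustration; the loader's real implementation may differ.

```python
import os
import shutil
import tempfile


def make_extraction_dir(base_dir):
    """Create a fresh temporary directory under base_dir (hypothetical helper)."""
    os.makedirs(base_dir, exist_ok=True)
    return tempfile.mkdtemp(prefix="swh.deposit.loader.", dir=base_dir)


def cleanup_extraction_dir(path):
    """Remove the temporary directory and everything retrieved into it."""
    shutil.rmtree(path, ignore_errors=True)


if __name__ == "__main__":
    # Use a throwaway base directory rather than the configured extraction_dir.
    base = tempfile.mkdtemp()
    workdir = make_extraction_dir(os.path.join(base, "extraction"))
    print(os.path.isdir(workdir))
    cleanup_extraction_dir(workdir)
    print(os.path.exists(workdir))
    shutil.rmtree(base, ignore_errors=True)
```
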

swh.deposit.loader.tasks module

Module contents