swh.loader.bzr.loader module

This module contains the Software Heritage (SWH) loader for ingesting repository data from Bazaar or Breezy.

exception swh.loader.bzr.loader.UnknownRepositoryFormat

Bases: Exception

The repository we’re trying to load uses an unknown format. It’s possible (though unlikely) that a new format has come out, so we should check before dismissing the repository as broken or unsupported.

class swh.loader.bzr.loader.BzrDirectory(data=None)

Bases: Directory

A more practical directory that creates missing parent directories when adding a path.

get(path: bytes, default: T | None = None) → Content | BzrDirectory | T | None

Return the value stored at path if path is in the directory, else default.
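The behaviour described above (parents created on insertion, dict-style lookup with a default) can be illustrated with a minimal sketch. `SketchDirectory` and its `add` method are hypothetical names invented for this example; it is a plain nested-dict stand-in, not the `swh.model` `Directory` that `BzrDirectory` actually extends.

```python
from typing import Any, Dict


class SketchDirectory:
    """Hypothetical stand-in for a directory tree keyed by path segments."""

    def __init__(self) -> None:
        self.entries: Dict[bytes, Any] = {}

    def add(self, path: bytes, value: Any) -> None:
        """Store value at path, creating any missing parent directories."""
        *parents, name = path.split(b"/")
        node = self
        for part in parents:
            child = node.entries.get(part)
            if not isinstance(child, SketchDirectory):
                # Missing (or non-directory) parent: create it on the fly.
                child = SketchDirectory()
                node.entries[part] = child
            node = child
        node.entries[name] = value

    def get(self, path: bytes, default: Any = None) -> Any:
        """Return the entry at path, or default if any segment is missing."""
        node: Any = self
        for part in path.split(b"/"):
            if not isinstance(node, SketchDirectory) or part not in node.entries:
                return default
            node = node.entries[part]
        return node


root = SketchDirectory()
root.add(b"a/b/c.txt", b"data")  # "a" and "a/b" are created implicitly
```

Looking up `b"a"` afterwards returns the implicitly created intermediate directory, while an unknown path falls back to the supplied default.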

swh.loader.bzr.loader.sort_changes(change: TreeChange) → str

Key function for sorting the changes by path.

Sorting allows us to group the folders together (for example “b”, then “a/a”, then “a/b”). Reversing this sort in the sorted() call makes files appear before their containing folder (“a/a”, then “a”) if the folder has changed. This avoids a bug where the order of operations is:

  • “a” goes from directory to file, removing all of its subtree

  • “a/a” is removed, but our structure has already forgotten it
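The effect of the reversed sort can be seen in a small sketch. `FakeChange` and `path_key` are hypothetical names for this example; the assumption that each change exposes a `(source_path, target_path)` pair is made for illustration and is not taken from this page.

```python
from typing import List, NamedTuple, Optional, Tuple


class FakeChange(NamedTuple):
    """Hypothetical stand-in for a tree change: (source_path, target_path)."""

    path: Tuple[Optional[str], Optional[str]]


def path_key(change: FakeChange) -> str:
    """Sort key: use the source path, falling back to the target path."""
    source, target = change.path
    key = source or target
    assert key, "at least one of the two paths must be set"
    return key


changes: List[FakeChange] = [
    FakeChange(("a", "a")),     # "a" goes from directory to file
    FakeChange(("a/a", None)),  # "a/a" is removed
    FakeChange(("b", "b")),
]

# Reversing the sort puts "a/a" ahead of its parent "a", so the child is
# handled while the parent directory is still known.
ordered = [path_key(c) for c in sorted(changes, key=path_key, reverse=True)]
```

Because a string that extends a prefix compares greater than the prefix itself, a reversed lexicographic sort always visits a child path before its parent.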

class swh.loader.bzr.loader.BazaarLoader(storage: StorageInterface, url: str, directory: str | None = None, visit_date: datetime | None = None, temp_directory: str = '/tmp', clone_timeout_seconds: int = 7200, check_revision: int = 0, **kwargs: Any)

Bases: BaseLoader

Loads a Bazaar repository.

visit_type: str = 'bzr'

pre_cleanup() → None

As a first step, will try and check for dangling data to clean up. This should do its best to avoid raising issues.

prepare() → None

Second step executed by the loader, which prepares the state needed for loading.

load_status() → Dict[str, str]

Detailed loading status.

Defaults to logging an eventful load.

Returns:

a dictionary that is eventually passed back as the task’s result to the scheduler, allowing tuning of the task recurrence mechanism.

cleanup() → None

Last step executed by the loader.

get_branch() → Branch

run_upgrade()

Upgrade both repository and branch to the most recent supported version to be compatible with the loader.

fetch_data() → bool

Fetch the data from the source the loader is currently loading.

Returns:

a value that is interpreted as a boolean. If True, fetch_data needs to be called again to complete loading.

store_data() → None

Store fetched data in the database.
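The step descriptions above (a first step, a second step, a fetch that may need to be repeated, and a last step) suggest a driver loop along the following lines. This is only a sketch under stated assumptions: `ToyLoader` and `run` are hypothetical, and the real orchestration lives in the `BaseLoader` parent class, which is not documented on this page.

```python
from typing import List, Optional


class ToyLoader:
    """Hypothetical loader that records the order in which steps are called."""

    def __init__(self, batches: List[str]) -> None:
        self.pending = list(batches)
        self.fetched: Optional[str] = None
        self.stored: List[str] = []
        self.calls: List[str] = []

    def pre_cleanup(self) -> None:
        self.calls.append("pre_cleanup")

    def prepare(self) -> None:
        self.calls.append("prepare")

    def fetch_data(self) -> bool:
        self.calls.append("fetch_data")
        self.fetched = self.pending.pop(0) if self.pending else None
        # True means "call fetch_data again to complete loading".
        return bool(self.pending)

    def store_data(self) -> None:
        self.calls.append("store_data")
        if self.fetched is not None:
            self.stored.append(self.fetched)

    def cleanup(self) -> None:
        self.calls.append("cleanup")


def run(loader: ToyLoader) -> None:
    """Assumed step order; not the actual BaseLoader implementation."""
    loader.pre_cleanup()
    try:
        loader.prepare()
        while True:
            more = loader.fetch_data()
            loader.store_data()
            if not more:
                break
    finally:
        # The last step runs even if fetching or storing raised.
        loader.cleanup()


loader = ToyLoader(["rev-1", "rev-2"])
run(loader)
```

Placing `cleanup` in a `finally` clause is a design choice of this sketch, made so that temporary state is released even when a step fails.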

store_revision(bzr_rev: Revision) → None

store_directories(bzr_rev: Revision) → bytes

Store a revision’s directories.

store_release(name: bytes, target: bytes) → bytes

Store a release given its name and its target.

Parameters:
  • name – name of the release.

  • target – sha1_git of the target revision.

Returns:

the sha1_git of the stored release.

store_content(bzr_rev: Revision, file_path: str, kind: str, executable: bool, size: int, symlink_target: str | None = None) → Content

property tags: Dict[bytes, bytes] | None