swh.loader.svn.loader module
Loader in charge of injecting either new or existing svn mirrors to swh-storage.
-
class
swh.loader.svn.loader.
SvnLoader
(url, origin_url=None, visit_date=None, destination_path=None, swh_revision=None, start_from_scratch=False)
Bases:
swh.loader.core.loader.BaseLoader
Swh svn loader.
The repository is either remote or local. The loader also handles updates to a previously loaded repository.
-
visit_type
: Optional[str] = 'svn'
-
swh_revision_hash_tree_at_svn_revision
(revision)
Compute and return the hash tree at a given svn revision.
- Parameters
revision (int) – the svn revision we want to check
- Returns
The hash tree directory as bytes.
-
build_swh_revision
(rev, commit, dir_id, parents)
Build the swh revision dictionary.
This adds:
the ‘synthetic’ flag, set to true
the ‘extra_headers’ containing the repository’s uuid and the svn revision number.
- Parameters
rev (dict) – the svn revision
commit (dict) – the commit metadata
dir_id (bytes) – the upper tree’s hash identifier
parents ([bytes]) – the parents’ identifiers
- Returns
The swh revision corresponding to the svn revision.
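As a rough illustration of the two additions listed above, the following sketch builds a plain dictionary rather than a real swh model object. The function name `sketch_build_revision`, the `repo_uuid` argument, the integer `rev`, and the header names `svn_repo_uuid` and `svn_revision` are all assumptions made for the example; the documentation only states that the repository uuid and the svn revision number end up in `extra_headers`.

```python
from typing import Any, Dict, List


def sketch_build_revision(
    rev: int,
    commit: Dict[str, Any],
    repo_uuid: bytes,
    dir_id: bytes,
    parents: List[bytes],
) -> Dict[str, Any]:
    """Hypothetical stand-in for build_swh_revision, returning a plain dict."""
    revision = dict(commit)  # copy so the caller's commit metadata is untouched
    revision.update(
        {
            "directory": dir_id,
            "parents": list(parents),
            # Mark the revision as derived from svn data.
            "synthetic": True,
            # Assumed header names: only their content is documented.
            "extra_headers": [
                (b"svn_repo_uuid", repo_uuid),
                (b"svn_revision", str(rev).encode()),
            ],
        }
    )
    return revision
```

Copying `commit` first keeps the caller's metadata unchanged, which may or may not match what the loader itself does.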
-
check_history_not_altered
(svnrepo, revision_start: int, swh_rev: swh.model.model.Revision) → bool
Given an svn repository, check whether the history was modified between visits.
-
start_from
(start_from_scratch: bool = False) → Tuple[int, int, Dict[int, Tuple[bytes, …]]]
Determine from where to start the loading.
- Parameters
start_from_scratch – If true, start from scratch instead of starting from the last snapshot
- Returns
tuple (revision_start, revision_end, revision_parents)
- Raises
SvnLoaderHistoryAltered – When a hash divergence has been detected (should not happen)
SvnLoaderUneventful – Nothing changed since last visit
-
process_svn_revisions
(svnrepo, revision_start, revision_end, revision_parents) → Iterator[Tuple[List[swh.model.model.Content], List[swh.model.model.SkippedContent], List[swh.model.model.Directory], swh.model.model.Revision]]
Process svn revisions from revision_start to revision_end.
At each svn revision, apply new diffs and simultaneously compute swh hashes. This yields those computed swh hashes as a tuple (contents, skipped_contents, directories, revision).
Note that every self.check_revision revisions, a supplementary check takes place to detect hash-tree divergence (related T570).
- Yields
tuple (contents, skipped_contents, directories, revision) of swh model objects carrying the computed hashes (sha1_git, sha1, etc…)
- Raises
ValueError – in case of a hash divergence detection
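The iteration and its periodic divergence check can be sketched as a small generator. This is a simplified model under stated assumptions: `compute_hash` and `expected_hash` are hypothetical callables standing in for the replayed hash tree and the reference hash tree, `check_revision` is treated as an interval counted in revisions, the range is taken to include `revision_end`, and each item is a `(rev, dir_id)` pair instead of the full tuple of model objects.

```python
from typing import Callable, Iterator, Tuple


def sketch_process_revisions(
    revision_start: int,
    revision_end: int,
    compute_hash: Callable[[int], bytes],
    expected_hash: Callable[[int], bytes],
    check_revision: int = 1000,
) -> Iterator[Tuple[int, bytes]]:
    """Yield (rev, dir_id) pairs, checking for divergence periodically."""
    for rev in range(revision_start, revision_end + 1):
        # Stand-in for "apply the diff and compute the swh hashes".
        dir_id = compute_hash(rev)
        # Supplementary check every `check_revision` revisions (0 disables it).
        if check_revision and rev % check_revision == 0:
            if dir_id != expected_hash(rev):
                raise ValueError(f"Hash tree divergence detected at revision {rev}")
        yield rev, dir_id
```

Because this is a generator, a divergence only surfaces when the caller reaches the offending revision, so earlier revisions have already been yielded by then.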
-
prepare_origin_visit
(*args, **kwargs)
First step executed by the loader to prepare origin and visit references. Set/update self.origin, and optionally self.origin_url, self.visit_date.
-
prepare
(*args, **kwargs)
Second step executed by the loader to prepare some state needed by the loader.
-
fetch_data
()
Fetch svn revision information.
This applies each svn revision as a patch on disk and, at the same time, computes the swh hashes.
In effect, fetch_data fetches the data and computes the necessary swh objects, which are then stored in the internal state instance variables (initialized in _prepare_state).
It is up to store_data to actually communicate with the storage and store those objects.
- Returns
True to continue fetching data (next svn revision), False to stop.
- Return type
bool
-
store_data
()
Store the data accumulated in the internal instance variables. If the iteration over the svn revisions is done, create the snapshot and flush the data to storage.
This also resets the internal instance variable state.
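The division of labour between fetch_data and store_data can be sketched with a toy class. Everything here is hypothetical: `SketchLoader`, its attributes, and the `run` driver are illustrative names, integers stand in for swh objects, and the driver loop is an assumption about how a base loader might alternate the two calls.

```python
from typing import Iterator, List


class SketchLoader:
    """Hypothetical loader mirroring the fetch_data / store_data split."""

    def __init__(self, revisions: Iterator[int]) -> None:
        self._revisions = revisions
        self._pending: List[int] = []  # internal state filled by fetch_data
        self._done = False
        self.stored: List[int] = []  # stand-in for the storage backend
        self.snapshot_created = False

    def fetch_data(self) -> bool:
        """Fetch one revision; return False once nothing is left."""
        try:
            self._pending.append(next(self._revisions))
        except StopIteration:
            self._done = True
        return not self._done

    def store_data(self) -> None:
        """Flush accumulated objects, then reset the internal state."""
        self.stored.extend(self._pending)
        if self._done:
            self.snapshot_created = True  # the snapshot comes last
        self._pending = []


def run(loader: SketchLoader) -> None:
    """Assumed driver: alternate fetch and store until fetch reports the end."""
    while True:
        more_data = loader.fetch_data()
        loader.store_data()
        if not more_data:
            break
```

Calling `run(SketchLoader(iter([1, 2, 3])))` stores the three items one at a time and only marks the snapshot as created on the final pass.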
-
generate_and_load_snapshot
(revision: Optional[swh.model.model.Revision] = None, snapshot: Optional[swh.model.model.Snapshot] = None) → swh.model.model.Snapshot
Create the snapshot from either an existing revision or an existing snapshot.
The revision (presumably new) takes priority over the snapshot (presumably an existing one).
- Parameters
revision (dict) – Last revision seen if any (None by default)
snapshot (dict) – Snapshot to use if any (None by default)
- Returns
The newly created snapshot
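The priority rule can be sketched with plain dictionaries in place of the swh model classes. The function name `sketch_snapshot`, the dictionary layout, and the single `HEAD` branch are assumptions made for the illustration, not details confirmed by this page.

```python
from typing import Any, Dict, Optional


def sketch_snapshot(
    revision_id: Optional[bytes] = None,
    snapshot: Optional[Dict[str, Any]] = None,
) -> Optional[Dict[str, Any]]:
    """Prefer a snapshot built from the revision; else reuse the given one."""
    if revision_id is not None:
        # Assumed layout: one branch pointing at the last revision seen.
        return {
            "branches": {
                b"HEAD": {"target": revision_id, "target_type": "revision"}
            }
        }
    # No new revision: fall back to the existing snapshot, if any.
    return snapshot
```

When both arguments are provided, the existing snapshot is ignored, which reflects the stated priority of the revision.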
-
-
class
swh.loader.svn.loader.
SvnLoaderFromDumpArchive
(url, archive_path, origin_url=None, destination_path=None, swh_revision=None, start_from_scratch=None, visit_date=None)
Bases:
swh.loader.svn.loader.SvnLoader
Uncompress an archive containing an svn dump, mount the svn dump as an svn repository and load said repository.
-
prepare
(*args, **kwargs)
Second step executed by the loader to prepare some state needed by the loader.
-
visit_date
: Optional[datetime.datetime]
-
origin
: Optional[Origin]
-
origin_metadata
: Dict[str, Any]
-
loaded_snapshot_id
: Optional[Sha1Git]
-
-
class
swh.loader.svn.loader.
SvnLoaderFromRemoteDump
(url, origin_url=None, destination_path=None, swh_revision=None, start_from_scratch=False, visit_date=None)
Bases:
swh.loader.svn.loader.SvnLoader
Create a subversion repository dump using the svnrdump utility, mount it locally and load the repository from it.
-
get_last_loaded_svn_rev
(svn_url: str) → int
Check whether the svn repository has already been visited and return the last loaded svn revision number, or -1 otherwise.
-
dump_svn_revisions
(svn_url, last_loaded_svn_rev=-1)
Generate a subversion dump file using the svnrdump tool. If the svnrdump command fails, the produced dump file is analyzed to determine whether a partial loading is still feasible.
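For illustration, here is one way the svnrdump command line could be assembled from the two arguments. The helper name `sketch_svnrdump_command` and the choice to resume one revision after the last loaded one are assumptions; the sketch only builds the argument list and does not run the tool or analyze a partial dump.

```python
from typing import List


def sketch_svnrdump_command(svn_url: str, last_loaded_svn_rev: int = -1) -> List[str]:
    """Build an svnrdump invocation covering the revisions not yet loaded."""
    # Assumption: resume right after the last loaded revision; with the
    # default of -1 (never visited), this starts at revision 0.
    start = last_loaded_svn_rev + 1
    return ["svnrdump", "dump", "-r", f"{start}:HEAD", svn_url]
```

The resulting list could be handed to `subprocess.run` with its standard output redirected to the dump file; passing a list rather than a shell string avoids quoting issues with unusual repository URLs.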
-
prepare
(*args, **kwargs)
Second step executed by the loader to prepare some state needed by the loader.
-
visit_date
: Optional[datetime.datetime]
-
origin
: Optional[Origin]
-
origin_metadata
: Dict[str, Any]
-
loaded_snapshot_id
: Optional[Sha1Git]
-