swh.loader.git.from_disk module

class swh.loader.git.from_disk.GitLoaderFromDisk(storage: swh.storage.interface.StorageInterface, url: str, visit_date: Optional[datetime.datetime] = None, directory: Optional[str] = None, save_data_path: Optional[str] = None, max_content_size: Optional[int] = None)[source]

Bases: swh.loader.core.loader.DVCSLoader

Load a git repository from a directory.

visit_type: Optional[str] = 'git'
visit_date: Optional[datetime.datetime]
prepare_origin_visit()[source]

First step executed by the loader to prepare origin and visit references. Sets or updates self.origin and, optionally, self.origin_url and self.visit_date.

prepare()[source]
Second step executed by the loader to prepare some state needed by the loader.

Raises

NotFound exception if the origin to ingest is not found.

iter_objects()[source]
get_object(oid)[source]
Given an object id, return the object if it is found and not malformed in some way.

Parameters

oid (bytes) – the object’s identifier

Returns

The object if found without malformation
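
As an illustration of the "found and not malformed" contract, here is a minimal sketch that does not use the loader's own code. It assumes a hypothetical in-memory store keyed by object id, and it treats "malformed" as "the stored bytes do not hash back to the requested id", which is only one possible notion of malformation. The names ObjectStore, git_object_id and the standalone get_object function are invented for this example.

```python
import hashlib
from typing import Dict, Optional, Tuple

# Hypothetical in-memory object store: oid -> (type name, raw payload).
ObjectStore = Dict[bytes, Tuple[bytes, bytes]]


def git_object_id(obj_type: bytes, payload: bytes) -> bytes:
    """Compute the SHA-1 identifier git assigns to an object."""
    # Git hashes "<type> <size>\0" followed by the payload.
    header = obj_type + b" " + str(len(payload)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + payload).digest()


def get_object(store: ObjectStore, oid: bytes) -> Optional[Tuple[bytes, bytes]]:
    """Return the object for ``oid``, or None if it is missing or malformed."""
    entry = store.get(oid)
    if entry is None:
        return None  # not found
    obj_type, payload = entry
    if git_object_id(obj_type, payload) != oid:
        return None  # assumption: a hash mismatch counts as malformed
    return entry
```

Returning None for both failure modes lets a caller skip bad objects without aborting the whole load.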

fetch_data()[source]

Fetch the data from the data source

has_contents()[source]

Checks whether we need to load contents

get_content_ids()[source]

Get the content identifiers from the git repository

get_contents()[source]

Get the contents that need to be loaded

has_directories()[source]

Checks whether we need to load directories

get_directory_ids()[source]

Get the directory identifiers from the git repository

get_directories()[source]

Get the directories that need to be loaded

has_revisions()[source]

Checks whether we need to load revisions

get_revision_ids()[source]

Get the revision identifiers from the git repository

get_revisions()[source]

Get the revisions that need to be loaded

has_releases()[source]

Checks whether we need to load releases

get_release_ids()[source]

Get the release identifiers from the git repository

get_releases()[source]

Get the releases that need to be loaded
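
The has_*/get_*_ids/get_* triplets above share one shape: list the identifiers present in the repository, then keep only those that still need loading. The sketch below shows that shape under stated assumptions. FakeStorage and its missing method are hypothetical stand-ins, not the StorageInterface API, and the helper names are invented for this example.

```python
from typing import Iterable, Iterator, List, Set


class FakeStorage:
    """Hypothetical stand-in for an archive storage backend."""

    def __init__(self, known: Iterable[bytes]) -> None:
        self._known: Set[bytes] = set(known)

    def missing(self, ids: Iterable[bytes]) -> Iterator[bytes]:
        """Yield the ids that are not stored yet, preserving input order."""
        for oid in ids:
            if oid not in self._known:
                yield oid


def has_objects(repo_ids: List[bytes]) -> bool:
    """Mirror of has_contents() and friends: is there anything to consider?"""
    return bool(repo_ids)


def get_objects_to_load(storage: FakeStorage, repo_ids: List[bytes]) -> List[bytes]:
    """Mirror of get_contents() and friends: keep only the ids not yet stored."""
    return list(storage.missing(repo_ids))
```

Filtering against storage before loading avoids re-sending objects that an earlier visit already archived.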

get_snapshot()[source]

Turn the list of branches into a snapshot to load
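
A rough sketch of what turning branches into a snapshot can look like, using plain dictionaries instead of the model classes. The dictionary shape, the function name branches_to_snapshot, and the simplification that every reference points at a revision are all assumptions made for this example.

```python
from typing import Dict, Optional


def branches_to_snapshot(
    refs: Dict[bytes, bytes], head_target: Optional[bytes] = None
) -> Dict[str, Dict[bytes, dict]]:
    """Build a snapshot-like mapping from reference names to their targets."""
    branches: Dict[bytes, dict] = {}
    for name, target in sorted(refs.items()):
        # Simplification: treat every reference as pointing at a revision.
        branches[name] = {"target": target, "target_type": "revision"}
    if head_target is not None:
        # HEAD is recorded as an alias to another branch name.
        branches[b"HEAD"] = {"target": head_target, "target_type": "alias"}
    return {"branches": branches}
```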

save_data()[source]

We already have the data locally, no need to save it

load_status()[source]

The load was eventful if the current occurrences differ from the ones retrieved at the beginning of the run.
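
The eventful/uneventful decision amounts to comparing the state seen before the run with the state after it. A minimal sketch follows, assuming the state is summarized by a snapshot identifier and that the result is reported as a small dictionary. Both choices are assumptions for illustration, not a statement of the loader's return type.

```python
from typing import Dict, Optional


def load_status(
    previous_snapshot_id: Optional[bytes], current_snapshot_id: Optional[bytes]
) -> Dict[str, str]:
    """Report whether the visit changed anything compared to the previous state."""
    eventful = previous_snapshot_id != current_snapshot_id
    return {"status": "eventful" if eventful else "uneventful"}
```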

origin: Optional[swh.model.model.Origin]
origin_metadata: Dict[str, Any]
loaded_snapshot_id: Optional[bytes]
class swh.loader.git.from_disk.GitLoaderFromArchive(*args, archive_path, **kwargs)[source]

Bases: swh.loader.git.from_disk.GitLoaderFromDisk

Load a git repository from an archive.

This loader ingests a git repository compressed into an archive. The supported archive formats are .zip and .tar.gz.

For an input archive named my-git-repo.zip, the following layout is expected:

my-git-repo/
├── .git
│   ├── branches
│   ├── COMMIT_EDITMSG
│   ├── config
│   ├── description
│   ├── HEAD
...

The loader can also ingest archives with the following layouts:

.
├── .git
│   ├── branches
│   ├── COMMIT_EDITMSG
│   ├── config
│   ├── description
│   ├── HEAD
...

or:

other-repo-name/
├── .git
│   ├── branches
│   ├── COMMIT_EDITMSG
│   ├── config
│   ├── description
│   ├── HEAD
...
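
One way to accept all three layouts is to look for .git at the top of the extracted tree first, then inside each top-level directory. The sketch below shows that idea only. The function name find_git_repo_root is hypothetical, and the loader may locate the repository differently.

```python
import os
from typing import Optional


def find_git_repo_root(extracted_dir: str) -> Optional[str]:
    """Return the directory that directly contains ``.git``, if any."""
    # Case where ``.git`` sits at the top level of the archive.
    if os.path.isdir(os.path.join(extracted_dir, ".git")):
        return extracted_dir
    # Case where ``.git`` sits inside a single top-level directory,
    # whatever that directory is called.
    for entry in sorted(os.listdir(extracted_dir)):
        candidate = os.path.join(extracted_dir, entry)
        if os.path.isdir(os.path.join(candidate, ".git")):
            return candidate
    return None
```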
project_name_from_archive(archive_path)[source]

Compute the project name from the archive’s path.
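
A minimal sketch of such a computation, assuming the project name is the archive's file name with a known extension removed. This is a stand-in for illustration: the suffix list covers only the two formats named above, and the actual method may handle more cases.

```python
import os

# Assumption: only the two archive formats named in this section.
_ARCHIVE_SUFFIXES = (".tar.gz", ".zip")


def project_name_from_archive(archive_path: str) -> str:
    """Derive a project name by stripping the directory and archive extension."""
    name = os.path.basename(archive_path)
    for suffix in _ARCHIVE_SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name  # unknown extension: keep the file name unchanged
```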

prepare()[source]
  1. Uncompress the archive in a temporary location.

  2. Prepare as GitLoaderFromDisk does.

  3. Load as GitLoaderFromDisk does.

cleanup()[source]

Clean up the temporary location (if it exists).
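
The prepare() and cleanup() pair can be pictured as a small extract-then-remove lifecycle. The sketch below uses only the Python standard library and a hypothetical ArchiveWorkspace class. It illustrates the temporary-directory handling and is not the loader's implementation.

```python
import os
import shutil
import tempfile
from typing import Optional


class ArchiveWorkspace:
    """Hypothetical helper: unpack an archive into a temporary directory."""

    def __init__(self, archive_path: str) -> None:
        self.archive_path = archive_path
        self.temp_dir: Optional[str] = None

    def prepare(self) -> str:
        """Uncompress the archive in a fresh temporary location."""
        self.temp_dir = tempfile.mkdtemp(prefix="git-archive-")
        # The format (.zip, .tar.gz, ...) is inferred from the file extension.
        shutil.unpack_archive(self.archive_path, self.temp_dir)
        return self.temp_dir

    def cleanup(self) -> None:
        """Remove the temporary location, if it exists."""
        if self.temp_dir and os.path.exists(self.temp_dir):
            shutil.rmtree(self.temp_dir)
        self.temp_dir = None
```

Making cleanup() safe to call more than once means it can run in a finally block regardless of whether prepare() succeeded.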

visit_date: Optional[datetime.datetime]
origin: Optional[swh.model.model.Origin]
origin_metadata: Dict[str, Any]
loaded_snapshot_id: Optional[bytes]