swh.loader.git.from_disk module

class swh.loader.git.from_disk.GitLoaderFromDisk(*args, **kwargs)

Bases: BaseGitLoader

Load a git repository from a directory.

visit_type: str = 'git'

prepare()

Second step executed by the loader to prepare some state needed by the loader.

Raises:

NotFound exception if the origin to ingest is not found.

iter_objects()

get_object(oid)

Given an object id, return the object if it is found and not malformed in some way.

Parameters:

oid (bytes) – the object’s identifier

Returns:

The object if found without malformation
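
To illustrate the "found and not malformed" contract, here is a minimal sketch that reads a loose object directly from a .git/objects directory using only the standard library. It is not the loader's implementation. The helper name read_loose_object, the None return value for missing or malformed objects, and the restriction to loose (unpacked) SHA-1 objects are all assumptions made for the example.

```python
import hashlib
import os
import zlib
from typing import Optional, Tuple


def read_loose_object(git_dir: str, oid: bytes) -> Optional[Tuple[str, bytes]]:
    """Return (type, payload) for a loose object, or None if missing or malformed.

    Hypothetical helper: `oid` is the raw 20-byte SHA-1 identifier.
    """
    hex_id = oid.hex()
    path = os.path.join(git_dir, "objects", hex_id[:2], hex_id[2:])
    try:
        with open(path, "rb") as f:
            raw = zlib.decompress(f.read())
    except (OSError, zlib.error):
        return None  # not found, unreadable, or not valid zlib data

    # A loose object is stored as b"<type> <size>\0<payload>".
    header, sep, payload = raw.partition(b"\0")
    if not sep:
        return None  # no header terminator
    try:
        obj_type, size = header.decode("ascii").split(" ")
        declared_size = int(size)
    except ValueError:
        return None  # header is not "<type> <size>"
    if declared_size != len(payload):
        return None  # truncated or padded payload
    if hashlib.sha1(raw).digest() != oid:
        return None  # content does not match its identifier
    return obj_type, payload
```

Returning None rather than raising keeps the caller's loop simple: objects that cannot be read are skipped instead of aborting the whole run.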

fetch_data()

Fetch the data from the data source.

has_contents()

Checks whether we need to load contents.

get_content_ids()

Get the content identifiers from the git repository.

get_contents()

Get the contents that need to be loaded.

has_directories()

Checks whether we need to load directories.

get_directory_ids()

Get the directory identifiers from the git repository.

get_directories()

Get the directories that need to be loaded.

has_revisions()

Checks whether we need to load revisions.

get_revision_ids()

Get the revision identifiers from the git repository.

get_revisions()

Get the revisions that need to be loaded.

has_releases()

Checks whether we need to load releases.

get_release_ids()

Get the release identifiers from the git repository.

get_releases()

Get the releases that need to be loaded.

get_snapshot()

Turn the list of branches into a snapshot to load.

save_data()

We already have the data locally, no need to save it.

load_status()

The load is eventful if the current occurrences differ from the ones retrieved at the beginning of the run.
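
The eventful check described above amounts to comparing two views of the repository's branches. The following is a minimal sketch of that comparison, not the loader's code. The function name load_status_sketch, the representation of occurrences as a mapping from branch name to target id, and the shape of the returned dictionary are assumptions made for illustration.

```python
from typing import Dict, Mapping


def load_status_sketch(
    occurrences_before: Mapping[bytes, bytes],
    occurrences_after: Mapping[bytes, bytes],
) -> Dict[str, str]:
    """Report "eventful" when the branch-to-target mapping changed during the run."""
    # Any added, removed, or retargeted branch makes the two mappings unequal.
    eventful = dict(occurrences_before) != dict(occurrences_after)
    return {"status": "eventful" if eventful else "uneventful"}
```

For example, a branch that moved to a new target between the two snapshots yields an eventful status, while an identical mapping yields an uneventful one.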

class swh.loader.git.from_disk.GitLoaderFromArchive(*args, **kwargs)

Bases: GitLoaderFromDisk

Load a git repository from an archive.

This loader ingests a git repository compressed into an archive. The supported archive formats are .zip and .tar.gz.

For an input archive named my-git-repo.zip, the following layout is expected inside it:

my-git-repo/
├── .git
│   ├── branches
│   ├── COMMIT_EDITMSG
│   ├── config
│   ├── description
│   ├── HEAD
...

Nevertheless, the loader can also ingest archives with the following layouts:

.
├── .git
│   ├── branches
│   ├── COMMIT_EDITMSG
│   ├── config
│   ├── description
│   ├── HEAD
...

or:

other-repo-name/
├── .git
│   ├── branches
│   ├── COMMIT_EDITMSG
│   ├── config
│   ├── description
│   ├── HEAD
...
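
The three layouts above can all be handled by probing a few candidate directories for a .git entry once the archive has been extracted. The sketch below shows one way to do that. It is an assumption about how such a lookup could work, not the loader's actual logic, and the helper name find_repo_root and the probing order are hypothetical.

```python
import os
from typing import Optional


def find_repo_root(extracted_dir: str, project_name: str) -> Optional[str]:
    """Return the directory that contains `.git`, or None if no layout matches."""
    # 1. Expected layout: <project_name>/.git
    # 2. Flat layout: ./.git
    candidates = [os.path.join(extracted_dir, project_name), extracted_dir]
    # 3. Any other top-level directory: <other-repo-name>/.git
    candidates += sorted(
        entry.path
        for entry in os.scandir(extracted_dir)
        if entry.is_dir() and entry.name != ".git"
    )
    for candidate in candidates:
        if os.path.isdir(os.path.join(candidate, ".git")):
            return candidate
    return None
```

Probing the expected layout first means an archive that follows the naming convention is resolved without scanning the extraction directory's other entries.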

project_name_from_archive(archive_path)

Compute the project name from the archive’s path.
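
As a rough illustration, the project name can be derived by taking the archive's base name and dropping a supported archive suffix. The function below is a sketch under that assumption. The name project_name_sketch and the exact suffix handling are hypothetical and may differ from what the loader does.

```python
import os

# Suffixes for the archive formats this page lists as supported.
ARCHIVE_SUFFIXES = (".tar.gz", ".zip")


def project_name_sketch(archive_path: str) -> str:
    """Strip the directory part and a known archive suffix from the path."""
    name = os.path.basename(archive_path)
    for suffix in ARCHIVE_SUFFIXES:
        if name.lower().endswith(suffix):
            return name[: -len(suffix)]
    return name  # unknown suffix: keep the base name unchanged
```

With this sketch, a path such as /tmp/my-git-repo.zip maps to my-git-repo, which matches the top-level directory name expected in the first layout.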

prepare()

  1. Uncompress the archive into a temporary location.

  2. Prepare as GitLoaderFromDisk does.

  3. Load as GitLoaderFromDisk does.

cleanup()

Clean up the temporary location (if it exists).
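
The extract-then-clean-up lifecycle described by prepare() and cleanup() can be sketched with the standard library alone. The class below is a hypothetical stand-in, not the loader itself. The name TemporaryExtraction, its attributes, and the directory prefix are assumptions, and it covers only the first step (uncompressing) plus the cleanup.

```python
import os
import shutil
import tempfile
from typing import Optional


class TemporaryExtraction:
    """Hypothetical helper mirroring the prepare()/cleanup() pair."""

    def __init__(self, archive_path: str) -> None:
        self.archive_path = archive_path
        self.temp_dir: Optional[str] = None

    def prepare(self) -> str:
        # Uncompress the archive into a fresh temporary directory.
        # shutil.unpack_archive infers the format from the file extension.
        self.temp_dir = tempfile.mkdtemp(prefix="git-archive-")
        shutil.unpack_archive(self.archive_path, self.temp_dir)
        return self.temp_dir

    def cleanup(self) -> None:
        # Remove the temporary location, if it exists.
        if self.temp_dir and os.path.exists(self.temp_dir):
            shutil.rmtree(self.temp_dir)
        self.temp_dir = None
```

Calling cleanup() in a finally block (or after a failed prepare()) keeps extracted copies from piling up on disk. Archives from untrusted sources should be inspected before extraction.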