swh.loader.package.debian.loader module

class swh.loader.package.debian.loader.DebianFileMetadata(md5sum: str, name: str, sha256: str, size: int, uri: str)[source]

Bases: object




uri – URL of this specific file
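As a rough illustration of the record this class describes, the sketch below uses a plain dataclass as a stand-in. The class name `FileMetadataSketch` is hypothetical, and the checksums, size, and URL are made-up values; only the five field names and types come from the signature above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMetadataSketch:
    """Hypothetical stand-in mirroring the documented constructor fields."""

    md5sum: str
    name: str
    sha256: str
    size: int
    uri: str


# Illustrative values only: the checksums, size and URL are invented.
dsc = FileMetadataSketch(
    md5sum="0" * 32,
    name="hello_2.10-2.dsc",
    sha256="0" * 64,
    size=1323,
    uri="http://deb.example.org/pool/main/h/hello/hello_2.10-2.dsc",
)
```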

class swh.loader.package.debian.loader.DebianPackageChangelog(person: Dict[str, str], date: str, history: List[Tuple[str, str]])[source]

Bases: object


person – A dict with the same fields as model.Person, except that the values are str instead of bytes and ‘email’ is optional.


date – Date of the changelog entry.


history – List of (package_name, version) tuples.

class swh.loader.package.debian.loader.DebianPackageInfo(url: str, filename: Optional[str], raw_info: Dict[str, Any], files: Dict[str, swh.loader.package.debian.loader.DebianFileMetadata], name: str, version: str, *, directory_extrinsic_metadata: List[swh.loader.package.loader.RawExtrinsicMetadataCore] = [])[source]

Bases: swh.loader.package.loader.BasePackageInfo


files – Metadata of the files (.deb, .dsc, …) of the package.

classmethod from_metadata(a_metadata: Dict[str, Any], url: str) → swh.loader.package.debian.loader.DebianPackageInfo[source]

class swh.loader.package.debian.loader.IntrinsicPackageMetadata(name: str, version: str, changelog: swh.loader.package.debian.loader.DebianPackageChangelog, maintainers: List[Dict[str, str]])[source]

Bases: object

Metadata extracted from a package’s .dsc file.


maintainers – A list of dicts with the same fields as model.Person, except that the values are str instead of bytes and ‘email’ is optional.

class swh.loader.package.debian.loader.DebianLoader(url: str, date: str, packages: Mapping[str, Any])[source]

Bases: swh.loader.package.loader.PackageLoader[swh.loader.package.debian.loader.DebianPackageInfo]

Load Debian origins into the swh archive.

visit_type = 'deb'

get_versions() → Sequence[str][source]

Return the keys of the packages input (e.g. stretch/contrib/0.7.2-3).
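A minimal sketch of what this amounts to, assuming the packages mapping is keyed by strings of the form shown above. The function name `get_versions_sketch` and the mapping contents are hypothetical; only the key format comes from the documentation.

```python
from typing import Any, Mapping, Sequence

# Hypothetical input: the per-version values are placeholders.
packages: Mapping[str, Any] = {
    "stretch/contrib/0.7.2-3": {"name": "example", "version": "0.7.2-3"},
    "buster/contrib/0.7.3-1": {"name": "example", "version": "0.7.3-1"},
}


def get_versions_sketch(packages: Mapping[str, Any]) -> Sequence[str]:
    """Return the keys of the packages mapping, in insertion order."""
    return list(packages.keys())


versions = get_versions_sketch(packages)
```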

get_package_info(version: str) → Iterator[Tuple[str, swh.loader.package.debian.loader.DebianPackageInfo]][source]

Given a release version of a package, retrieve the associated package information for that version.

Parameters

version – Package version

Returns

(branch name, package metadata)

resolve_revision_from(known_package_artifacts: Mapping, p_info: swh.loader.package.debian.loader.DebianPackageInfo) → Optional[bytes][source]

Resolve the revision from a snapshot and an artifact metadata dict.

If the artifact has already been downloaded, this will return the existing revision targeting that uncompressed artifact directory. Otherwise, this returns None.

Parameters

  • known_package_artifacts – Known package artifacts, resolved from the snapshot of the previous visit

  • p_info – Package information

Returns

None or revision identifier

download_package(p_info: swh.loader.package.debian.loader.DebianPackageInfo, tmpdir: str) → List[Tuple[str, Mapping]][source]

Contrary to other package loaders (1 package, 1 artifact), p_info.files represents the set of package data files to fetch:

  • <package-version>.orig.tar.gz

  • <package-version>.dsc

  • <package-version>.diff.gz

This is delegated to the download_package function.

uncompress(dl_artifacts: List[Tuple[str, Mapping[str, Any]]], dest: str) → str[source]

Uncompress the artifact(s) in the destination folder dest.

For Debian, this may need additional information from the p_info dict.

build_revision(p_info: swh.loader.package.debian.loader.DebianPackageInfo, uncompressed_path: str, directory: bytes) → Optional[swh.model.model.Revision][source]

Build the revision from the archive metadata (extrinsic artifact metadata) and the intrinsic metadata.

Parameters

  • p_info – Package information

  • uncompressed_path – Artifact uncompressed path on disk

Returns

Revision object

swh.loader.package.debian.loader.resolve_revision_from(known_package_artifacts: Mapping, p_info: swh.loader.package.debian.loader.DebianPackageInfo) → Optional[bytes][source]

Given known package artifacts (resolved from the snapshot of the previous visit) and the new artifact to fetch, try to resolve the corresponding revision.
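One plausible way to perform this kind of matching is to compare the set of files of the new artifact against the file sets recorded for known revisions. The sketch below is an assumption, not the module's implementation: the names `file_set` and `resolve_revision_sketch` are hypothetical, and the known artifacts are simplified to a mapping from revision identifier to a files dict.

```python
from typing import Any, FrozenSet, Mapping, Optional, Tuple

FileKey = Tuple[str, str, int]
Files = Mapping[str, Mapping[str, Any]]


def file_set(files: Files) -> FrozenSet[FileKey]:
    """Reduce a files dict to a comparable set of (name, sha256, size)."""
    return frozenset(
        (name, meta["sha256"], meta["size"]) for name, meta in files.items()
    )


def resolve_revision_sketch(
    known: Mapping[bytes, Files], new_files: Files
) -> Optional[bytes]:
    """Return the revision id whose recorded files match new_files, if any."""
    if not new_files:
        return None
    wanted = file_set(new_files)
    for revision_id, files in known.items():
        if file_set(files) == wanted:
            return revision_id
    return None


# Hypothetical data: one known revision with a single .dsc file.
known = {b"\x01" * 20: {"pkg_1.0-1.dsc": {"sha256": "ab" * 32, "size": 10}}}
same = {"pkg_1.0-1.dsc": {"sha256": "ab" * 32, "size": 10}}
changed = {"pkg_1.0-1.dsc": {"sha256": "cd" * 32, "size": 10}}
```

With this data, `resolve_revision_sketch(known, same)` finds the known revision, while `changed` yields None and would therefore be fetched again.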

swh.loader.package.debian.loader.uid_to_person(uid: str) → Dict[str, str][source]

Convert a uid to a person suitable for insertion.

Parameters

uid – a uid of the form “Name <email@ddress>”

Returns

a dictionary with the following keys:

  • name: the name associated with the uid

  • email: the email address associated with the uid

  • fullname: the actual uid input
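The conversion can be sketched with the standard library's `email.utils.parseaddr`. This is an illustration under assumptions, not necessarily what the module does: the function name `uid_to_person_sketch` is hypothetical, and the fallback for uids that do not parse into both a name and an address is a choice made here.

```python
from email.utils import parseaddr
from typing import Dict


def uid_to_person_sketch(uid: str) -> Dict[str, str]:
    """Split a "Name <email>" uid into name, email and fullname."""
    name, email = parseaddr(uid)
    if not (name and email):
        # Assumed fallback: keep the whole uid as the name, with no email.
        name, email = uid, ""
    return {"name": name, "email": email, "fullname": uid}


person = uid_to_person_sketch("Jane Doe <jane@example.org>")
```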

swh.loader.package.debian.loader.prepare_person(person: Mapping[str, str]) → swh.model.model.Person[source]

Prepare a person for swh serialization.

Parameters

person – A person dict

Returns

A person ready for storage
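Since the person dicts in this module hold str values while model.Person uses bytes, the core of the preparation is presumably an encoding step. The sketch below shows only that step and stops at a dict of bytes; the function name `encode_person_sketch` and the choice of UTF-8 are assumptions, and building the actual swh.model.model.Person object is left out.

```python
from typing import Dict, Mapping


def encode_person_sketch(person: Mapping[str, str]) -> Dict[str, bytes]:
    """Encode every str value of a person dict to bytes (UTF-8 assumed)."""
    return {key: value.encode("utf-8") for key, value in person.items()}


encoded = encode_person_sketch(
    {
        "name": "Jane Doe",
        "email": "jane@example.org",
        "fullname": "Jane Doe <jane@example.org>",
    }
)
```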

swh.loader.package.debian.loader.download_package(p_info: swh.loader.package.debian.loader.DebianPackageInfo, tmpdir: Any) → Mapping[str, Any][source]

Fetch a source package into a temporary directory and verify the checksums of all files.

Parameters

  • p_info – Information on a package

  • tmpdir – Where to download and extract the files to ingest

Returns

Dict of swh hashes per filename key
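The checksum step for a single downloaded file might look like the following sketch. It covers only local verification against an expected sha256 and size (both are fields of DebianFileMetadata); the download itself is omitted, and the function name `check_file_sketch` and the use of ValueError are assumptions.

```python
import hashlib
import os
import tempfile


def check_file_sketch(path: str, expected_sha256: str, expected_size: int) -> str:
    """Verify size and sha256 of a file; return the hex digest on success."""
    size = os.path.getsize(path)
    if size != expected_size:
        raise ValueError(f"{path}: size {size} != expected {expected_size}")
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    if digest.hexdigest() != expected_sha256:
        raise ValueError(f"{path}: sha256 mismatch")
    return digest.hexdigest()


# Usage on a throwaway file standing in for a downloaded artifact.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "pkg_1.0-1.dsc")
    with open(path, "wb") as f:
        f.write(b"hello")
    expected = hashlib.sha256(b"hello").hexdigest()
    verified = check_file_sketch(path, expected, 5)
```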

swh.loader.package.debian.loader.dsc_information(p_info: swh.loader.package.debian.loader.DebianPackageInfo) → Tuple[Optional[str], Optional[str]][source]

Retrieve dsc information from a package.


Parameters

p_info – Package metadata information

Returns

Tuple of dsc file’s uri, dsc’s full disk path
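Locating the .dsc entry among a package's files can be sketched as a single pass over the file names. This is an illustration under assumptions: the function name `find_dsc_sketch` is hypothetical, the files are simplified to a mapping from file name to URI, the sketch returns the file name rather than a full disk path (a caller would join it with the download directory), and rejecting a second .dsc file is a choice made here.

```python
from typing import Mapping, Optional, Tuple


def find_dsc_sketch(files: Mapping[str, str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (uri, file name) of the single .dsc file, or (None, None)."""
    dsc_uri: Optional[str] = None
    dsc_name: Optional[str] = None
    for filename, uri in files.items():
        if filename.endswith(".dsc"):
            if dsc_name is not None:
                raise ValueError("package references several .dsc files")
            dsc_uri, dsc_name = uri, filename
    return dsc_uri, dsc_name


# Hypothetical file set for one source package.
files = {
    "pkg_1.0.orig.tar.gz": "http://deb.example.org/pkg_1.0.orig.tar.gz",
    "pkg_1.0-1.dsc": "http://deb.example.org/pkg_1.0-1.dsc",
    "pkg_1.0-1.diff.gz": "http://deb.example.org/pkg_1.0-1.diff.gz",
}
```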

swh.loader.package.debian.loader.extract_package(dl_artifacts: List[Tuple[str, Mapping]], dest: str) → str[source]

Extract a Debian source package to a given directory.

Note that after extraction the target directory will be the root of the extracted package, rather than containing it.

Parameters

  • dl_artifacts – downloaded artifacts with their package information dictionaries

  • dest – directory where the package files are stored

Returns

Package extraction directory
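Debian source packages are commonly unpacked with `dpkg-source -x`, which takes the .dsc file and an output directory that becomes the root of the extracted package. The sketch below only builds such a command line; whether this module shells out in exactly this way is an assumption, as are the function name `build_extract_command_sketch`, the "extracted" subdirectory, and the choice of flags.

```python
import os
from typing import List


def build_extract_command_sketch(dsc_path: str, dest: str) -> List[str]:
    """Build a dpkg-source invocation extracting dsc_path under dest."""
    # dpkg-source expects the output directory not to exist yet, hence a
    # dedicated subdirectory (the name "extracted" is an assumption).
    destdir = os.path.join(dest, "extracted")
    # --no-copy: do not copy the original tarballs next to the extraction.
    # --no-check: skip signature and checksum checks before unpacking.
    return ["dpkg-source", "--no-copy", "--no-check", "-x", dsc_path, destdir]


cmd = build_extract_command_sketch("/tmp/work/pkg_1.0-1.dsc", "/tmp/work")
```

Running the command, for instance with `subprocess.run(cmd, check=True)`, requires dpkg-source to be installed on the host.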

swh.loader.package.debian.loader.get_intrinsic_package_metadata(p_info: swh.loader.package.debian.loader.DebianPackageInfo, dsc_path: str, extracted_path: str) → swh.loader.package.debian.loader.IntrinsicPackageMetadata[source]

Get the package metadata from the source package at dsc_path, extracted in extracted_path.

Parameters

  • p_info – the package information

  • dsc_path – path to the package’s dsc file

  • extracted_path – the path where the package got extracted


a dictionary with the following keys:

  • history: list of (package_name, package_version) tuples parsed from the package changelog

Return type