swh.loader.package.debian.loader module

exception swh.loader.package.debian.loader.DscCountError

Bases: ValueError

Raised when an unexpected number of .dsc files is seen

class swh.loader.package.debian.loader.DebianFileMetadata(name: str, sha256: str, size: int, uri: str, md5sum: str = '', sha512: str = '')

Bases: object

Method generated by attrs for class DebianFileMetadata.

uri: str

URL of this specific file
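As a rough illustration of what a generated initializer with this signature accepts, the sketch below uses a standard-library dataclass as a stand-in for the attrs class. The class name `FileMetadataSketch` is hypothetical; the field values are copied from the cicero `.dsc` entry in the DebianLoader example on this page.

```python
from dataclasses import dataclass


@dataclass
class FileMetadataSketch:
    """Hypothetical stand-in mirroring the DebianFileMetadata signature."""

    name: str
    sha256: str
    size: int
    uri: str
    md5sum: str = ""  # optional, defaults to an empty string
    sha512: str = ""  # optional, defaults to an empty string


# The sha256 is truncated in the documented example and is kept truncated here.
dsc = FileMetadataSketch(
    name="cicero_0.7.2-3.dsc",
    sha256="35b7f1048010c67adfd8d70e4961aefb...",
    size=1864,
    uri="http://d.d.o/cicero_0.7.2-3.dsc",
    md5sum="d5dac83eb9cfc9bb52a15eb618b4670a",
)
```

Only `name`, `sha256`, `size` and `uri` are required; `md5sum` and `sha512` fall back to empty strings when omitted.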

class swh.loader.package.debian.loader.DebianPackageChangelog(person: Dict[str, str], date: str, history: List[Tuple[str, str]])

Bases: object

Method generated by attrs for class DebianPackageChangelog.

person: Dict[str, str]

A dict with the same fields as model.Person, except that the values are str instead of bytes and ‘email’ is optional.

date: str

Date of the changelog entry.

history: List[Tuple[str, str]]

List of tuples (package_name, version)

class swh.loader.package.debian.loader.DebianPackageInfo(url: str, filename: Optional[str], raw_info: Dict[str, Any], files: Dict[str, swh.loader.package.debian.loader.DebianFileMetadata], name: str, version: str, *, directory_extrinsic_metadata: List[swh.loader.package.loader.RawExtrinsicMetadataCore] = [])

Bases: swh.loader.package.loader.BasePackageInfo

Method generated by attrs for class DebianPackageInfo.

files: Dict[str, swh.loader.package.debian.loader.DebianFileMetadata]

Metadata of the files (.deb, .dsc, …) of the package.

classmethod from_metadata(a_metadata: Dict[str, Any], url: str) → swh.loader.package.debian.loader.DebianPackageInfo

extid() → Optional[Tuple[str, bytes]]

Returns a unique intrinsic identifier of this package info, or None if this package info is not ‘deduplicatable’ (meaning that we will always load it, instead of checking the ExtID storage to see if we already did)
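One plausible way to derive such an identifier, sketched under stated assumptions: take the SHA-256 of the package's single `.dsc` file and raise `DscCountError` when the number of `.dsc` files is not exactly one. The `FileMeta` stand-in, the `"dsc-sha256"` label and the helper name `dsc_extid` are illustrative assumptions, not a description of the module's confirmed behaviour.

```python
from typing import Dict, NamedTuple, Tuple


class DscCountError(ValueError):
    """Stand-in for the module's exception of the same name."""


class FileMeta(NamedTuple):
    """Hypothetical, reduced stand-in for DebianFileMetadata."""

    name: str
    sha256: str  # hex-encoded digest


def dsc_extid(files: Dict[str, FileMeta]) -> Tuple[str, bytes]:
    """Return (identifier type, raw digest) for the single .dsc file."""
    dsc_files = [meta for name, meta in files.items() if name.endswith(".dsc")]
    if len(dsc_files) != 1:
        raise DscCountError(
            f"expected exactly one .dsc file, found {len(dsc_files)}"
        )
    # Assumed identifier type label; the digest is stored as raw bytes.
    return ("dsc-sha256", bytes.fromhex(dsc_files[0].sha256))
```

Keying on the `.dsc` digest would make two package infos with identical source descriptions deduplicatable, which is the property the docstring describes.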

class swh.loader.package.debian.loader.IntrinsicPackageMetadata(name: str, version: str, changelog: swh.loader.package.debian.loader.DebianPackageChangelog, maintainers: List[Dict[str, str]])

Bases: object

Metadata extracted from a package’s .dsc file.

Method generated by attrs for class IntrinsicPackageMetadata.

maintainers: List[Dict[str, str]]

A list of dicts with the same fields as model.Person, except that the values are str instead of bytes and ‘email’ is optional.

class swh.loader.package.debian.loader.DebianLoader(storage: swh.storage.interface.StorageInterface, url: str, date: str, packages: Mapping[str, Any], max_content_size: Optional[int] = None)

Bases: swh.loader.package.loader.PackageLoader[swh.loader.package.debian.loader.DebianPackageInfo]

Load Debian origins into the swh archive.

Debian Loader implementation.

  • url – Origin url (e.g. deb://Debian/packages/cicero)

  • date – Ignored

  • packages – versioned packages and associated artifacts, for example:

      'stretch/contrib/0.7.2-3': {
        'name': 'cicero',
        'version': '0.7.2-3',
        'files': {
          'cicero_0.7.2-3.diff.gz': {
            'md5sum': 'a93661b6a48db48d59ba7d26796fc9ce',
            'name': 'cicero_0.7.2-3.diff.gz',
            'sha256': 'f039c9642fe15c75bed5254315e2a29f...',
            'size': 3964,
            'uri': 'http://d.d.o/cicero_0.7.2-3.diff.gz',
          },
          'cicero_0.7.2-3.dsc': {
            'md5sum': 'd5dac83eb9cfc9bb52a15eb618b4670a',
            'name': 'cicero_0.7.2-3.dsc',
            'sha256': '35b7f1048010c67adfd8d70e4961aefb...',
            'size': 1864,
            'uri': 'http://d.d.o/cicero_0.7.2-3.dsc',
          },
          'cicero_0.7.2.orig.tar.gz': {
            'md5sum': '4353dede07c5728319ba7f5595a7230a',
            'name': 'cicero_0.7.2.orig.tar.gz',
            'sha256': '63f40f2436ea9f67b44e2d4bd669dbab...',
            'size': 96527,
            'uri': 'http://d.d.o/cicero_0.7.2.orig.tar.gz',
          },
        },
      },
      # ...

visit_type: Optional[str] = 'deb'
get_versions() → Sequence[str]

Returns the keys of the packages input (e.g. stretch/contrib/0.7.2-3, etc…)
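A minimal sketch of that behaviour, assuming `packages` is a plain mapping shaped like the DebianLoader example. The free function below is a hypothetical stand-in for the method, and the mapping is abbreviated to a single file entry with the checksum fields left out.

```python
from typing import Any, Mapping, Sequence

# Abbreviated version of the documented example input.
packages: Mapping[str, Any] = {
    "stretch/contrib/0.7.2-3": {
        "name": "cicero",
        "version": "0.7.2-3",
        "files": {
            "cicero_0.7.2-3.dsc": {
                "name": "cicero_0.7.2-3.dsc",
                "size": 1864,
                "uri": "http://d.d.o/cicero_0.7.2-3.dsc",
            },
        },
    },
}


def get_versions(packages: Mapping[str, Any]) -> Sequence[str]:
    # The "versions" are simply the mapping keys, in insertion order.
    return list(packages.keys())


versions = get_versions(packages)
```

Note that each key combines suite, component and package version, so it is not the same string as the `version` field inside the entry.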

get_package_info(version: str) → Iterator[Tuple[str, swh.loader.package.debian.loader.DebianPackageInfo]]

Given a release version of a package, retrieve the associated package information for that version.


version – Package version


(branch name, package metadata)

download_package(p_info: swh.loader.package.debian.loader.DebianPackageInfo, tmpdir: str) → List[Tuple[str, Mapping]]

Contrary to other package loaders (1 package, 1 artifact), p_info.files represents the set of data files to fetch for the package:

  • <package-version>.orig.tar.gz

  • <package-version>.dsc

  • <package-version>.diff.gz

This is delegated to the download_package function.

uncompress(dl_artifacts: List[Tuple[str, Mapping[str, Any]]], dest: str) → str

Uncompress the artifact(s) in the destination folder dest.

Optionally, this may need the p_info dict for additional information (as is the case for Debian).

build_revision(p_info: swh.loader.package.debian.loader.DebianPackageInfo, uncompressed_path: str, directory: bytes) → Optional[swh.model.model.Revision]

Build the revision from the archive metadata (extrinsic artifact metadata) and the intrinsic metadata.

  • p_info – Package information

  • uncompressed_path – Artifact uncompressed path on disk


Revision object

visit_date: datetime.datetime
swh.loader.package.debian.loader.uid_to_person(uid: str) → Dict[str, str]

Convert a uid to a person suitable for insertion.


uid – a uid of the form “Name <email@ddress>”


a dictionary with the following keys:

  • name: the name associated with the uid

  • email: the email address associated with the uid

  • fullname: the actual uid input
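The conversion described above can be sketched with the standard library's `email.utils.parseaddr`. The function name `uid_to_person_sketch` is hypothetical, and whether the module itself relies on `parseaddr` is an assumption.

```python
from email.utils import parseaddr
from typing import Dict


def uid_to_person_sketch(uid: str) -> Dict[str, str]:
    """Split a "Name <email>" uid into name, email and fullname."""
    name, email = parseaddr(uid)
    return {
        "name": name,
        "email": email,
        "fullname": uid,  # the input is kept verbatim
    }


person = uid_to_person_sketch("Jane Doe <jane@example.org>")
```

Here `person["name"]` is `"Jane Doe"` and `person["email"]` is `"jane@example.org"`, while `fullname` preserves the original string unchanged.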

swh.loader.package.debian.loader.prepare_person(person: Mapping[str, str]) → swh.model.model.Person

Prepare person for swh serialization…


person (dict) – A person


A person ready for storage

swh.loader.package.debian.loader.download_package(p_info: swh.loader.package.debian.loader.DebianPackageInfo, tmpdir: Any) → Mapping[str, Any]

Fetch a source package in a temporary directory and check the checksums for all files.

  • p_info – Information on a package

  • tmpdir – Where to download and extract the files to ingest


Dict of swh hashes per filename key
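The checksum step can be illustrated with a small, network-free sketch that verifies a file already on disk against its expected size and SHA-256. The helper names `sha256_of` and `check_file` are hypothetical, and the choice of `ValueError` is an assumption, not the module's documented error type.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_file(path: Path, expected_sha256: str, expected_size: int) -> None:
    """Raise ValueError if the file's size or SHA-256 does not match."""
    actual_size = path.stat().st_size
    if actual_size != expected_size:
        raise ValueError(f"{path.name}: size {actual_size} != {expected_size}")
    if sha256_of(path) != expected_sha256:
        raise ValueError(f"{path.name}: sha256 mismatch")
```

In such a scheme, `check_file` would be called once per entry of `p_info.files` after the corresponding download completes, using that entry's `sha256` and `size` fields.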

swh.loader.package.debian.loader.dsc_information(p_info: swh.loader.package.debian.loader.DebianPackageInfo) → Tuple[Optional[str], Optional[str]]

Retrieve dsc information from a package.


p_info – Package metadata information


Tuple of dsc file’s uri, dsc’s full disk path

swh.loader.package.debian.loader.extract_package(dl_artifacts: List[Tuple[str, Mapping]], dest: str) → str

Extract a Debian source package to a given directory.

Note that after extraction the target directory will be the root of the extracted package, rather than containing it.

  • package – package information dictionary

  • dest – directory where the package files are stored


Package extraction directory

swh.loader.package.debian.loader.get_intrinsic_package_metadata(p_info: swh.loader.package.debian.loader.DebianPackageInfo, dsc_path: str, extracted_path: str) → swh.loader.package.debian.loader.IntrinsicPackageMetadata

Get the package metadata from the source package at dsc_path, extracted in extracted_path.

  • p_info – the package information

  • dsc_path – path to the package’s dsc file

  • extracted_path – the path where the package got extracted


a dictionary with the following keys:

  • history: list of (package_name, package_version) tuples parsed from the package changelog

Return type