swh.loader.package.pypi.loader module

class swh.loader.package.pypi.loader.PyPIPackageInfo(url: str, filename: Optional[str], version: str, raw_info: Dict[str, Any], name: str, comment_text: Optional[str], sha256: str, upload_time: str, *, directory_extrinsic_metadata: List[RawExtrinsicMetadataCore] = [], checksums: Dict[str, str] = {})

Bases: BasePackageInfo

Method generated by attrs for class PyPIPackageInfo.

classmethod from_metadata(metadata: Dict[str, Any], name: str, version: str) → PyPIPackageInfo

extid() → Tuple[str, int, bytes]

Returns a unique intrinsic identifier of this package info, or None if this package info is not ‘deduplicatable’ (meaning that we will always load it, instead of checking the ExtID storage to see if we already did).
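As a rough illustration of the (type, version, value) shape that extid() returns, the sketch below derives such a tuple from an artifact's hex-encoded SHA-256 digest. The function name, the type label and the version number are assumptions made for this example; this page does not state the values the loader actually uses.

```python
import hashlib
from typing import Tuple

# Hypothetical constants for illustration; the loader's real values may differ.
EXAMPLE_EXTID_TYPE = "pypi-archive-sha256"
EXAMPLE_EXTID_VERSION = 0


def sketch_extid(sha256_hex: str) -> Tuple[str, int, bytes]:
    """Build an (extid type, extid version, raw digest) tuple from a hex digest."""
    return (EXAMPLE_EXTID_TYPE, EXAMPLE_EXTID_VERSION, bytes.fromhex(sha256_hex))


# Usage: hash some stand-in bytes and turn the hex digest into an identifier.
digest_hex = hashlib.sha256(b"example artifact bytes").hexdigest()
example_extid = sketch_extid(digest_hex)
```

Two artifacts with the same checksum would produce the same tuple, which is what makes a lookup in an identifier store possible before downloading anything.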

class swh.loader.package.pypi.loader.PyPILoader(storage: StorageInterface, url: str, **kwargs)

Bases: PackageLoader[PyPIPackageInfo]

Load a PyPI origin’s artifact releases into the swh archive.

Loader’s constructor. This raises an exception if the minimal required configuration is missing (cf. the check method).

Parameters:

  • storage – Storage instance

  • url – Origin url to load data from

visit_type: str = 'pypi'

info() → Dict

Return the project metadata information (fetched from the PyPI registry).

get_versions() → Sequence[str]

Return the list of all published package versions.

Raises:

swh.loader.exception.NotFound – error when failing to read the published package versions.

Returns:

Sequence of published versions

get_default_version() → str

Retrieve the latest release version if any.

Returns:

Latest version
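To make the two lookups above concrete, here is a minimal sketch that reads versions out of a dictionary shaped like a PyPI JSON API response, with its top-level "info" and "releases" keys. The helper names and the sample data are invented for illustration, and the sketch is not necessarily how the loader implements these methods.

```python
from typing import Any, Dict, List

# Hand-written sample shaped like a PyPI JSON API response (illustrative data).
sample_info: Dict[str, Any] = {
    "info": {"name": "example-pkg", "version": "1.1.0"},
    "releases": {"1.0.0": [], "1.1.0": []},
}


def sketch_get_versions(info: Dict[str, Any]) -> List[str]:
    """Return every published version listed under the "releases" key."""
    return list(info["releases"].keys())


def sketch_get_default_version(info: Dict[str, Any]) -> str:
    """Return the version the registry reports as current for the project."""
    return info["info"]["version"]


versions = sketch_get_versions(sample_info)
default_version = sketch_get_default_version(sample_info)
```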


For package loaders that get extrinsic metadata, returns the authority the metadata are coming from.

get_package_info(version: str) → Iterator[Tuple[str, PyPIPackageInfo]]

Given a release version of a package, retrieve the associated package information for that version.

Parameters:

version – Package version

Returns:

(branch name, package metadata)
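The (branch name, package metadata) pairs can be pictured with the sketch below, which walks the file entries that a PyPI JSON response lists for one version. It rests on several assumptions that this page does not confirm: that only source distributions are kept, that branch names look like "releases/<version>", and that the filename is appended when a version has more than one source archive. Plain dictionaries stand in for PyPIPackageInfo.

```python
from typing import Any, Dict, Iterator, List, Tuple


def sketch_package_info(
    releases: Dict[str, List[Dict[str, Any]]], version: str
) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (branch name, file metadata) for each source archive of a version."""
    # Assumption: only source distributions are of interest.
    sdists = [f for f in releases[version] if f.get("packagetype") == "sdist"]
    for meta in sdists:
        # Assumption: disambiguate with the filename when several files exist.
        if len(sdists) > 1:
            branch = f"releases/{version}/{meta['filename']}"
        else:
            branch = f"releases/{version}"
        yield branch, meta


# Illustrative data: one source archive and one wheel for the same version.
sample_releases = {
    "1.0.0": [
        {"filename": "example-pkg-1.0.0.tar.gz", "packagetype": "sdist"},
        {"filename": "example_pkg-1.0.0-py3-none-any.whl", "packagetype": "bdist_wheel"},
    ]
}
pairs = list(sketch_package_info(sample_releases, "1.0.0"))
```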

build_release(p_info: PyPIPackageInfo, uncompressed_path: str, directory: bytes) → Optional[Release]

Build the release from the archive metadata (extrinsic artifact metadata) and the intrinsic metadata.

Parameters:

  • p_info – Package information

  • uncompressed_path – Artifact uncompressed path on disk

origin: Origin

loaded_snapshot_id: Optional[bytes]

parent_origins: Optional[List[Origin]]

If the given origin is a “forge fork” (i.e. created with the “Fork” button of GitHub-like forges), build_extrinsic_origin_metadata() sets this to a list of origins it was forked from; closest parent first.

swh.loader.package.pypi.loader.pypi_api_url(url: str) → str

Compute api url from a project url

Parameters:

url (str) – PyPI instance’s url (e.g. https://pypi.org/project/requests). This deals with correctly transforming it into the project’s api url (e.g. https://pypi.org/pypi/requests/json)

Returns:

api url
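The transformation described above can be sketched with the standard library alone. The function name below is hypothetical and the code is one plausible way to obtain the documented result, not necessarily the module's own implementation.

```python
from urllib.parse import urlparse


def sketch_pypi_api_url(url: str) -> str:
    """Turn a project page url into the matching JSON API url."""
    parsed = urlparse(url)
    # The project name is the last non-empty path segment.
    project_name = parsed.path.rstrip("/").split("/")[-1]
    return f"{parsed.scheme}://{parsed.netloc}/pypi/{project_name}/json"


api_url = sketch_pypi_api_url("https://pypi.org/project/requests")
```

Stripping the trailing slash first means that https://pypi.org/project/requests/ maps to the same API url as the form without the slash.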

swh.loader.package.pypi.loader.extract_intrinsic_metadata(dir_path: str) → Dict

Given an uncompressed path holding the pkginfo file, returns a pkginfo parsed structure as a dict.

Release artifacts contain one folder at their root. For example:

    $ tar tvf zprint-0.0.6.tar.gz
    drwxr-xr-x root/root 0 2018-08-22 11:01 zprint-0.0.6/
    …

Parameters:

dir_path (str) – Path to the uncompressed directory representing a release artifact from pypi.

Returns:

the pkginfo parsed structure as a dict if any or None if none was present.
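The lookup described above (a single root folder that holds the metadata file) can be sketched as follows. This example swaps in the standard library's email header parser to read a PKG-INFO file; the function name is hypothetical, and the real function may rely on a different parser and return a richer structure.

```python
import os
from email.parser import HeaderParser
from typing import Dict, Optional


def sketch_extract_pkginfo(dir_path: str) -> Optional[Dict[str, str]]:
    """Parse the PKG-INFO file found in the single root folder, if any."""
    entries = os.listdir(dir_path)
    if len(entries) != 1:
        # The artifact is expected to contain exactly one root folder.
        return None
    pkginfo_path = os.path.join(dir_path, entries[0], "PKG-INFO")
    if not os.path.isfile(pkginfo_path):
        return None
    with open(pkginfo_path, encoding="utf-8") as f:
        message = HeaderParser().parse(f)
    # Repeated fields (for example Classifier) keep only their last value here.
    return {key: value for key, value in message.items()}
```

Calling sketch_extract_pkginfo on a directory that does not contain exactly one entry, or whose root folder has no PKG-INFO file, returns None, mirroring the "None if none was present" behaviour noted above.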

swh.loader.package.pypi.loader.author(data: Dict) → Person

Given a dict of project/release artifact information (coming from PyPI), returns an author subset.

Parameters:

data (dict) – Representing either artifact information or release information.

Returns:

swh-model dict representing a person.
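A minimal sketch of extracting an author subset is shown below. It assumes that the input carries the "author" and "author_email" keys found in PyPI metadata and that a person is represented by byte-string "fullname", "name" and "email" fields. The function name is hypothetical and the returned plain dict only approximates the swh-model representation.

```python
from typing import Any, Dict, Optional


def sketch_author(data: Dict[str, Any]) -> Dict[str, Optional[bytes]]:
    """Build a person-like dict from PyPI author fields."""
    name = data.get("author")
    email = data.get("author_email")
    if name and email:
        fullname = f"{name} <{email}>"
    else:
        # Fall back to whichever field is present, or an empty string.
        fullname = name or email or ""
    return {
        "fullname": fullname.encode("utf-8"),
        "name": name.encode("utf-8") if name else None,
        "email": email.encode("utf-8") if email else None,
    }


person = sketch_author({"author": "Jane Doe", "author_email": "jane@example.org"})
```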