swh.loader.package.deposit.loader module
-
class swh.loader.package.deposit.loader.DepositPackageInfo(url: str, filename: str, raw_info: Dict[str, Any], author_date: datetime.datetime, commit_date: datetime.datetime, client: str, id: int, collection: str, author: swh.model.model.Person, committer: swh.model.model.Person, revision_parents: Tuple[bytes, …], *, directory_extrinsic_metadata: List[swh.loader.package.loader.RawExtrinsicMetadataCore] = [])
Bases: swh.loader.package.loader.BasePackageInfo
-
author_date
codemeta:dateCreated if any, deposit completed_date otherwise
-
commit_date
codemeta:datePublished if any, deposit completed_date otherwise
-
id
Internal ID of the deposit in the deposit DB
-
collection
The collection in the deposit; see SWORD specification.
-
revision_parents
Revisions created from previous deposits, which will be used as parents of the revision created for this deposit.
-
classmethod from_metadata(metadata: Dict[str, Any], url: str, filename: str) → swh.loader.package.deposit.loader.DepositPackageInfo
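The fallback rule described for author_date and commit_date (use the codemeta date if any, the deposit completed_date otherwise) can be illustrated with a minimal sketch. The helper name pick_date, the flat dictionary layout, and the ISO 8601 string format are assumptions made for illustration; they are not taken from the loader's implementation.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def pick_date(metadata: Dict[str, Any], codemeta_key: str, completed_date: str) -> datetime:
    """Return the codemeta date if present, else the deposit completion date.

    Hypothetical helper: the real loader may read and parse these values differently.
    """
    raw: Optional[str] = metadata.get(codemeta_key)
    value = raw if raw else completed_date
    parsed = datetime.fromisoformat(value)
    # Treat naive timestamps as UTC so later comparisons stay well-defined.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


# Hypothetical deposit metadata: the client supplied dateCreated only.
metadata = {"codemeta:dateCreated": "2020-01-15T10:00:00+00:00"}
completed = "2020-02-01T12:30:00+00:00"

author_date = pick_date(metadata, "codemeta:dateCreated", completed)
commit_date = pick_date(metadata, "codemeta:datePublished", completed)
```

Here author_date comes from the client-supplied value, while commit_date falls back to the completion date because no datePublished entry exists.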
-
class swh.loader.package.deposit.loader.DepositLoader(url: str, deposit_id: str)
Bases: swh.loader.package.loader.PackageLoader[swh.loader.package.deposit.loader.DepositPackageInfo]
Load a deposited artifact into the swh archive.
-
visit_type = 'deposit'
-
get_versions() → Sequence[str]
Return the list of all published package versions.
- Returns
Sequence of published versions
-
For package loaders that get extrinsic metadata, returns the authority the metadata comes from.
-
get_metadata_fetcher() → swh.model.model.MetadataFetcher
Returns a MetadataFetcher instance representing this package loader, which is used to add provenance information to extracted extrinsic metadata, if any.
-
get_package_info(version: str) → Iterator[Tuple[str, swh.loader.package.deposit.loader.DepositPackageInfo]]
Given a release version of a package, retrieve the associated package information for that version.
- Parameters
version – Package version
- Returns
(branch name, package metadata)
-
download_package(p_info: swh.loader.package.deposit.loader.DepositPackageInfo, tmpdir: str) → List[Tuple[str, Mapping]]
Override to allow use of the dedicated deposit client
-
build_revision(p_info: swh.loader.package.deposit.loader.DepositPackageInfo, uncompressed_path: str, directory: bytes) → Optional[swh.model.model.Revision]
Build the revision from the archive metadata (extrinsic artifact metadata) and the intrinsic metadata.
- Parameters
p_info – Package information
uncompressed_path – Artifact uncompressed path on disk
- Returns
Revision object
-
get_extrinsic_origin_metadata() → List[swh.loader.package.loader.RawExtrinsicMetadataCore]
Returns metadata items, used by build_extrinsic_origin_metadata.
-
load() → Dict
Load for a specific origin the associated contents.
for each package version of the origin
1. Fetch the files for one package version. By default, this can be implemented as a simple HTTP request. Loaders with more specific requirements can override this, e.g.: the PyPI loader checks the integrity of the downloaded files; the Debian loader has to download and check several files for one package version.
2. Extract the downloaded files. By default, this would be a universal archive/tarball extraction. Loaders for specific formats can override this method (for instance, the Debian loader uses dpkg-source -x).
3. Convert the extracted directory to a set of Software Heritage objects, using swh.model.from_disk.
4. Extract the metadata from the unpacked directories. This would only be applicable for “smart” loaders like npm (parsing the package.json), PyPI (parsing the PKG-INFO file) or Debian (parsing debian/changelog and debian/control). On “minimal-metadata” sources such as the GNU archive, the lister should provide the minimal set of metadata needed to populate the revision/release objects (authors, dates) as an argument to the task.
5. Generate the revision/release objects for the given version, from the data generated at steps 3 and 4.
end for each
6. Generate and load the snapshot for the visit. Using the revisions/releases collected at step 5 and the branch information from step 0, generate a snapshot and load it into the Software Heritage archive.
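The per-version loop described for load() can be sketched as a plain function that takes each step as a callable. This is only an illustration of the control flow: load_sketch, its parameters, the "releases/<version>" branch naming, and the returned dictionary layout are all hypothetical and do not reproduce the PackageLoader implementation.

```python
from typing import Callable, Dict, Iterable, Tuple


def load_sketch(
    versions: Iterable[str],
    fetch: Callable[[str], str],
    extract: Callable[[str], str],
    to_directory: Callable[[str], bytes],
    read_metadata: Callable[[str], Dict[str, str]],
) -> Dict[str, object]:
    """Run the per-version steps, then assemble a snapshot-like mapping."""
    branches: Dict[str, Tuple[bytes, Dict[str, str]]] = {}
    for version in versions:
        archive_path = fetch(version)          # fetch the files for this version
        unpacked = extract(archive_path)       # extract the downloaded files
        directory_id = to_directory(unpacked)  # convert the directory to archive objects
        metadata = read_metadata(unpacked)     # extract metadata from the unpacked tree
        # Stand-in for the revision/release object built from the two previous results.
        branches[f"releases/{version}"] = (directory_id, metadata)
    # After the loop: build the snapshot from everything collected.
    status = "eventful" if branches else "uneventful"  # assumed status labels
    return {"status": status, "snapshot": branches}


# Stub callables so the sketch runs without network or disk access.
result = load_sketch(
    versions=["1.0", "1.1"],
    fetch=lambda version: f"/tmp/pkg-{version}.tar.gz",
    extract=lambda path: path + ".extracted",
    to_directory=lambda path: path.encode(),
    read_metadata=lambda path: {"author": "unknown"},
)
```

Passing the steps as callables mirrors the idea that specific loaders override individual steps while the overall loop stays the same.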
-
See prior fixme
-
class swh.loader.package.deposit.loader.ApiClient(url, auth: Optional[Mapping[str, str]])
Bases: object
Private deposit API client
-
do(method: str, url: str, *args, **kwargs)
Internal method to deal with requests, possibly with basic HTTP authentication.
- Parameters
method (str) – supported HTTP methods, as in get/post/put
- Returns
The request’s execution output
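As a rough illustration of what a request helper with optional basic HTTP authentication involves, the sketch below builds (but does not send) a request using only the standard library. The function name build_request, the "username" and "password" keys of the auth mapping, and the example URL are assumptions; the actual client may rely on a different HTTP library and a different auth layout.

```python
import base64
from typing import Mapping, Optional
from urllib.request import Request

SUPPORTED_METHODS = ("get", "post", "put")


def build_request(
    method: str,
    url: str,
    auth: Optional[Mapping[str, str]] = None,
    data: Optional[bytes] = None,
) -> Request:
    """Prepare an HTTP request, adding a Basic auth header when credentials are given."""
    if method.lower() not in SUPPORTED_METHODS:
        raise ValueError(f"Unsupported HTTP method: {method}")
    request = Request(url, data=data, method=method.upper())
    if auth:
        # Hypothetical key names; Basic auth encodes "user:password" in base64.
        credentials = f"{auth['username']}:{auth['password']}".encode()
        token = base64.b64encode(credentials).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request


# Placeholder URL and credentials, for illustration only.
request = build_request(
    "get",
    "https://deposit.example.org/1/private/",
    auth={"username": "user", "password": "secret"},
)
```

Sending the prepared request (for instance with urllib.request.urlopen) is left out so the sketch stays free of network access.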
-
archive_get(deposit_id: Union[int, str], tmpdir: str, filename: str) → Tuple[str, Dict]
Retrieve the deposit’s archive artifact locally
-
metadata_get(deposit_id: Union[int, str]) → Dict[str, Any]
Retrieve the deposit’s metadata artifact as JSON
-
status_update(deposit_id: Union[int, str], status: str, revision_id: Optional[str] = None, directory_id: Optional[str] = None, snapshot_id: Optional[str] = None, origin_url: Optional[str] = None)
Update the deposit’s information, including its status and the persistent identifiers that result from the loading.
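Because every identifier argument of status_update is optional, one plausible way to assemble the update is to start from the status and add only the identifiers that were actually produced. The sketch below shows that idea; build_status_payload and the payload key names (reused from the parameter names) are assumptions, not the client's documented wire format.

```python
from typing import Dict, Optional


def build_status_payload(
    status: str,
    revision_id: Optional[str] = None,
    directory_id: Optional[str] = None,
    snapshot_id: Optional[str] = None,
    origin_url: Optional[str] = None,
) -> Dict[str, str]:
    """Collect the status plus whichever identifiers are set (hypothetical layout)."""
    payload: Dict[str, str] = {"status": status}
    optional_fields = {
        "revision_id": revision_id,
        "directory_id": directory_id,
        "snapshot_id": snapshot_id,
        "origin_url": origin_url,
    }
    for key, value in optional_fields.items():
        # Skip identifiers that the loading did not produce.
        if value is not None:
            payload[key] = value
    return payload


# A failed load reports only the status; a successful one adds identifiers.
failed = build_status_payload("failed")
done = build_status_payload("done", snapshot_id="abc123", origin_url="https://example.org/project")
```

The status values "failed" and "done" and the identifier strings are placeholders chosen for the example.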
-