swh.loader.metadata.base module

Base module for all metadata fetchers, which are called by the Git loader to get metadata from forges on origins being loaded.

exception swh.loader.metadata.base.InvalidOrigin

Bases: Exception

class swh.loader.metadata.base.BaseMetadataFetcher(origin: Origin, credentials: Dict[str, Dict[str, List[Dict[str, str]]]] | None, lister_name: str, lister_instance_name: str)

Bases: object

The base class for Software Heritage metadata fetchers.

Fetchers are hooks used by loaders to retrieve extrinsic metadata from forges before archiving repositories.

Each fetcher handles a specific type of forge (not VCS); each fetcher class generally matches a lister class, as they use the same APIs.

Parameters:
  • origin – the origin to retrieve metadata from

  • credentials – uses the same format as swh.lister.pattern.Lister: a dictionary of credentials for all fetchers. The first level identifies the fetcher’s name, the second level the lister instance. The final level is a list of dicts containing the expected credentials for the given instance of that fetcher.

  • session – optional HTTP session to use to send HTTP requests
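The nested credentials layout described above can be sketched as follows. This is an illustration only: the fetcher name, the instance name, the keys inside each credential dict, and the helper `credentials_for` are assumptions, not part of the documented API.

```python
from typing import Dict, List, Optional

# Type alias matching the `credentials` annotation in the constructor signature.
CredentialsDict = Dict[str, Dict[str, List[Dict[str, str]]]]

# Hypothetical values: the names and the keys of each credential dict
# are assumptions made for illustration.
credentials: CredentialsDict = {
    "exampleforge": {                  # first level: fetcher name
        "forge.example": [             # second level: lister instance
            {"username": "alice", "password": "token-1"},
            {"username": "bob", "password": "token-2"},
        ],
    },
}


def credentials_for(
    all_credentials: Optional[CredentialsDict],
    fetcher_name: str,
    instance_name: str,
) -> List[Dict[str, str]]:
    """Return the credential dicts for one fetcher and instance, or [] if none.

    Hypothetical helper: it only demonstrates how the three levels nest.
    """
    if all_credentials is None:
        return []
    return all_credentials.get(fetcher_name, {}).get(instance_name, [])


selected = credentials_for(credentials, "exampleforge", "forge.example")
print(len(selected))
```

Accepting `None` mirrors the `| None` in the constructor signature; how the real class treats missing credentials is not stated here.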

FETCHER_NAME: str

The config-friendly name of this fetcher, used to retrieve the first level of credentials.

SUPPORTED_LISTERS: Set[str]

Set of forge types this metadata fetcher supports. The type names are the same as the names used by listers themselves.

Generally, fetchers have a one-to-one matching with listers, in which case this is set to {FETCHER_NAME}.
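A minimal sketch of the one-to-one case, using a stand-in class rather than the real BaseMetadataFetcher so that it runs without Software Heritage installed. The names `FetcherSketch`, `ExampleForgeFetcher`, `"exampleforge"`, and `supports` are hypothetical.

```python
from typing import Set, Type


class FetcherSketch:
    """Stand-in for BaseMetadataFetcher; NOT the real class."""

    FETCHER_NAME: str
    SUPPORTED_LISTERS: Set[str]


class ExampleForgeFetcher(FetcherSketch):
    # Hypothetical forge type; a real fetcher would reuse its lister's name.
    FETCHER_NAME = "exampleforge"
    # One-to-one matching with a lister: the set holds only the fetcher's own name.
    SUPPORTED_LISTERS = {FETCHER_NAME}


def supports(fetcher_cls: Type[FetcherSketch], lister_name: str) -> bool:
    """Hypothetical check: does this fetcher handle origins from this lister?"""
    return lister_name in fetcher_cls.SUPPORTED_LISTERS


print(supports(ExampleForgeFetcher, "exampleforge"))
print(supports(ExampleForgeFetcher, "otherforge"))
```

A fetcher whose forge API is shared by several listers would list each of their names in the set instead.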

session() → Session

metadata_authority() → MetadataAuthority

Return information about the metadata authority that issued the metadata we extract from the given origin.

get_origin_metadata() → List[RawExtrinsicMetadata]

Return a list of metadata objects for the given origin.

get_parent_origins() → List[Origin]

If the given origin is a “forge fork” (i.e., created with the “Fork” button of GitHub-like forges), returns a list of origins it was forked from, closest parent first.
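The "closest parent first" ordering can be illustrated with a small self-contained sketch. It assumes the list covers the whole chain of ancestors and uses a hypothetical in-memory mapping, `FORKED_FROM`, in place of a forge API. Plain URL strings stand in for Origin objects, and `parent_origins` is not the library's implementation.

```python
from typing import Dict, List

# Hypothetical fork relationships: each key was forked from its value.
FORKED_FROM: Dict[str, str] = {
    "https://forge.example/carol/project": "https://forge.example/bob/project",
    "https://forge.example/bob/project": "https://forge.example/alice/project",
}


def parent_origins(url: str, forked_from: Dict[str, str]) -> List[str]:
    """Walk up the fork chain and return ancestors, closest parent first."""
    parents: List[str] = []
    seen = {url}
    current = url
    while current in forked_from:
        current = forked_from[current]
        if current in seen:
            # Guard against malformed data that would loop forever.
            break
        seen.add(current)
        parents.append(current)
    return parents


for parent in parent_origins("https://forge.example/carol/project", FORKED_FROM):
    print(parent)
```

An origin that is not a fork has no entry in the mapping, so the function returns an empty list for it.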