swh.storage.algos.origin module

swh.storage.algos.origin.iter_origins(storage: StorageInterface, limit: int = 10000) → Iterator[Origin]

Iterates over origins in the storage.

Parameters:
  • storage – the storage object used for queries.

  • limit – maximum number of origins per page

Yields:

Origin model objects from the storage, fetched in pages of at most limit origins.
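The page-by-page behaviour described above can be illustrated with a minimal sketch. It does not use the real storage backend: `Page`, `FakeStorage` and `iter_origins_sketch` are hypothetical stand-ins, and the assumption that the storage exposes a token-based `origin_list(page_token, limit)` call is made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Page:
    """Hypothetical stand-in for one page of results."""

    results: List[str]
    next_page_token: Optional[str]


class FakeStorage:
    """Hypothetical in-memory storage holding origin URLs."""

    def __init__(self, urls: List[str]) -> None:
        self.urls = urls
        self.calls = 0  # number of pages requested so far

    def origin_list(self, page_token: Optional[str] = None, limit: int = 100) -> Page:
        # Assumed pagination contract: the token is an opaque string,
        # and None means "no more pages".
        self.calls += 1
        start = int(page_token) if page_token is not None else 0
        end = start + limit
        token = str(end) if end < len(self.urls) else None
        return Page(self.urls[start:end], token)


def iter_origins_sketch(storage: FakeStorage, limit: int = 10000) -> Iterator[str]:
    """Yield every origin, requesting at most `limit` origins per page."""
    page_token: Optional[str] = None
    while True:
        page = storage.origin_list(page_token=page_token, limit=limit)
        yield from page.results
        page_token = page.next_page_token
        if page_token is None:
            break


storage = FakeStorage([f"https://example.org/repo{i}" for i in range(5)])
origins = list(iter_origins_sketch(storage, limit=2))
```

A smaller limit means more round trips to the storage but less data held in memory per page; the caller sees a single flat iterator either way.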

swh.storage.algos.origin.origin_get_latest_visit_status(storage: StorageInterface, origin_url: str, type: str | None = None, allowed_statuses: List[str] | None = None, require_snapshot: bool = False) → OriginVisitStatus | None

Get the latest visit status of an origin. Optionally, the search can be narrowed by a combination of criteria: visit type, allowed statuses, or whether the visit has a snapshot.

If no visit status matches the criteria, returns None. Otherwise, returns the matching origin visit status.

Parameters:
  • storage – A storage backend

  • origin_url – origin URL

  • type – Optional visit type to filter on (e.g. git, tar, dsc, svn, hg, npm, pypi, …)

  • allowed_statuses – list of visit statuses considered to find the latest visit. For instance, allowed_statuses=['full'] will only consider visits that have successfully run to completion.

  • require_snapshot – If True, only a visit with a snapshot will be returned.

Returns:

The latest origin visit status model object matching the search criteria if one exists, None otherwise.
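How the three optional criteria combine can be sketched on in-memory data. This is an illustration of the documented filtering semantics, not the library's implementation: `VisitStatus` and `latest_matching_status` are hypothetical names, the integer `date` is a simplified timestamp, and "latest" is assumed to mean the greatest date.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VisitStatus:
    """Hypothetical, simplified visit status record."""

    visit: int
    date: int  # simplified timestamp; larger means more recent
    type: str
    status: str
    snapshot: Optional[str]


def latest_matching_status(
    statuses: List[VisitStatus],
    type: Optional[str] = None,
    allowed_statuses: Optional[List[str]] = None,
    require_snapshot: bool = False,
) -> Optional[VisitStatus]:
    """Return the most recent status satisfying every given criterion."""
    candidates = [
        s
        for s in statuses
        if (type is None or s.type == type)
        and (allowed_statuses is None or s.status in allowed_statuses)
        and (not require_snapshot or s.snapshot is not None)
    ]
    # default=None covers the "nothing matches" case.
    return max(candidates, key=lambda s: s.date, default=None)


history = [
    VisitStatus(visit=1, date=10, type="git", status="full", snapshot="snp-a"),
    VisitStatus(visit=2, date=20, type="git", status="partial", snapshot="snp-b"),
    VisitStatus(visit=3, date=30, type="git", status="failed", snapshot=None),
]

latest_any = latest_matching_status(history)
latest_full = latest_matching_status(history, allowed_statuses=["full"])
latest_with_snapshot = latest_matching_status(history, require_snapshot=True)
latest_hg = latest_matching_status(history, type="hg")
```

With no criteria the failed visit 3 wins because it is the most recent; restricting to `['full']` falls back to visit 1, and requiring a snapshot selects visit 2.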

swh.storage.algos.origin.iter_origin_visits(storage: StorageInterface, origin: str, order: ListOrder = ListOrder.ASC) → Iterator[OriginVisit]

Iterate over the visits of an origin, in the given order.

swh.storage.algos.origin.iter_origin_visit_statuses(storage: StorageInterface, origin: str, visit: int, order: ListOrder = ListOrder.ASC) → Iterator[OriginVisitStatus]

Iterate over the visit statuses of an origin visit, in the given order.
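The two iterators are naturally nested: one walks the visits of an origin, the other walks the statuses of each visit. The sketch below mimics that shape on a plain dictionary. `Order`, `VISITS`, `iter_visits`, `iter_statuses` and `walk` are hypothetical stand-ins, and the assumption that visits are ordered by visit id and statuses chronologically is made for illustration only.

```python
from enum import Enum
from typing import Dict, Iterator, List, Tuple


class Order(Enum):
    """Hypothetical stand-in for an ascending/descending order flag."""

    ASC = "asc"
    DESC = "desc"


# Visit id -> statuses recorded for that visit, in chronological order.
VISITS: Dict[int, List[str]] = {
    1: ["created", "full"],
    2: ["created", "partial"],
}


def iter_visits(visits: Dict[int, List[str]], order: Order = Order.ASC) -> Iterator[int]:
    """Yield visit ids in the requested order."""
    yield from sorted(visits, reverse=(order is Order.DESC))


def iter_statuses(
    visits: Dict[int, List[str]], visit: int, order: Order = Order.ASC
) -> Iterator[str]:
    """Yield the statuses of one visit in the requested order."""
    statuses = visits[visit]
    yield from (reversed(statuses) if order is Order.DESC else statuses)


def walk(visits: Dict[int, List[str]], order: Order = Order.ASC) -> List[Tuple[int, str]]:
    """Flatten the nested iteration into (visit id, status) pairs."""
    return [
        (visit, status)
        for visit in iter_visits(visits, order)
        for status in iter_statuses(visits, visit, order)
    ]


oldest_first = walk(VISITS)
newest_first = walk(VISITS, Order.DESC)
```

Passing the descending order to both levels yields the most recent status of the most recent visit first, which is convenient when only the newest entries matter.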