swh.scrubber.origin_locator module

Lists corrupt objects in the scrubber database and, for each of them, the candidate origins it might be recovered from.

swh.scrubber.origin_locator.get_origins(graph: RemoteGraphClient, storage: StorageInterface, swhid: CoreSWHID) → Iterable[str]

class swh.scrubber.origin_locator.OriginLocator(db: ScrubberDb, graph: RemoteGraphClient, storage: StorageInterface, start_object: CoreSWHID, end_object: CoreSWHID)

Bases: object

Reads a chunk of corrupt objects in the swh-scrubber database, then writes to the same database a list of origins they might be recovered from.
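The read-then-write flow described above can be sketched with an in-memory SQLite database standing in for the scrubber database. This is a minimal illustration, not the module's implementation: the table names `corrupt_object` and `candidate_origin`, the function `locate_chunk`, and the `find_origins` callback are all hypothetical, and the real class works through a `ScrubberDb` and a graph client instead.

```python
import sqlite3
from typing import Callable, Iterable


def locate_chunk(
    conn: sqlite3.Connection,
    start: str,
    end: str,
    find_origins: Callable[[str], Iterable[str]],
) -> None:
    """Read corrupt SWHIDs in [start, end] and record candidate origins.

    Hypothetical stand-in for the flow described above; the schema is
    invented for this sketch.
    """
    cur = conn.cursor()
    # Read one chunk of corrupt objects, bounded on both sides.
    cur.execute(
        "SELECT swhid FROM corrupt_object "
        "WHERE swhid BETWEEN ? AND ? ORDER BY swhid",
        (start, end),
    )
    corrupt = [row[0] for row in cur.fetchall()]
    # Write the candidate origins back to the same database.
    for swhid in corrupt:
        for url in find_origins(swhid):
            cur.execute(
                "INSERT OR IGNORE INTO candidate_origin (swhid, origin_url) "
                "VALUES (?, ?)",
                (swhid, url),
            )
    conn.commit()


# Toy data: three corrupt objects, two of which fall inside the chunk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE corrupt_object (swhid TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE candidate_origin ("
    "swhid TEXT, origin_url TEXT, PRIMARY KEY (swhid, origin_url))"
)
cnt_a = "swh:1:cnt:" + "0a" * 20
dir_b = "swh:1:dir:" + "0b" * 20
rev_c = "swh:1:rev:" + "0c" * 20
conn.executemany(
    "INSERT INTO corrupt_object (swhid) VALUES (?)",
    [(cnt_a,), (dir_b,), (rev_c,)],
)

# Hypothetical lookup table playing the role of the graph traversal.
known_origins = {
    cnt_a: ["https://example.org/repo1"],
    dir_b: ["https://example.org/repo1", "https://example.org/repo2"],
}

locate_chunk(
    conn,
    start=cnt_a,
    end=dir_b,
    find_origins=lambda swhid: known_origins.get(swhid, []),
)
rows = conn.execute(
    "SELECT swhid, origin_url FROM candidate_origin ORDER BY swhid, origin_url"
).fetchall()
```

Passing the origin lookup in as a callback keeps the database logic separate from the traversal, which makes the chunk handling easy to exercise without a graph service.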

db: ScrubberDb

Database to read from and write to.

graph: RemoteGraphClient
storage: StorageInterface

Used to resolve origin SHA1s to URLs.
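As a rough illustration of what resolving origin SHA1s to URLs involves, the sketch below assumes that an origin's SHA1 is the SHA1 digest of its URL and uses a toy in-memory index in place of a real storage backend. `ToyStorage`, `url_for_sha1`, and `resolve_urls` are hypothetical names introduced for this example; they are not part of `StorageInterface`.

```python
import hashlib
from typing import Dict, Iterable, Iterator, Optional


def origin_sha1(url: str) -> bytes:
    """SHA1 digest of an origin URL (assumed identifier scheme)."""
    return hashlib.sha1(url.encode("utf-8")).digest()


class ToyStorage:
    """Hypothetical in-memory stand-in for a storage backend."""

    def __init__(self, urls: Iterable[str]) -> None:
        self._by_sha1: Dict[bytes, str] = {origin_sha1(u): u for u in urls}

    def url_for_sha1(self, sha1: bytes) -> Optional[str]:
        # Returns None when the origin is unknown to this storage.
        return self._by_sha1.get(sha1)


def resolve_urls(storage: ToyStorage, sha1s: Iterable[bytes]) -> Iterator[str]:
    """Yield the URL of every SHA1 the storage knows, skipping the rest."""
    for sha1 in sha1s:
        url = storage.url_for_sha1(sha1)
        if url is not None:
            yield url


toy_storage = ToyStorage(["https://example.org/repo1", "https://example.org/repo2"])
wanted = [
    origin_sha1("https://example.org/repo2"),
    origin_sha1("https://example.org/unknown"),
]
resolved = list(resolve_urls(toy_storage, wanted))
```

The hash is one-way, so a URL can only be recovered by looking the SHA1 up in an index of known origins, which is why a storage object is needed at all.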

start_object: CoreSWHID

Minimum SWHID to check (in alphabetical order).

end_object: CoreSWHID

Maximum SWHID to check (in alphabetical order).
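To make the alphabetical bounds concrete, the sketch below compares SWHIDs by their string form and treats both bounds as inclusive. Both points are assumptions made for illustration, and `in_range` is a hypothetical helper rather than a function of this module.

```python
def in_range(swhid: str, start: str, end: str) -> bool:
    """True if swhid lies in [start, end] under plain string ordering.

    Assumes inclusive bounds and comparison on the string form.
    """
    return start <= swhid <= end


# Lowest and highest possible content identifiers.
CNT_MIN = "swh:1:cnt:" + "0" * 40
CNT_MAX = "swh:1:cnt:" + "f" * 40

some_content = "swh:1:cnt:" + "8" * 40
some_directory = "swh:1:dir:" + "0" * 40

content_checked = in_range(some_content, CNT_MIN, CNT_MAX)
directory_checked = in_range(some_directory, CNT_MIN, CNT_MAX)
```

Under string ordering the object type is compared before the hash, so every `cnt` identifier sorts before every `dir` identifier, and a range bounded by two content SWHIDs covers contents only.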

handle_corrupt_object(corrupt_object: CorruptObject, cur: cursor) → None