swh.scanner.policy module
- swh.scanner.policy.source_size(source_tree: Directory)
Return the size of a source tree as the number of nodes it contains.
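As an illustration of what "number of nodes" means here, the following is a minimal sketch on a toy tree. `ToyContent`, `ToyDirectory`, and `toy_source_size` are hypothetical stand-ins, not the `swh.model` classes or the actual implementation, and the sketch assumes that the root directory itself counts as a node.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class ToyContent:
    """Hypothetical stand-in for a file node."""
    name: str


@dataclass
class ToyDirectory:
    """Hypothetical stand-in for a directory node with named children."""
    name: str
    children: Dict[str, Union["ToyDirectory", ToyContent]] = field(default_factory=dict)


def toy_source_size(node: Union[ToyDirectory, ToyContent]) -> int:
    """Count every node in the tree rooted at `node`, the root included (assumption)."""
    if isinstance(node, ToyContent):
        return 1
    return 1 + sum(toy_source_size(child) for child in node.children.values())


tree = ToyDirectory("root", {
    "README": ToyContent("README"),
    "src": ToyDirectory("src", {"main.py": ToyContent("main.py")}),
})
size = toy_source_size(tree)  # counts root, README, src, and main.py
```

Under these assumptions, both directories and file contents contribute one node each to the total.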
- class swh.scanner.policy.Policy(source_tree: Directory, data: MerkleNodeInfo)
Bases: object
- data: MerkleNodeInfo
Information about the contents and directories of the Merkle tree.
- class swh.scanner.policy.WebAPIConnection(contents: List[Content], skipped_contents: List[SkippedContent], directories: List[Directory], client: WebAPIClient)
Bases: ArchiveDiscoveryInterface
Use the web APIs to query the archive.
- content_missing(contents: List[bytes]) → List[bytes]
List the contents missing from the archive, by sha1.
- class swh.scanner.policy.RandomDirSamplingPriority(source_tree: Directory, data: MerkleNodeInfo)
Bases: Policy
Check the Merkle tree by querying random directories. If a directory is unknown to the archive, mark all of its ancestors as unknown; if it is known, mark all of its descendants as known. Finally, check all the remaining file contents.
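The sampling strategy described above can be sketched on a toy tree as follows. This is only an illustration under stated assumptions, not the class's actual implementation: `TREE`, `KNOWN`, `random_dir_sampling`, and the helper functions are hypothetical, nodes are identified by path strings rather than Merkle hashes, and the archive is simulated by a plain set instead of web API calls.

```python
import random
from typing import Dict, List, Set

# Hypothetical toy tree: directory -> children. Keys are directories, other names are files.
TREE: Dict[str, List[str]] = {
    "root/": ["root/a/", "root/b/", "root/f.txt"],
    "root/a/": ["root/a/x.py"],
    "root/b/": ["root/b/y.py"],
}
# Hypothetical archive state: the nodes the (simulated) archive already knows.
KNOWN: Set[str] = {"root/a/", "root/a/x.py", "root/f.txt"}


def descendants(tree: Dict[str, List[str]], node: str) -> List[str]:
    """Return every node below `node`, at any depth."""
    out: List[str] = []
    for child in tree.get(node, []):
        out.append(child)
        out.extend(descendants(tree, child))
    return out


def ancestors(parents: Dict[str, str], node: str) -> List[str]:
    """Return every node above `node`, up to the root."""
    out: List[str] = []
    while node in parents:
        node = parents[node]
        out.append(node)
    return out


def random_dir_sampling(tree, known: Set[str], rng: random.Random) -> Dict[str, bool]:
    """Map each node to True (known) or False (unknown)."""
    parents = {c: p for p, children in tree.items() for c in children}
    status: Dict[str, bool] = {}
    pending = list(tree)  # directories not yet classified
    while pending:
        directory = rng.choice(pending)
        if directory in known:
            # A known directory implies that everything below it is known.
            status[directory] = True
            for node in descendants(tree, directory):
                status[node] = True
        else:
            # An unknown directory implies that everything above it is unknown.
            status[directory] = False
            for node in ancestors(parents, directory):
                status[node] = False
        pending = [d for d in pending if d not in status]
    # Finally, check the file contents that no directory lookup resolved.
    for children in tree.values():
        for node in children:
            if node not in status:
                status[node] = node in known
    return status


result = random_dir_sampling(TREE, KNOWN, random.Random(0))
```

Because of the Merkle property, a single directory lookup can settle many nodes at once, so the number of simulated archive queries is usually smaller than the number of nodes. The final classification does not depend on the random order when the archive state is consistent; only the number of lookups does.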