swh.scanner.data module

class swh.scanner.data.MerkleNodeInfo

Bases: dict

Store additional information about Merkle DAG nodes, using SWHIDs as keys
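As a rough illustration of the shape this implies, the sketch below uses a hypothetical stand-in class (`NodeInfoSketch`) instead of importing swh.scanner, and a made-up SWHID string as the key. The layout of the per-node dictionary is an assumption based on the description of `init_merkle_node_info`.

```python
class NodeInfoSketch(dict):
    """Hypothetical stand-in for MerkleNodeInfo: a plain dict subclass."""


# Illustrative SWHID string; the real class may use SWHID objects as keys.
swhid = "swh:1:cnt:" + "0" * 40

info = NodeInfoSketch()
info[swhid] = {"known": True}

# Being a dict subclass, all the usual dict operations are available.
known_nodes = [key for key, value in info.items() if value["known"]]
```

Because the class derives from `dict`, code that consumes it can rely on ordinary mapping operations such as iteration, membership tests and `get`.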

swh.scanner.data.init_merkle_node_info(source_tree: Directory, data: MerkleNodeInfo, provenance: bool) → None

Populate the MerkleNodeInfo with the SWHIDs of the given source tree

The dictionary values are pre-filled with dictionaries holding the information about the nodes.

The “known” key is always stored as it is always fetched. The “provenance” key is stored if the provenance parameter is True.
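The pre-filling described above could look roughly like the following sketch. It does not use swh.scanner: `init_node_info_sketch` is a hypothetical helper, the SWHIDs are made-up strings, and using `None` as the placeholder value is an assumption.

```python
from typing import Dict, Iterable, Optional


def init_node_info_sketch(
    swhids: Iterable[str], data: Dict[str, dict], provenance: bool
) -> None:
    """Pre-fill ``data`` with one fresh dictionary per SWHID (hypothetical helper)."""
    # "known" is always present; None as the initial value is an assumption.
    template: Dict[str, Optional[object]] = {"known": None}
    if provenance:
        template["provenance"] = None
    for swhid in swhids:
        # Copy the template so that nodes do not share one mutable dictionary.
        data[swhid] = dict(template)


swhids = ["swh:1:dir:" + "a" * 40, "swh:1:cnt:" + "b" * 40]

without_provenance: Dict[str, dict] = {}
init_node_info_sketch(swhids, without_provenance, provenance=False)

with_provenance: Dict[str, dict] = {}
init_node_info_sketch(swhids, with_provenance, provenance=True)
```

Copying the template for each node matters: later updates to one node's entry should not leak into the others.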

exception swh.scanner.data.NoProvenanceAPIAccess

Bases: RuntimeError

Raised when the user does not have access to the Provenance API.
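Since the exception derives from `RuntimeError`, a handler for the base class also catches it. The sketch below shows this with a local stand-in class and a hypothetical `fetch_provenance_sketch` function; neither is part of swh.scanner.

```python
class NoProvenanceAPIAccessSketch(RuntimeError):
    """Local stand-in mirroring the documented base class."""


def fetch_provenance_sketch(has_access: bool) -> str:
    # Hypothetical function; the real access check is done by swh.scanner.
    if not has_access:
        raise NoProvenanceAPIAccessSketch("no access to the Provenance API")
    return "provenance data"


try:
    result = fetch_provenance_sketch(has_access=False)
except RuntimeError as exc:
    # Catching the base class also catches the more specific exception.
    result = f"skipped: {exc}"
```

Callers that want to treat missing access differently from other runtime failures would catch the specific exception first.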

swh.scanner.data.add_provenance(source_tree: Directory, data: MerkleNodeInfo, client: WebAPIClient, update_progress: Callable[[int, int], None] | None = <function _no_update_progress>)

Store provenance information about software artifacts retrieved from the Software Heritage graph service.

swh.scanner.data.has_dirs(node: Directory) → bool

Check if the given directory has other directories inside.
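A minimal sketch of such a check, assuming a directory node behaves like a mapping from entry names to child nodes. `DirectorySketch` and `ContentSketch` are hypothetical stand-ins, not the swh.model classes.

```python
class ContentSketch:
    """Hypothetical stand-in for a file node."""


class DirectorySketch(dict):
    """Hypothetical stand-in for a directory node: maps names to children."""


def has_dirs_sketch(node: DirectorySketch) -> bool:
    # Only direct children are inspected; there is no recursion.
    return any(isinstance(child, DirectorySketch) for child in node.values())


flat = DirectorySketch({b"README": ContentSketch()})
nested = DirectorySketch({b"src": DirectorySketch(), b"README": ContentSketch()})

flat_has_dirs = has_dirs_sketch(flat)
nested_has_dirs = has_dirs_sketch(nested)
```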

swh.scanner.data.get_content_from(node_path: bytes, source_tree: Directory, nodes_data: MerkleNodeInfo) → Dict[bytes, dict]

Get content information from the given directory node.

swh.scanner.data.get_git_ignore_patterns(cwd: Path | None)
swh.scanner.data.get_hg_ignore_patterns(cwd: Path | None)
swh.scanner.data.get_svn_ignore_patterns(cwd: Path | None)
swh.scanner.data.vcs_detected(folder_path: Path) → bool
swh.scanner.data.get_vcs_ignore_patterns(cwd: Path | None = None) → List[bytes]

Return a list of all patterns to ignore according to the VCS used for the project being scanned, if any.
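One way to picture the "if any" behaviour is a dispatch on VCS marker directories, falling back to an empty list. The sketch below is an assumption about the general idea, not the module's actual logic: the marker names, the `COLLECTORS` table and the fixed patterns are all hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical per-VCS collectors returning fixed patterns for illustration.
COLLECTORS: Dict[str, Callable[[Path], List[bytes]]] = {
    ".git": lambda cwd: [b"*.pyc"],
    ".hg": lambda cwd: [b"*.orig"],
    ".svn": lambda cwd: [b"*.o"],
}


def vcs_ignore_patterns_sketch(cwd: Path) -> List[bytes]:
    """Return patterns for the first VCS marker found, or an empty list."""
    for marker, collect in COLLECTORS.items():
        if (cwd / marker).is_dir():
            return collect(cwd)
    return []


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    before = vcs_ignore_patterns_sketch(root)  # no VCS marker yet
    (root / ".hg").mkdir()
    after = vcs_ignore_patterns_sketch(root)  # Mercurial marker present
```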

swh.scanner.data.get_ignore_patterns_templates() → Dict[str, Path]

Return a dict whose keys are ignore template names and whose values are paths to the corresponding ignore definition files.
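A name-to-path mapping of this kind could be built as in the sketch below. The `*.gitignore` file naming, the use of the file stem as the template name, and the `ignore_templates_sketch` helper are assumptions for illustration only.

```python
import tempfile
from pathlib import Path
from typing import Dict


def ignore_templates_sketch(templates_dir: Path) -> Dict[str, Path]:
    """Map each template name (file stem) to its definition file."""
    return {
        path.stem: path for path in sorted(templates_dir.glob("*.gitignore"))
    }


with tempfile.TemporaryDirectory() as tmp:
    templates_dir = Path(tmp)
    # Create two sample template files for the demonstration.
    for name in ("Python", "Rust"):
        (templates_dir / f"{name}.gitignore").write_text("# sample\n")
    templates = ignore_templates_sketch(templates_dir)
    template_names = sorted(templates)
```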

swh.scanner.data.parse_ignore_patterns_template(source: Path) → List[bytes]

Given the path to a gitignore template file, return its list of ignore patterns.
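A simplified sketch of such parsing is shown below, assuming that blank lines and `#` comment lines are skipped and every remaining line is kept as a bytes pattern. `parse_template_sketch` is a hypothetical helper; the real function may treat whitespace and escapes differently.

```python
import tempfile
from pathlib import Path
from typing import List


def parse_template_sketch(source: Path) -> List[bytes]:
    """Return the non-empty, non-comment lines of a gitignore-style file."""
    patterns: List[bytes] = []
    for raw_line in source.read_bytes().splitlines():
        # Stripping both ends is a simplification of the gitignore rules.
        line = raw_line.strip()
        if not line or line.startswith(b"#"):
            continue
        patterns.append(line)
    return patterns


with tempfile.TemporaryDirectory() as tmp:
    template = Path(tmp) / "Python.gitignore"
    template.write_text("# Byte-compiled files\n\n*.pyc\n__pycache__/\n")
    patterns = parse_template_sketch(template)
```

Reading the file as bytes keeps the result in the same `List[bytes]` form as the documented return type, without assuming a text encoding.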