swh.scanner.model module

class swh.scanner.model.Color(value)[source]

Bases: enum.Enum

An enumeration.

blue = '\x1b[94m'
green = '\x1b[92m'
red = '\x1b[91m'
end = '\x1b[0m'
swh.scanner.model.colorize(text: str, color: swh.scanner.model.Color)[source]
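
The enumeration values above are ANSI escape sequences for terminal colors, with end being the reset code. This page does not document the body or return type of colorize, so the following is only a minimal sketch, assuming it wraps the text in the chosen color code and appends the reset code. The Color class is re-declared locally so the example runs without swh.scanner installed.

```python
from enum import Enum


class Color(Enum):
    """Local copy of the documented enumeration values."""

    blue = "\x1b[94m"
    green = "\x1b[92m"
    red = "\x1b[91m"
    end = "\x1b[0m"


def colorize(text: str, color: Color) -> str:
    # Assumption: prefix the text with the color code, then reset with Color.end.
    return color.value + text + Color.end.value


if __name__ == "__main__":
    # On an ANSI-capable terminal this prints the word in green.
    print(colorize("known", Color.green))
```
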
class swh.scanner.model.Tree(path: pathlib.PosixPath, father: Optional[swh.scanner.model.Tree] = None)[source]

Bases: object

Representation of a file system structure

addNode(path: pathlib.PosixPath, swhid: str, known: bool) → None[source]

Recursively add a new path.
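
One way to picture the recursive insertion is sketched below. MiniTree, its children dictionary and the add_node name are hypothetical stand-ins, not the real Tree internals; the sketch assumes each call descends one path component, creating intermediate nodes as needed, and records the identifier only on the final node. PurePosixPath is used instead of PosixPath so the example runs on any platform.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional


class MiniTree:
    """Hypothetical, simplified stand-in for Tree (not the real class)."""

    def __init__(self, path: PurePosixPath, father: Optional["MiniTree"] = None):
        self.path = path
        self.father = father
        self.swhid: Optional[str] = None
        self.known = False
        self.children: Dict[PurePosixPath, "MiniTree"] = {}

    def add_node(self, path: PurePosixPath, swhid: str, known: bool) -> None:
        relative = path.relative_to(self.path)
        if not relative.parts:
            # The path is this node itself: record its identifier and status.
            self.swhid = swhid
            self.known = known
            return
        # Otherwise descend one level, creating the child node if needed.
        child_path = self.path / relative.parts[0]
        if child_path not in self.children:
            self.children[child_path] = MiniTree(child_path, self)
        self.children[child_path].add_node(path, swhid, known)


root = MiniTree(PurePosixPath("/tmp/root"))
# "<swhid>" is a placeholder, not a real identifier.
root.add_node(PurePosixPath("/tmp/root/subdir/file.txt"), "<swhid>", True)
```
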

show(format) → None[source]

Show tree in different formats

printChildren(isatty: bool, inc: int = 1) → None[source]
printNode(node: Any, isatty: bool, inc: int) → None[source]
property attributes

Get the attributes of the current node grouped by the relative path.


Returns: a dictionary containing a path as key and its known/unknown status and the Software Heritage persistent identifier as values.

toDict(dict_nodes={}) → Dict[str, Dict[str, Dict]][source]

Recursively groups the current child nodes inside a dictionary.

For example, if you have the following structure:

    root {
        subdir: {
            file.txt
        }
    }

The generated dictionary will be:

    "root": {
        "swhid": "…", "known": True/False
    }
    "root/subdir": {
        "swhid": "…", "known": True/False
    }
    "root/subdir/file.txt": {
        "swhid": "…", "known": True/False
    }
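
The flattening described above can be sketched as follows. Node and to_dict are hypothetical names and the swhid strings are placeholders; the sketch assumes every node is stored under its own path. It uses a None default for the accumulator rather than the documented dict_nodes={}, because a mutable default is shared between calls in Python.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical minimal node; the real Tree stores more state."""

    path: str
    swhid: str
    known: bool
    children: List["Node"] = field(default_factory=list)


def to_dict(node: Node, dict_nodes: Optional[Dict[str, Dict]] = None) -> Dict[str, Dict]:
    # A None default avoids sharing one dictionary between unrelated calls.
    if dict_nodes is None:
        dict_nodes = {}
    dict_nodes[node.path] = {"swhid": node.swhid, "known": node.known}
    for child in node.children:
        # Children write into the same accumulator, giving one flat mapping.
        to_dict(child, dict_nodes)
    return dict_nodes


file_node = Node("root/subdir/file.txt", "<swhid-file>", True)
subdir = Node("root/subdir", "<swhid-subdir>", False, [file_node])
root = Node("root", "<swhid-root>", False, [subdir])
flat = to_dict(root)
```
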



iterate() → Iterable[Dict[str, Dict]][source]

Recursively iterate through the children of the current node


Yields: a dictionary containing a path with its known/unknown status and the Software Heritage persistent identifier
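
A recursive generator is a natural fit for this kind of traversal. The sketch below is illustrative only: Node is a hypothetical stand-in for Tree, the swhid strings are placeholders, and the depth-first, parent-before-children order is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Node:
    """Hypothetical minimal node; the real Tree stores more state."""

    path: str
    swhid: str
    known: bool
    children: List["Node"] = field(default_factory=list)


def iterate(node: Node) -> Iterator[Dict[str, Dict]]:
    for child in node.children:
        # One single-entry dictionary per child, keyed by its path.
        yield {child.path: {"swhid": child.swhid, "known": child.known}}
        # Recurse so that grandchildren follow right after their parent.
        yield from iterate(child)


file_node = Node("root/subdir/file.txt", "<swhid-file>", True)
subdir = Node("root/subdir", "<swhid-subdir>", False, [file_node])
root = Node("root", "<swhid-root>", False, [subdir])
items = list(iterate(root))
```
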

getDirectoriesInfo(root: pathlib.PosixPath) → Dict[pathlib.PosixPath, Tuple[int, int]][source]

Get information about all directories under the given root.


Returns: a dictionary with a directory path as key and the relative contents information (the result of count_contents) as values.

count_contents() → Tuple[int, int][source]
Count how many contents are present inside a directory.

If the directory has a persistent identifier (pid), it is reported as if all of its contents were known.

Returns: a tuple with the total number of contents and the number of known contents (the ones that have a persistent identifier).
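
The counting rule can be illustrated with a small sketch. Node, its is_dir flag and the free function count_contents are hypothetical; the sketch assumes that only the files directly inside the directory are counted and that a known directory marks all of them as known.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical minimal node; the real Tree stores more state."""

    name: str
    known: bool
    is_dir: bool = False
    children: List["Node"] = field(default_factory=list)


def count_contents(directory: Node) -> Tuple[int, int]:
    # Assumption: only files directly inside the directory are counted.
    files = [child for child in directory.children if not child.is_dir]
    total = len(files)
    if directory.known:
        # A known directory is treated as if every content were known.
        return (total, total)
    return (total, sum(1 for f in files if f.known))


mixed = Node("d", False, True, [Node("a", True), Node("b", False), Node("sub", False, True)])
known_dir = Node("k", True, True, [Node("a", False), Node("b", False)])
```

Under these assumptions, mixed reports two contents with one known, while known_dir reports both of its contents as known even though neither file is individually marked.
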

has_dirs() → bool[source]

Checks if node has directories