swh.model.merkle module#
Merkle tree data structure
- class swh.model.merkle.MerkleNode(data=None)[source]#
Bases:
dict
Representation of a node in a Merkle Tree.
A (generalized) Merkle Tree is a tree in which every node is labeled with a hash of its own data and the hash of its children.
In pseudocode:
node.hash = hash(node.data + sum(child.hash for child in node.children))
This class efficiently implements the Merkle Tree data structure on top of a Python
dict
, minimizing hash computations and new data collections when updating nodes.Node data is stored in the
data
attribute, while (named) children are stored as items of the underlying dictionary.Addition, update and removal of objects are instrumented to automatically invalidate the hashes of the current node as well as its registered parents; It also resets the collection status of the objects so the updated objects can be collected.
The collection of updated data from the tree is implemented through the
collect()
function and associated helpers.- update_hash(*, force=False) Any [source]#
Recursively compute the hash of the current node.
- Parameters:
force (bool) – invalidate the cache and force the computation for this node and all children.
- property hash: Any#
The hash of the current node, as calculated by
compute_hash()
.
- abstract compute_hash() Any [source]#
Compute the hash of the current node.
The hash should depend on the data of the node, as well as on hashes of the children nodes.
- get_data(**kwargs)[source]#
Retrieve and format the collected data for the current node, for use by
collect()
.Can be overridden, for instance when you want the collected data to contain information about the child nodes.
- collect_node() Set[MerkleNode] [source]#
Collect the current node if it has not been yet, for use by
collect()
.
- collect() Set[MerkleNode] [source]#
Collect the added and modified nodes in the subtree rooted at self since the last collect operation.
- Returns:
A
set
of collected nodes
- reset_collect()[source]#
Recursively unmark collected nodes in the subtree rooted at self.
This lets the caller use
collect()
again.
- iter_tree(dedup=True) Iterator[MerkleNode] [source]#
Yields all children nodes, recursively. Common nodes are deduplicated by default (deduplication can be turned off setting the given argument ‘dedup’ to False).
- class swh.model.merkle.MerkleLeaf(data=None)[source]#
Bases:
MerkleNode
A leaf to a Merkle tree.
A Merkle leaf is simply a Merkle node with children disabled.