swh.storage.algos.dir_iterators module
- class swh.storage.algos.dir_iterators.DirectoryIterator(storage: StorageInterface, dir_id: bytes | None, base_path: bytes = b'')
Bases: object
Helper class used to iterate over a directory tree in depth-first order, with some additional features:
  - sibling nodes are iterated in lexicographic order by name
  - the visit of sub-directory nodes can be skipped for efficiency when comparing two trees (there is no need to go deeper if two directories have the same hash)
- Parameters:
storage (swh.storage.interface.StorageInterface) – instance of swh storage (either local or remote)
dir_id (bytes) – identifier of a root directory
base_path (bytes) – optional base path used when traversing a sub-directory
- current() → Dict[str, Any] | None
- Returns:
The currently visited directory entry, i.e. the top element of the top frame
- Return type:
Dict[str, Any] | None
- current_perms() → int
- Returns:
The permissions value of the currently visited directory entry
- current_path() → bytes | None
- Returns:
The absolute path from the root directory of the currently visited directory entry
- next() → Dict[str, Any] | None
Advance the tree iteration by dropping the currently visited directory entry from the top frame. If the top frame ends up empty, the operation is applied recursively to remove all empty frames while climbing the tree back towards its root.
- Returns:
The description of the newly visited directory entry
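The frame mechanics described above can be sketched as follows. This is a minimal illustration, not the swh implementation: the in-memory `TREE` mapping stands in for a storage backend, and `FrameStackIterator`, `advance`, `walk` and the entry keys (`name`, `type`, `target`, `perms`) are assumptions made for the example. Dropping the current entry without descending corresponds to the behaviour documented for `next()`.

```python
from typing import Any, Dict, List, Optional

Entry = Dict[str, Any]

# Hypothetical in-memory stand-in for a storage backend:
# directory identifier -> list of directory entries.
TREE: Dict[bytes, List[Entry]] = {
    b"root": [
        {"name": b"src", "type": "dir", "target": b"d1", "perms": 0o040000},
        {"name": b"README", "type": "file", "target": b"f1", "perms": 0o100644},
    ],
    b"d1": [
        {"name": b"main.py", "type": "file", "target": b"f2", "perms": 0o100644},
    ],
}


class FrameStackIterator:
    """Illustrative frame-based tree iterator (not the swh implementation)."""

    def __init__(self, tree: Dict[bytes, List[Entry]], dir_id: bytes) -> None:
        self.tree = tree
        self.frames: List[List[Entry]] = []  # one frame per directory being visited
        self._push(dir_id, b"")

    def _push(self, dir_id: bytes, prefix: bytes) -> None:
        # Entries are kept in reverse name order so that the last element
        # of a frame (its "top") is the lexicographically smallest name.
        entries = []
        for entry in self.tree.get(dir_id, []):
            path = prefix + b"/" + entry["name"] if prefix else entry["name"]
            entries.append({**entry, "path": path})
        entries.sort(key=lambda e: e["name"], reverse=True)
        if entries:
            self.frames.append(entries)

    def current(self) -> Optional[Entry]:
        # Top element of the top frame, or None when iteration is over.
        return self.frames[-1][-1] if self.frames else None

    def _drop(self) -> None:
        # Drop the current entry; if its frame becomes empty, remove the
        # frame and drop the parent directory entry as well (recursively).
        if not self.frames:
            return
        self.frames[-1].pop()
        if not self.frames[-1]:
            self.frames.pop()
            self._drop()

    def advance(self, descend: bool = True) -> Optional[Entry]:
        cur = self.current()
        if cur is None:
            return None
        if descend and cur["type"] == "dir" and self.tree.get(cur["target"]):
            self._push(cur["target"], cur["path"])  # enter the sub-directory
        else:
            self._drop()  # skip the entry (and its subtree, if any)
        return self.current()


def walk(tree: Dict[bytes, List[Entry]], dir_id: bytes, descend: bool = True) -> List[bytes]:
    """Collect the paths visited, optionally skipping sub-directory contents."""
    it = FrameStackIterator(tree, dir_id)
    paths = []
    entry = it.current()
    while entry is not None:
        paths.append(entry["path"])
        entry = it.advance(descend)
    return paths
```

Calling `walk(TREE, b"root", descend=False)` visits only the top-level entries, which is the kind of shortcut a tree comparison can take when two directories have the same hash.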
- swh.storage.algos.dir_iterators.dir_iterator(storage: StorageInterface, dir_id: bytes) → DirectoryIterator
Return an iterator for recursively visiting a directory and its sub-directories. The associated paths are visited in lexicographic depth-first search order.
- Parameters:
storage – an instance of a swh storage
dir_id – a directory identifier
- Returns:
An iterator returning, at each iteration step, a dict describing a directory entry. A path field is added to that dict to store the absolute path of the entry.
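To make the shape of the yielded values concrete, here is a small recursive generator that produces entries in lexicographic depth-first order and adds a `path` field to each one. It is only a sketch over a hypothetical in-memory tree: `iter_entries`, `SAMPLE_TREE` and the entry keys other than `path` are assumptions for illustration, and the function does not call swh storage.

```python
from typing import Any, Dict, Iterator, List

Entry = Dict[str, Any]

# Hypothetical stand-in data: directory identifier -> directory entries.
SAMPLE_TREE: Dict[bytes, List[Entry]] = {
    b"root": [
        {"name": b"b.txt", "type": "file", "target": b"f3", "perms": 0o100644},
        {"name": b"a", "type": "dir", "target": b"da", "perms": 0o040000},
    ],
    b"da": [
        {"name": b"y.txt", "type": "file", "target": b"f2", "perms": 0o100644},
        {"name": b"x.txt", "type": "file", "target": b"f1", "perms": 0o100644},
    ],
}


def iter_entries(
    tree: Dict[bytes, List[Entry]], dir_id: bytes, base_path: bytes = b""
) -> Iterator[Entry]:
    """Yield entries depth-first, siblings sorted by name, with a 'path' field."""
    for entry in sorted(tree.get(dir_id, []), key=lambda e: e["name"]):
        path = base_path + b"/" + entry["name"] if base_path else entry["name"]
        # Copy the entry so the stand-in data is not modified.
        yield {**entry, "path": path}
        if entry["type"] == "dir":
            yield from iter_entries(tree, entry["target"], path)


visited_paths = [entry["path"] for entry in iter_entries(SAMPLE_TREE, b"root")]
```

Each directory is yielded before its contents, so a consumer sees `a` before `a/x.txt`.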
- class swh.storage.algos.dir_iterators.Remaining(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases: Enum
Enum representing the current state when iterating over both directory trees at the same time.
- NoMoreFiles = 0
- OnlyToFilesRemain = 1
- OnlyFromFilesRemain = 2
- BothHaveFiles = 3
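One way to read these four states: each describes which of the two iterators still has a current entry. The sketch below mirrors the documented members in a local enum and derives the state from two optional entries. The helper name `remaining_state` and its arguments are hypothetical, not part of the module.

```python
from enum import Enum
from typing import Any, Dict, Optional


class Remaining(Enum):
    """Local mirror of the documented members, for illustration only."""

    NoMoreFiles = 0
    OnlyToFilesRemain = 1
    OnlyFromFilesRemain = 2
    BothHaveFiles = 3


def remaining_state(
    from_current: Optional[Dict[str, Any]], to_current: Optional[Dict[str, Any]]
) -> Remaining:
    """Map the current entries of the 'from' and 'to' iterators to a state."""
    if from_current is None and to_current is None:
        return Remaining.NoMoreFiles
    if from_current is None:
        return Remaining.OnlyToFilesRemain
    if to_current is None:
        return Remaining.OnlyFromFilesRemain
    return Remaining.BothHaveFiles
```

A comparison loop would typically keep going until the state is `NoMoreFiles`.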
- class swh.storage.algos.dir_iterators.DoubleDirectoryIterator(storage: StorageInterface, dir_from: bytes | None, dir_to: bytes)
Bases: object
Helper class to traverse two directory trees at the same time and compare their contents to detect changes between them.
- Parameters:
storage – instance of swh storage
dir_from – hash identifier of the from directory
dir_to – hash identifier of the to directory
- compare() → Dict[str, Any]
Compare the currently iterated directory entries in both iterators and return the comparison status.
- Returns:
The status of the comparison with the following bool values:
  - same_hash: indicates if the two entries have the same hash
  - same_perms: indicates if the two entries have the same permissions
  - both_are_dirs: indicates if the two entries are directories
  - both_are_files: indicates if the two entries are regular files
  - file_and_dir: indicates if one of the entries is a directory and the other a regular file
  - from_is_empty_dir: indicates if the from entry is the empty directory
  - to_is_empty_dir: indicates if the to entry is the empty directory
- Return type:
Dict[str, Any]
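The status fields listed above could be computed from two entry dicts roughly as follows. This is a sketch under stated assumptions, not the module's code: `compare_entries` is a hypothetical helper, entries are assumed to carry `type`, `target` and `perms` keys, and the empty directory is assumed to be identified by the git empty-tree SHA-1.

```python
from typing import Any, Dict

# Assumption: the empty directory has the same identifier as git's empty tree.
EMPTY_DIR_ID = bytes.fromhex("4b825dc642cb6eb9a060e54bf8d69288fbee4904")


def compare_entries(from_entry: Dict[str, Any], to_entry: Dict[str, Any]) -> Dict[str, bool]:
    """Build a comparison status dict with the documented field names."""
    from_is_dir = from_entry["type"] == "dir"
    to_is_dir = to_entry["type"] == "dir"
    from_is_file = from_entry["type"] == "file"
    to_is_file = to_entry["type"] == "file"
    return {
        "same_hash": from_entry["target"] == to_entry["target"],
        "same_perms": from_entry["perms"] == to_entry["perms"],
        "both_are_dirs": from_is_dir and to_is_dir,
        "both_are_files": from_is_file and to_is_file,
        "file_and_dir": (from_is_dir and to_is_file) or (from_is_file and to_is_dir),
        "from_is_empty_dir": from_is_dir and from_entry["target"] == EMPTY_DIR_ID,
        "to_is_empty_dir": to_is_dir and to_entry["target"] == EMPTY_DIR_ID,
    }


# Example: an empty directory replaced by a regular file of the same name.
status = compare_entries(
    {"name": b"a", "type": "dir", "target": EMPTY_DIR_ID, "perms": 0o040000},
    {"name": b"a", "type": "file", "target": b"\x01" * 20, "perms": 0o100644},
)
```

When `both_are_dirs` and `same_hash` are both true, the two sub-trees are identical and a comparison can skip them without descending.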