swh.vault.to_disk module
- swh.vault.to_disk.get_filtered_files_content(storage: StorageInterface, files_data: List[Dict]) → Iterator[Dict[str, Any]]
Retrieve the files specified by files_data and apply filters for skipped and missing contents.
- Parameters:
storage – the storage from which to retrieve the objects
files_data – list of file entries as returned by directory_ls()
- Yields:
The entries given in files_data with a new ‘content’ key that points to the file content in bytes.
A content may be replaced by a specific message indicating that it could not be retrieved (either due to a privacy policy or because it was too big to be archived).
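The shape of the yielded entries can be illustrated with a minimal standalone sketch. It does not call swh.vault or a real storage: `with_content_sketch`, the `lookup` dictionary, the `PLACEHOLDER` message, and the use of a `sha1` key to identify each file are all assumptions made for illustration, and the actual filtering rules and message text may differ.

```python
from typing import Any, Dict, Iterator, List

# Hypothetical placeholder; the real replacement message is not given here.
PLACEHOLDER = b"<content could not be retrieved>"


def with_content_sketch(
    files_data: List[Dict[str, Any]],
    lookup: Dict[bytes, bytes],
) -> Iterator[Dict[str, Any]]:
    """Yield a copy of each entry with an added 'content' key.

    `lookup` stands in for the storage: it maps an assumed 'sha1'
    identifier to the file's bytes. Entries whose content is absent
    from `lookup` receive the placeholder message instead.
    """
    for entry in files_data:
        enriched = dict(entry)  # leave the caller's entry untouched
        enriched["content"] = lookup.get(entry["sha1"], PLACEHOLDER)
        yield enriched


# Two entries: one retrievable, one missing from the stand-in storage.
entries = [
    {"name": b"README", "sha1": b"\x01"},
    {"name": b"huge.bin", "sha1": b"\x02"},
]
fake_storage = {b"\x01": b"hello\n"}

enriched_entries = list(with_content_sketch(entries, fake_storage))
```

Copying each entry before adding the key keeps the input list unchanged; whether the real function mutates or copies its input is not stated here.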
- swh.vault.to_disk.apply_chunked(func, input_list, chunk_size)
Apply func to input_list, divided into chunks of size chunk_size.
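The chunking behaviour described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the library's code: `apply_chunked_sketch` and `double_all` are hypothetical names, and the sketch assumes that func returns an iterable for each chunk and that the per-chunk results are flattened into a single stream.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def apply_chunked_sketch(
    func: Callable[[List[T]], Iterable[R]],
    input_list: List[T],
    chunk_size: int,
) -> Iterator[R]:
    """Call func on consecutive slices of input_list, each of at most
    chunk_size items, and yield the results one by one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(input_list), chunk_size):
        # The last slice may be shorter than chunk_size.
        yield from func(input_list[start:start + chunk_size])


# Record the chunks that func receives to make the splitting visible.
seen_chunks: List[List[int]] = []


def double_all(chunk: List[int]) -> List[int]:
    seen_chunks.append(list(chunk))
    return [2 * x for x in chunk]


result = list(apply_chunked_sketch(double_all, [1, 2, 3, 4, 5], 2))
# seen_chunks is now [[1, 2], [3, 4], [5]]
# result is [2, 4, 6, 8, 10]
```

Chunking of this kind is typically used to bound the size of each request when func wraps a batched call, such as a storage lookup over many identifiers.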
- class swh.vault.to_disk.DirectoryBuilder(storage: StorageInterface, root: bytes, dir_id: bytes)
Bases: object
Reconstructs the on-disk representation of a directory in the storage.
Initialize the directory builder.
- Parameters:
storage – the storage object
root – the path where the directory should be reconstructed
dir_id – the identifier of the directory in the storage