swh.dataset.utils module
- class swh.dataset.utils.ZSTFile(path: str, mode: str = 'r')
Bases: object
Object-like wrapper around a ZST file. Uses a subprocess of the "zstd" command to compress and decompress the objects.
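The subprocess approach described above can be sketched as follows. This is a minimal illustration, not the ZSTFile implementation: the helper names `write_zst` and `read_zst` are hypothetical, and the sketch assumes the `zstd` command-line tool is installed and on `PATH`.

```python
import subprocess


def write_zst(path: str, data: bytes) -> None:
    """Compress `data` into `path` by piping it to a `zstd` subprocess.

    Hypothetical helper for illustration only.
    """
    # -q: quiet, -f: overwrite an existing output file, -o: output path.
    # With no input file given, zstd reads the data from stdin.
    subprocess.run(["zstd", "-q", "-f", "-o", path], input=data, check=True)


def read_zst(path: str) -> bytes:
    """Decompress `path` with a `zstd` subprocess and return the raw bytes.

    Hypothetical helper for illustration only.
    """
    # -d: decompress, -c: write the result to stdout instead of a file.
    result = subprocess.run(
        ["zstd", "-d", "-q", "-c", path],
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

Delegating to the external command avoids a dependency on a Python zstd binding, at the cost of requiring the binary at runtime and spawning one process per file.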
- class swh.dataset.utils.SQLiteSet(db_path)
Bases: object
On-disk Set object for hashes using SQLite as an indexer backend. Used to deduplicate objects when processing large queues with duplicates.
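The deduplication idea can be illustrated with a small sketch built on the standard `sqlite3` module. The class `OnDiskHashSet`, its `add` and `close` methods, the table name, and `dedup_demo` are all hypothetical names chosen for this example; they are not taken from SQLiteSet, whose methods are not documented here.

```python
import os
import sqlite3
import tempfile


class OnDiskHashSet:
    """Hypothetical on-disk set of byte hashes backed by a SQLite table."""

    def __init__(self, db_path):
        self.db = sqlite3.connect(db_path)
        # The primary key gives SQLite a unique index to check membership.
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS seen (val BLOB NOT NULL PRIMARY KEY)"
        )

    def add(self, value: bytes) -> bool:
        """Insert `value`; return True if it was new, False if already seen."""
        cur = self.db.execute(
            "INSERT OR IGNORE INTO seen (val) VALUES (?)", (value,)
        )
        # rowcount is 1 when a row was inserted, 0 when the insert was ignored.
        return cur.rowcount == 1

    def close(self) -> None:
        self.db.commit()
        self.db.close()


def dedup_demo():
    """Filter duplicates out of a small queue, preserving first-seen order."""
    with tempfile.TemporaryDirectory() as tmp:
        seen = OnDiskHashSet(os.path.join(tmp, "seen.sqlite"))
        queue = [b"\x01", b"\x02", b"\x01", b"\x03", b"\x02"]
        unique = [h for h in queue if seen.add(h)]
        seen.close()
        return unique
```

Because membership lives in a database file rather than in a Python `set`, memory use stays bounded even when the queue holds far more hashes than fit in RAM.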