swh.objstorage.backends.pathslicing module
- swh.objstorage.backends.pathslicing.is_valid_filename(filename: str, algo: Literal['sha1', 'sha256'] = 'sha1')
Checks that the filename is a valid hexdigest for the given algo.
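As an illustration only, a check of this kind could be sketched as follows. The helper name `looks_like_hexdigest` is hypothetical and this is not the module's actual implementation; it assumes that a valid filename is exactly the lowercase hexadecimal digest for the chosen algorithm.

```python
import hashlib
import string


def looks_like_hexdigest(filename: str, algo: str = "sha1") -> bool:
    """Hypothetical helper: True if `filename` has the shape of a hex digest for `algo`."""
    # Two hexadecimal characters encode one digest byte.
    expected_length = hashlib.new(algo).digest_size * 2
    # Assumption: only lowercase hexadecimal characters are accepted.
    hex_chars = set(string.hexdigits.lower())
    return len(filename) == expected_length and all(c in hex_chars for c in filename)
```

With this sketch, a 40-character SHA1 digest passes for `algo="sha1"` but fails for `algo="sha256"`, which expects 64 characters.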
- class swh.objstorage.backends.pathslicing.PathSlicer(root: str, slicing: str)
Bases: object
Helper class to compute a path based on a hash.
Used to compute a directory path based on the object hash according to a given slicing. Each slice corresponds to one directory level, named after the matching substring of the object's hash.
For instance, a file with SHA1 34973274ccef6ab4dfaaf86599792fa9c3fe4689 will have the following computed path, depending on the slicing:
0:2/2:4/4:6 : 34/97/32/34973274ccef6ab4dfaaf86599792fa9c3fe4689
0:1/0:5/ : 3/34973/34973274ccef6ab4dfaaf86599792fa9c3fe4689
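The mapping from a slicing string to a path can be sketched in a few lines. This is a minimal illustration, not the class's actual code: the function name `sliced_path` is hypothetical, and it assumes each `start:stop` item is applied as a Python slice to the hex digest, with a trailing `/` ignored.

```python
import os


def sliced_path(root: str, slicing: str, hex_obj_id: str) -> str:
    """Hypothetical helper: build the on-disk path of an object from a slicing string."""
    bounds = []
    for part in slicing.split("/"):
        if not part:  # tolerate a trailing "/" as in "0:1/0:5/"
            continue
        start, stop = part.split(":")
        # An empty bound means "open-ended", as in a Python slice.
        bounds.append(slice(int(start) if start else None, int(stop) if stop else None))
    # One directory level per slice, then the full digest as the file name.
    directories = [hex_obj_id[bound] for bound in bounds]
    return os.path.join(root, *directories, hex_obj_id)
```

Called with the slicing `0:2/2:4/4:6` and the digest above, the sketch yields the directory levels `34`, `97` and `32` under the root, followed by the full digest as the file name.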
- Args:
root (str): path to the root directory of the storage on the disk.
slicing (str): the slicing configuration.
- check_config()
Check the slicing configuration is valid.
- Raises:
ValueError – if the slicing configuration is invalid.
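The text above does not spell out what makes a slicing invalid. As a hedged sketch, the hypothetical function `validate_slicing` below assumes two conditions: every item must have the form `start:stop` with integer bounds, and no bound may exceed the length of the hex digest (40 characters for SHA1). The real method may check something different.

```python
def validate_slicing(slicing: str, hex_length: int = 40) -> None:
    """Hypothetical check: raise ValueError if `slicing` cannot apply to a digest of `hex_length` characters."""
    for part in filter(None, slicing.split("/")):
        start, sep, stop = part.partition(":")
        if not sep:
            raise ValueError(f"slice {part!r} is not of the form 'start:stop'")
        try:
            # Empty bounds are allowed and treated as open-ended.
            bounds = [int(value) for value in (start, stop) if value]
        except ValueError:
            raise ValueError(f"slice {part!r} has a non-integer bound") from None
        # Assumption: a bound past the end of the digest makes the slicing invalid.
        if any(bound > hex_length for bound in bounds):
            raise ValueError(f"slice {part!r} reaches past the {hex_length}-character digest")
```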
- get_directory(hex_obj_id: str) → str
Compute the storage directory of an object.
See also: PathSlicer::get_path
- Parameters:
hex_obj_id – object id as hexlified string.
- Returns:
Absolute path (including root) to the directory that contains the given object id.
- class swh.objstorage.backends.pathslicing.PathSlicingObjStorage(*, root: str = '', compression: Literal['bz2', 'lzma', 'gzip', 'zlib', 'none'] = 'gzip', slicing: str = '', **kwargs)
Bases: ObjStorage
Implementation of the ObjStorage API based on the hash of the content.
On disk, an object storage is a directory tree containing files named after their object IDs. An object ID is a checksum of its content, depending on the value of the ID_HASH_ALGO constant (see swh.model.hashutil for its meaning).
To avoid directories that contain too many files, the object storage has a given slicing. Each slice corresponds to one directory level, named after the matching substring of the object's hash.
For instance, a file with SHA1 34973274ccef6ab4dfaaf86599792fa9c3fe4689 will be stored at the following paths, depending on the slicing of the object storage:
0:2/2:4/4:6 : 34/97/32/34973274ccef6ab4dfaaf86599792fa9c3fe4689
0:1/0:5/ : 3/34973/34973274ccef6ab4dfaaf86599792fa9c3fe4689
By default, the files in the storage are gzip-compressed; the compression argument of the constructor selects another format ('bz2', 'lzma', 'zlib') or 'none'.
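To make the on-disk layout concrete, here is a minimal standalone sketch of a sliced, gzip-compressed store. It does not use the PathSlicingObjStorage class; the helpers `write_object` and `read_object` are hypothetical, and the `0:2/2:4/4:6` slicing and SHA1 object IDs are assumptions chosen to match the example above.

```python
import gzip
import hashlib
import os
import tempfile


def write_object(root: str, content: bytes) -> str:
    """Hypothetical helper: store `content` gzip-compressed under a 0:2/2:4/4:6 layout."""
    hex_obj_id = hashlib.sha1(content).hexdigest()
    directory = os.path.join(root, hex_obj_id[0:2], hex_obj_id[2:4], hex_obj_id[4:6])
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, hex_obj_id)  # the file is named after its object ID
    with open(path, "wb") as f:
        f.write(gzip.compress(content))
    return path


def read_object(path: str) -> bytes:
    """Hypothetical helper: read back and decompress a stored object."""
    with open(path, "rb") as f:
        return gzip.decompress(f.read())


# Round trip in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    stored_path = write_object(root, b"hello world")
    restored = read_object(stored_path)
    file_name = os.path.basename(stored_path)
```

After the round trip, `restored` holds the original bytes and `file_name` is the SHA1 hex digest of the content.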
- Parameters:
- name: str = 'pathslicing'
Default objstorage name; can be overridden at instantiation time by passing a ‘name’ argument to the constructor.
- get(obj_id: bytes | CompositeObjId) → bytes
- delete(obj_id: bytes | CompositeObjId)
- list_content(last_obj_id: bytes | CompositeObjId | None = None, limit: int | None = 10000) → Iterator[CompositeObjId]